US20230232052A1 - Machine learning techniques for detecting surges in content consumption - Google Patents
- Publication number
- US20230232052A1 (application Ser. No. 18/168,440)
- Authority
- United States (US)
- Prior art keywords
- ccm
- inobs
- inob
- content
- user
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/231—Content storage operation, e.g. caching movies for short term storage, replicating data over plural servers, prioritizing data for deletion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0242—Determining effectiveness of advertisements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/104—Peer-to-peer [P2P] networks
- H04L67/1044—Group management mechanisms
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/14—Session management
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/25—Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
- H04N21/251—Learning process for intelligent management, e.g. learning user preferences for recommending movies
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/535—Tracking the activity of the user
Abstract
The present disclosure describes a content consumption monitor (CCM) that determines surges in content consumption based on changes in content consumption scores. The CCM determines the content consumption scores for domains and/or organizations (orgs) based on session events generated by different devices/users from the org and/or domain, a number of events generated by the org/domain, content and/or user interactions with the content indicated by the events, relevancy scores of the content to one or more topics, and/or other criteria. The CCM detects surges in consumption of, or interest in, a topic for the domain/org when the consumption score reaches a threshold and/or does so within a period of time. The CCM may adjust the consumption score based on changes in the relevancy, the number of events, and/or the number of users over different time periods. Other embodiments may be described and/or claimed.
Description
- The present application is a continuation of U.S. application Ser. No. 17/189,073 filed on Mar. 1, 2022, which is a continuation-in-part (CIP) of U.S. application Ser. No. 14/981,529 filed on Dec. 28, 2015, which is a CIP of U.S. application Ser. No. 14/498,056 filed Sep. 26, 2014 now issued as U.S. Pat. No. 9,940,634, the contents of each of which are hereby incorporated by reference in their entireties and for all purposes.
- Embodiments described herein generally relate to machine learning (ML) and artificial intelligence (AI), and in particular, ML/AI techniques for associating network addresses with locations from which content and/or information objects is/are accessed.
- Users receive a wide variety of information from many different businesses. For example, users may constantly receive promotional announcements, advertisements, information notices, event notifications, and/or the like. Users request some of this information. For example, a user may register on a company website to receive sales or information announcements. However, much of the information is of little or no interest to the user. For example, the user may receive emails announcing every upcoming seminar, regardless of the subject matter.
- The user may also receive unsolicited information. For example, a user may register on a website to download a white paper on a particular subject. A lead service then may sell the email address to companies that send the user unsolicited advertisements. Users end up ignoring most or all of these emails since most of the information has no relevance or interest. Alternatively, the user directs all of these emails into a junk email folder.
- FIG. 1 depicts an example content consumption monitor (CCM).
- FIG. 2 depicts an example of the CCM in more detail.
- FIG. 3 depicts an example operation of a CCM tag.
- FIG. 4 depicts example events processed by the CCM.
- FIG. 5 depicts an example user intent vector.
- FIG. 6 depicts an example process for segmenting users.
- FIG. 7 depicts an example process for generating organization (org) intent vectors.
- FIG. 8 depicts an example consumption score generator.
- FIG. 9 depicts the example consumption score generator in more detail.
- FIG. 10 depicts an example process for identifying a surge in consumption scores.
- FIG. 11 depicts an example process for calculating initial consumption scores.
- FIG. 12 depicts an example process for adjusting the initial consumption scores based on historic baseline events.
- FIG. 13 depicts an example process for mapping surge topics with contacts.
- FIG. 14 depicts an example content consumption monitor calculating content intent.
- FIG. 15 depicts an example process for adjusting a consumption score based on content intent.
- FIG. 16 depicts an example network address classification system (NACS).
- FIG. 17 depicts an example of the NACS of FIG. 16 in more detail.
- FIG. 18 depicts an example of how the content consumption monitor uses a network address entity map to generate consumption data.
- FIG. 19 depicts an example feature dataset generated by the NACS.
- FIG. 20 depicts an example process for generating the feature dataset in FIG. 19.
- FIG. 21 depicts example organization features/characteristics (FORG) generated by the NACS.
- FIG. 22 depicts an example site classifier.
- FIG. 23 depicts an example process for resource classification.
- FIG. 24 depicts an example CCM that uses a site classifier.
- FIG. 25 depicts an example of how an event processor translates raw events into hostname events.
- FIG. 26 depicts an example of how the event processor generates web resource interest features from the hostname events.
- FIGS. 27, 28, and 29 show example processes for determining various web resource interest features.
- FIG. 30 depicts an example of how an event processor identifies surges in the website feature ratios.
- FIG. 31A depicts an example of how the event processor calculates a resource cluster interest score.
- FIG. 31B depicts an example of how the event processor calculates a topic cluster interest score.
- FIG. 32 depicts an example of how the event processor calculates a weighted intent score.
- FIG. 33 depicts an example of how the event processor identifies a surge in the weighted intent score.
- FIG. 34 depicts an example structural semantic network graph for resources or information objects.
- FIG. 35 depicts example features generated for the information objects of FIG. 34.
- FIG. 36 depicts example vector embeddings generated for the features of FIG. 35.
- FIG. 37 depicts an example machine learning (ML) model trained using the vector embeddings of FIG. 36.
- FIG. 38 depicts an example ML model configured to classify resources based on associated vector embeddings.
- FIG. 39 depicts an example computing system suitable for practicing various aspects of the various embodiments discussed herein.
- FIG. 40 depicts an example neural network (NN).
- Companies may research topics on the Internet as a prelude to purchasing items or services related to the topics. In embodiments, a content consumption monitor (CCM) generates consumption scores identifying the level of company interest in different topics. The CCM may go beyond identifying companies interested in specific topics and also identify surge data indicating when the companies are most receptive to direct contacts regarding different topics. Service providers and/or publishers may use the surge data to increase interest in published information. In one example, the service providers and/or publishers may include advertisers who use the surge data to increase advertising conversion rates.
- The present disclosure describes web resource interest detection services. In particular, these services detect or otherwise determine web resource interest at the organization level. Here, a machine learning (ML) classification system uses various ML techniques to determine interest in a particular web resource (e.g., websites, webpages, web apps, and/or the like) based on actions taken by users from or otherwise associated with different organizations (orgs). In embodiments, an entity predictor predicts entities associated with different network addresses (e.g., IP addresses) indicated by a set of obtained network events. A hostname extractor extracts a hostname of accessed information objects from the set of obtained network events. A feature generator generates different features based on the extracted hostnames, predicted entities, and/or other information included in the obtained network events. These features are then used to predict an interest level in the accessed information objects and/or the hostname org by the predicted entity. Other embodiments may be described and/or claimed.
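The entity-predictor/hostname-extractor/feature-generator pipeline described above can be sketched in a few lines of Python. Everything here is an assumption for illustration: the event field names, the example addresses, and the static IP-to-organization map standing in for the ML entity predictor are invented, not taken from the patent.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical network events; field names are assumptions, not from the patent.
events = [
    {"ip": "203.0.113.7", "url": "https://docs.example-vendor.com/guide"},
    {"ip": "203.0.113.7", "url": "https://docs.example-vendor.com/api"},
    {"ip": "198.51.100.2", "url": "https://blog.other-site.org/post"},
]

# Stand-in for the entity predictor: maps network addresses to organizations.
IP_TO_ORG = {"203.0.113.7": "AcmeCo", "198.51.100.2": "BetaCorp"}

def extract_hostname(event):
    """Hostname extractor: pull the hostname of the accessed resource."""
    return urlparse(event["url"]).hostname

def generate_features(events):
    """Feature generator: count how often each org accessed each hostname."""
    pairs = Counter()
    for ev in events:
        org = IP_TO_ORG.get(ev["ip"], "unknown")
        pairs[(org, extract_hostname(ev))] += 1
    return pairs

features = generate_features(events)
# AcmeCo accessed docs.example-vendor.com twice -> a stronger interest signal.
print(features[("AcmeCo", "docs.example-vendor.com")])  # 2
```

In a real system the (org, hostname) counts would be one of many features fed to the downstream interest-level model, alongside relevancy scores and user-interaction signals.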
- The CCM may use these classifications and/or predictions to generate consumption scores and/or surge scores/signals. The embodiments discussed herein allow the CCM to generate more accurate intent data than existing/conventional solutions by better predicting intent and/or interest levels for specific orgs. The CCM uses processing resources more efficiently by generating more accurate consumption scores and/or surge scores/signals. The CCM may also provide more secure network analytics by generating consumption scores and/or surge scores/signals for orgs without using personally identifiable information (PII), sensitive data, and/or confidential data, thereby improving information security for end-users.
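The surge logic sketched throughout this disclosure (a consumption score crossing a threshold relative to a historic baseline) can be illustrated as follows. The weekly counts, the baseline-averaging scheme, and the threshold value are all invented for this sketch and are not the patent's actual formulas.

```python
# Hypothetical weekly event counts for one (org, topic) pair; numbers invented.
weekly_counts = [12, 15, 11, 14, 13, 41]

def consumption_score(count, baseline):
    """Score the latest activity as a ratio to the historic baseline."""
    return count / baseline if baseline else 0.0

def detect_surge(counts, threshold=2.0):
    """Flag a surge when the most recent score crosses the threshold."""
    baseline = sum(counts[:-1]) / len(counts[:-1])  # historic average
    score = consumption_score(counts[-1], baseline)
    return score, score >= threshold

score, surged = detect_surge(weekly_counts)
print(surged)  # True: 41 events is well above the 13-event historic baseline
```

The same shape of computation applies whether the inputs are raw event counts or relevancy-weighted scores; only the feature being baselined changes.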
- The resource interest classifications and predictions and/or intent predictions can be used to more efficiently process events, more accurately calculate consumption scores, and more accurately detect associated surges such as organization (org) surges (also referred to as “company surges” or the like). The more accurate intent data and consumption scores allow third-party service providers to conserve computational and network resources by better targeting users, so that unwanted and seemingly random content is not distributed to users who do not want it. This is a technological improvement in that it conserves network and computational resources of organizations (orgs) that distribute this content by reducing the amount of content generated and sent to end-user devices. Network resources may be conserved at end-user devices by reducing or eliminating the need to receive unwanted content, and computational resources may be conserved at end-user devices by reducing or eliminating the need to implement spam filters and/or by reducing the amount of data to be processed when analyzing and/or deleting such content. This amounts to an improvement in the technological fields of machine learning and web tracking technologies, and also amounts to an improvement in the functioning of computing systems and computing networks themselves.
- Furthermore, since the classifications and predictions identify specific orgs associated with particular network addresses and the information objects of interest to those orgs, the embodiments discussed herein can be used for other use cases such as, for example, network troubleshooting, anti-spam and anti-phishing technologies (e.g., for email systems and the like), cybersecurity threat detection and tracking, system/network monitoring and logging, network resource allocation and/or network appliance topology optimization, and/or the like.
- Machine learning (ML) involves programming computing systems to optimize a performance criterion using example (training) data and/or past experience. ML involves using algorithms to perform specific task(s) without using explicit instructions to perform the specific task(s), but instead relying on learnt patterns and/or inferences. ML uses statistics to build mathematical model(s) (also referred to as “ML models” or simply “models”) in order to make predictions or decisions based on sample data (e.g., training data). The model is defined to have a set of parameters, and learning is the execution of a computer program to optimize the parameters of the model using the training data or past experience. The trained model may be a predictive model that makes predictions based on an input dataset, a descriptive model that gains knowledge from an input dataset, or both predictive and descriptive. Once the model is learned (trained), it can be used to make inferences (e.g., predictions).
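The idea above, that learning is the execution of a program that optimizes a model's parameters against training data, can be illustrated with a one-parameter model fit by gradient descent. The toy data, the model form y = w·x, and the learning rate are all invented for this sketch.

```python
# Minimal illustration of "learning as parameter optimization": fit a
# one-parameter model y = w * x to toy data by gradient descent.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # (x, y) pairs, roughly y = 2x

w = 0.0    # model parameter to be learned
lr = 0.02  # learning rate (step size)
for _ in range(500):
    # Gradient of mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad

print(round(w, 1))  # converges near 2.0, the slope underlying the data
```

After training, the learned parameter `w` is the "model"; inference is simply evaluating `w * x` on new inputs.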
- ML algorithms perform a training process on a training dataset to estimate an underlying ML model. An ML algorithm is a computer program that learns from experience with respect to some task(s) and some performance measure(s)/metric(s), and an ML model is an object or data structure created after an ML algorithm is trained with training data. In other words, the term “ML model” or “model” may describe the output of an ML algorithm that is trained with training data. After training, an ML model may be used to make predictions on new datasets. Additionally, separately trained AI/ML models can be chained together in an AI/ML pipeline during inference or prediction generation. Although the term “ML algorithm” refers to different concepts than the term “ML model,” these terms may be used interchangeably for the purposes of the present disclosure.
- ML techniques generally fall into the following main types of learning problem categories: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning is an ML task that aims to learn a mapping function from the input to the output, given a labeled data set. Supervised learning algorithms build models from a set of data that contains both the inputs and the desired outputs. For example, supervised learning may involve learning a function (model) that maps an input to an output based on example input-output pairs or some other form of labeled training data including a set of training examples. Each input-output pair includes an input object (e.g., a vector) and a desired output object or value (referred to as a “supervisory signal”). Supervised learning can be grouped into classification algorithms, regression algorithms, and instance-based algorithms.
- Classification, in the context of ML, refers to an ML technique for determining the classes to which various data points belong. Here, the term “class” or “classes” may refer to categories, and are sometimes called “targets” or “labels.” Classification is used when the outputs are restricted to a limited set of quantifiable properties. Classification algorithms may describe an individual (data) instance whose category is to be predicted using a feature vector. As an example, when the instance includes a collection (corpus) of text, each feature in a feature vector may be the frequency with which specific words appear in the corpus of text. In ML classification, labels are assigned to instances, and models are trained to correctly predict the pre-assigned labels from the training examples. An ML algorithm for classification may be referred to as a “classifier.” Examples of classifiers include linear classifiers, k-nearest neighbor (kNN), decision trees, random forests, support vector machines (SVMs), Bayesian classifiers, convolutional neural networks (CNNs), among many others (note that some of these algorithms can be used for other ML tasks as well).
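- As an illustration of the classification setting above, a minimal k-nearest-neighbor classifier can label a new text by comparing its word-frequency feature vector against labeled training examples. This is only a sketch: the vocabulary, example texts, and labels below are hypothetical and are not taken from the disclosure.

```python
from collections import Counter
import math

def featurize(text, vocabulary):
    """Map a text to a feature vector of per-word frequencies."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

def knn_predict(query, examples, k=3):
    """Return the majority label among the k nearest labeled examples."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    nearest = sorted(examples, key=lambda ex: dist(ex[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical labeled training texts (the labels act as supervisory signals).
vocab = ["car", "battery", "charge", "stock", "market"]
train = [
    (featurize("electric car battery charge", vocab), "automotive"),
    (featurize("car charge station battery", vocab), "automotive"),
    (featurize("stock market rally", vocab), "finance"),
    (featurize("market stock prices", vocab), "finance"),
]
print(knn_predict(featurize("battery charge for the car", vocab), train))
# prints "automotive"
```

The query's word-frequency vector sits closest to the two "automotive" examples, so the majority vote among the three nearest neighbors assigns that label.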
- A regression algorithm and/or a regression analysis, in the context of ML, refers to a set of statistical processes for estimating the relationships between a dependent variable (often referred to as the “outcome variable”) and one or more independent variables (often referred to as “predictors”, “covariates”, or “features”). Examples of regression algorithms/models include logistic regression and linear regression, which are commonly fit using optimization methods such as gradient descent (GD), stochastic GD (SGD), and the like.
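- A minimal sketch of regression as described above: simple linear regression fit by gradient descent, where the two model parameters (slope and intercept) are adjusted to minimize mean squared error. The data, learning rate, and epoch count are illustrative assumptions.

```python
def fit_linear(xs, ys, lr=0.01, epochs=2000):
    """Learn w, b minimizing mean squared error on (xs, ys) pairs."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Noise-free data following y = 2x + 1: descent should recover w≈2, b≈1.
w, b = fit_linear([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
print(round(w, 2), round(b, 2))  # prints "2.0 1.0"
```

Here the independent variable x is the predictor and y is the outcome variable; the same loop generalizes to multiple predictors by keeping one weight per feature.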
- Instance-based learning (sometimes referred to as “memory-based learning”), in the context of ML, refers to a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have been stored in memory. Examples of instance-based algorithms include k-nearest neighbor and the like; decision tree algorithms (e.g., Classification And Regression Tree (CART), Iterative Dichotomiser 3 (ID3), C4.5, chi-square automatic interaction detection (CHAID), Fuzzy Decision Tree (FDT), and/or the like); Support Vector Machines (SVM); Bayesian algorithms (e.g., Bayesian network (BN), dynamic BN (DBN), Naive Bayes, and the like); and ensemble algorithms (e.g., Extreme Gradient Boosting, voting ensemble, bootstrap aggregating (“bagging”), Random Forest, and the like).
- In the context of ML, an “ML feature” (or simply “feature”) is an individual measurable property or characteristic of a phenomenon being observed. Features are usually represented using numbers/numerals (e.g., integers), strings, variables, ordinals, real-values, categories, and/or the like. Additionally or alternatively, ML features are individual variables, which may be independent variables, based on observable phenomena that can be quantified and recorded. ML models use one or more features to make predictions or inferences. In some implementations, new features can be derived from old features. A set of features may be referred to as a “feature vector.” A vector is a tuple of one or more values called scalars, and a feature vector may include a tuple of one or more features. The vector space associated with these vectors is often called a “feature space.” In order to reduce the dimensionality of the feature space, a number of dimensionality reduction techniques can be employed.
- Unsupervised learning is an ML task that aims to learn a function to describe a hidden structure from unlabeled data. Unsupervised learning algorithms build models from a set of data that contains only inputs and no desired output labels. Unsupervised learning algorithms are used to find structure in the data, such as grouping or clustering of data points. Some examples of unsupervised learning are K-means clustering, principal component analysis (PCA), and topic modeling, among many others. In particular, topic modeling is an unsupervised ML technique that scans a set of information objects (e.g., documents, webpages, and/or the like), detects word and phrase patterns within the information objects, and automatically clusters word groups and similar expressions that best characterize the set of information objects. By detecting patterns such as word frequency and distance between words, a topic model clusters similar feedback together with the words and expressions that appear most often, so that the topics of an individual set of texts can be quickly deduced. Relatedly, semi-supervised learning algorithms develop ML models from incomplete training data, where a portion of the sample input does not include labels.
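- A minimal sketch of the unsupervised setting above: short texts are turned into word-count vectors and grouped by K-means, with no labels involved. The texts, vocabulary, and starting centroids are invented for illustration.

```python
import math
from collections import Counter

def featurize(text, vocab):
    """Word-count feature vector for a short text."""
    counts = Counter(text.split())
    return [counts[w] for w in vocab]

def kmeans(points, centroids, iters=10):
    """Plain k-means: assign points to the nearest centroid, then
    recompute each centroid as the mean of its assigned points."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            i = min(range(len(centroids)),
                    key=lambda c: math.dist(p, centroids[c]))
            clusters[i].append(p)
        centroids = [tuple(sum(d) / len(cl) for d in zip(*cl)) if cl else centroids[i]
                     for i, cl in enumerate(clusters)]
    return clusters

vocab = ["car", "battery", "stock", "market"]
texts = ["car battery", "battery car car", "stock market", "market market stock"]
clusters = kmeans([featurize(t, vocab) for t in texts],
                  centroids=[(1, 1, 0, 0), (0, 0, 1, 1)])
print([len(c) for c in clusters])  # prints "[2, 2]"
```

The two automotive-flavored texts and the two finance-flavored texts end up in separate clusters purely from structure in the data, which is the essence of the clustering and topic-grouping behavior described above.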
- Reinforcement learning (RL) is goal-oriented learning based on interaction with an environment. In RL, an agent aims to optimize a long-term objective by interacting with the environment through a trial-and-error process. Examples of RL algorithms include Markov decision process, Markov chain, Q-learning, multi-armed bandit learning, and deep RL.
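- A minimal sketch of tabular Q-learning, one of the RL algorithms named above. The tiny corridor environment (move right from state 0 toward a rewarded goal at state 3) and all constants are hypothetical.

```python
import random

random.seed(0)
states, actions = range(4), [0, 1]      # actions: 0 = left, 1 = right
Q = {(s, a): 0.0 for s in states for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.2   # learning rate, discount, exploration

for _ in range(500):                    # episodes of trial and error
    s = 0
    while s != 3:                       # state 3 is the rewarded goal
        if random.random() < epsilon:
            a = random.choice(actions)  # explore
        else:
            a = max(actions, key=lambda act: Q[(s, act)])  # exploit
        s2 = min(s + 1, 3) if a == 1 else max(s - 1, 0)
        r = 1.0 if s2 == 3 else 0.0
        # Q-learning update: nudge Q(s,a) toward r + gamma * max_a' Q(s',a').
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions) - Q[(s, a)])
        s = s2

greedy = [max(actions, key=lambda act: Q[(s, act)]) for s in range(3)]
print(greedy)  # prints "[1, 1, 1]" -- "right" is learned in every state
```

The agent starts with no knowledge, reaches the goal only by random exploration at first, and the update rule propagates the discounted reward backward until the greedy policy points right everywhere.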
- An artificial neural network or neural network (NN) encompasses a variety of ML techniques in which a collection of connected artificial neurons or nodes (loosely) model neurons in a biological brain and can transmit signals to other artificial neurons or nodes, where connections (or edges) between the artificial neurons or nodes are (loosely) modeled on synapses of a biological brain. The artificial neurons and edges typically have a weight that adjusts as learning proceeds. The weight increases or decreases the strength of the signal at a connection. Neurons may have a threshold such that a signal is sent only if the aggregate signal crosses that threshold. The artificial neurons can be aggregated or grouped into one or more layers where different layers may perform different transformations on their inputs. Signals travel from the first layer (the input layer), to the last layer (the output layer), possibly after traversing the layers multiple times. NNs are usually used for supervised learning, but can be used for unsupervised learning as well. Examples of NNs include deep NN (DNN), feed forward NN (FFN), deep FFN (DFF), convolutional NN (CNN), deep CNN (DCN), deconvolutional NN (DNN), deep belief NN, perceptron NN, recurrent NN (RNN) (e.g., including Long Short Term Memory (LSTM) algorithm, gated recurrent unit (GRU), and/or the like), and deep stacking network (DSN).
-
FIG. 40 illustrates an example NN 4000, which may be suitable for use by one or more of the computing systems (or subsystems) of the various implementations discussed herein, implemented in part by a HW accelerator, and/or the like. The NN 4000 may be a deep neural network (DNN) used as an artificial brain of a compute node or network of compute nodes to handle very large and complicated observation spaces. Additionally or alternatively, the NN 4000 can be some other type of topology (or combination of topologies), such as a convolution NN (CNN), deep CNN (DCN), recurrent NN (RNN), Long Short Term Memory (LSTM) network, a Deconvolutional NN (DNN), gated recurrent unit (GRU), deep belief NN, a feed forward NN (FFN), a deep FFN (DFF), deep stacking network, Markov chain, perceptron NN, Bayesian Network (BN) or Bayesian NN (BNN), Dynamic BN (DBN), Linear Dynamical System (LDS), Switching LDS (SLDS), Optical NNs (ONNs), an NN for reinforcement learning (RL) and/or deep RL (DRL), and/or the like. NNs are usually used for supervised learning, but can be used for unsupervised learning and/or RL. - The
NN 4000 may encompass a variety of ML techniques where a collection of connected artificial neurons 4010 (loosely) model neurons in a biological brain that transmit signals to other neurons/nodes 4010. The neurons 4010 may also be referred to as nodes 4010, processing elements (PEs) 4010, or the like. The connections 4020 (or edges 4020) between the nodes 4010 are (loosely) modeled on synapses of a biological brain and convey the signals between nodes 4010. Note that not all neurons 4010 and edges 4020 are labeled in FIG. 40 for the sake of clarity. - Each
neuron 4010 has one or more inputs and produces an output, which can be sent to one or more other neurons 4010 (the inputs and outputs may be referred to as “signals”). Inputs to the neurons 4010 of the input layer Lx can be feature values of a sample of external data (e.g., input variables xi). The input variables xi can be set as a vector containing relevant data (e.g., observations, ML features, and the like). The inputs to hidden units 4010 of the hidden layers La, Lb, and Lc may be based on the outputs of other neurons 4010. The outputs of the final output neurons 4010 of the output layer Ly (e.g., output variables yj) include predictions and/or inferences that accomplish a desired/configured task. The output variables yj may be in the form of determinations, inferences, predictions, and/or assessments. Additionally or alternatively, the output variables yj can be set as a vector containing the relevant data (e.g., determinations, inferences, predictions, assessments, and/or the like).
-
Neurons 4010 may have a threshold such that a signal is sent only if the aggregate signal crosses that threshold. A node 4010 may include an activation function, which defines the output of that node 4010 given an input or set of inputs. Additionally or alternatively, a node 4010 may include a propagation function that computes the input to a neuron 4010 from the outputs of its predecessor neurons 4010 and their connections 4020 as a weighted sum. A bias term can also be added to the result of the propagation function. - The
NN 4000 also includes connections 4020, some of which provide the output of at least one neuron 4010 as an input to at least another neuron 4010. Each connection 4020 may be assigned a weight that represents its relative importance. The weights may also be adjusted as learning proceeds. The weight increases or decreases the strength of the signal at a connection 4020. - The
neurons 4010 can be aggregated or grouped into one or more layers L where different layers L may perform different transformations on their inputs. In FIG. 40, the NN 4000 comprises an input layer Lx, one or more hidden layers La, Lb, and Lc, and an output layer Ly (where a, b, c, x, and y may be numbers), where each layer L comprises one or more neurons 4010. Signals travel from the first layer (e.g., the input layer Lx), to the last layer (e.g., the output layer Ly), possibly after traversing the hidden layers La, Lb, and Lc multiple times. In FIG. 40, the input layer Lx receives data of input variables xi (where i=1, . . . , p, where p is a number). Hidden layers La, Lb, and Lc process the inputs xi, and eventually, output layer Ly provides output variables yj (where j=1, . . . , p′, where p′ is a number that is the same or different than p). In the example of FIG. 40, for simplicity of illustration, there are only three hidden layers La, Lb, and Lc in the NN 4000; however, the NN 4000 may include many more (or fewer) hidden layers La, Lb, and Lc than are shown. - The features used may be implementation specific, and may be based on, for example, the objects to be detected and the model(s) to be developed and/or used. The evaluation phase involves identifying or classifying objects by comparing obtained image data with existing object models created during the enrollment phase. During the evaluation phase, features extracted from the image data are compared to the object identification models using a suitable pattern recognition technique. The object models may be qualitative or functional descriptions, geometric surface information, and/or abstract feature vectors, and may be stored in a suitable database that is organized using some type of indexing scheme to facilitate elimination of unlikely object candidates from consideration.
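- The layer structure of NN 4000 described above can be sketched as a forward pass: each neuron applies an activation function to the weighted sum of its inputs plus a bias, and each layer's outputs feed the next layer. The weights, biases, and sigmoid activation below are illustrative choices, not parameters from the disclosure.

```python
import math

def sigmoid(z):
    """A common activation function mapping any input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """One layer: each weight row and bias pair drives one output neuron."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

x = [0.5, -1.0, 2.0]                                # input variables x_i
hidden = layer_forward(x, [[0.1, 0.4, -0.2], [0.3, -0.1, 0.2]], [0.0, 0.1])
y = layer_forward(hidden, [[0.7, -0.3]], [0.05])    # output variable y_j
print(len(hidden), len(y))  # prints "2 1"
```

Training would adjust the weight rows and biases (e.g., by backpropagation); the forward pass itself is just repeated weighted sums and activations, layer by layer, from Lx to Ly.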
- In one example, the
NN 4000 is used for motion detection based on sensor data obtained from the one or more sensors. In another example, the NN 4000 is used for object detection/classification. The object detection or recognition models may include an enrollment phase and an evaluation phase. During the enrollment phase, one or more features are extracted from the sensor data (e.g., image or video data). A feature is an individual measurable property or characteristic. For object detection, an object feature may include an object size, color, shape, relationship to other objects, and/or any region or portion of an image, such as edges, ridges, corners, blobs, and/or some defined regions of interest (ROI), and/or the like. - In another example, the
NN 4000 is used for object tracking. The object tracking and/or computer vision techniques may include, for example, edge detection, corner detection, blob detection, a Kalman filter, Gaussian Mixture Model, Particle filter, Mean-shift based kernel tracking, an ML object detection technique (e.g., Viola-Jones object detection framework, scale-invariant feature transform (SIFT), histogram of oriented gradients (HOG), and/or the like), a deep learning object detection technique (e.g., fully convolutional neural network (FCNN), region proposal convolution neural network (R-CNN), single shot multibox detector, ‘you only look once’ (YOLO) algorithm, and/or the like), and/or the like. - In another example, the
NN 4000 is used for character recognition, where the features may include histograms counting the number of black pixels along horizontal and vertical directions, number of internal holes, stroke detection, and many others. In another example, the NN 4000 is used for topic classification, where features may include individual topics, tokens, sequences of tokens, term frequency, document frequency and/or inverse document frequency, log-count ratios (e.g., when a Naive Bayes algorithm is used), and/or the like. - In another example, the
NN 4000 is a feature extraction model that learns to extract salient features from information objects represented using, for example, word embedding. Word embedding is a distributed representation of words where different words that have a similar meaning (e.g., based on their usage) also have a similar representation. Additionally or alternatively, a fully connected model may be used to interpret the extracted features in terms of predictive output. In this example, the NN 4000 may be a Convolution NN (CNN), which may be integrated into a larger network and trained to work in tandem with it in order to produce an end result. The CNN layer's responsibility is to extract meaningful sub-structures that are useful for the overall prediction task at hand. Other aspects of CNNs and the NN 4000 being used for topic classification and/or NLP are discussed in [Goldberg]. - ML may require, among other things, obtaining and cleaning a dataset, performing feature selection, selecting an ML algorithm, dividing the dataset into training data and testing data, training a model (e.g., using the selected ML algorithm), testing the model, optimizing or tuning the model, and determining metrics for the model. Some of these tasks may be optional or omitted depending on the use case and/or the implementation used. ML algorithms accept parameters and/or hyperparameters (collectively referred to herein as “training parameters,” “model parameters,” or simply “parameters”) that can be used to control certain properties of the training process and the resulting model.
- Parameters are characteristics or properties of the training process that are learnt during training. Model parameters may differ for individual experiments and may depend on the type of data and ML tasks being performed. Hyperparameters are characteristics, properties, or parameters for a training process that cannot be learnt during the training process and are set before training takes place. The particular values selected for the parameters and/or hyperparameters affect the training speed, training resource consumption, and the quality of the learning process. As examples, model parameters for topic classification/modeling, natural language processing (NLP), and/or natural language understanding (NLU) may include word frequency, sentence length, noun or verb distribution per sentence, the number of specific character n-grams per word, lexical diversity, constraints, weights, and the like. Examples of hyperparameters may include model size (e.g., in terms of memory space or bytes), whether (and how much) to shuffle the training data, the number of evaluation instances or epochs (e.g., a number of iterations or passes over the training data), learning rate (e.g., the speed at which the algorithm reaches (converges to) the optimal weights), learning rate decay (or weight decay), the number and size of the hidden layers, weight initialization scheme, dropout and gradient clipping thresholds, and the like. In embodiments, the parameters and/or hyperparameters may additionally or alternatively include vector size and/or word vector size.
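- The parameter/hyperparameter distinction above can be pictured with a toy training loop: the hyperparameters are fixed up front, while the model parameter is learned from the data. The specific names, values, and one-weight model below are illustrative assumptions only.

```python
# Hyperparameters: set before training and never learned during it.
hyperparams = {"learning_rate": 0.05, "epochs": 200, "weight_decay": 1e-4}

def train(data, hp):
    """Fit one model parameter w to minimize (w*x - y)^2 with weight decay."""
    w = 0.0                             # model parameter: learned, not set
    for _ in range(hp["epochs"]):
        for x, y in data:
            grad = 2 * (w * x - y) * x + 2 * hp["weight_decay"] * w
            w -= hp["learning_rate"] * grad
    return w

w = train([(1.0, 2.0), (2.0, 4.0)], hyperparams)
print(round(w, 2))  # prints "2.0" -- recovers y = 2x
```

Changing the hyperparameters (e.g., a larger learning rate or fewer epochs) alters training speed and quality without changing what is being learned, which is exactly why they are tuned separately from the model parameters.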
- Any of the AI/ML techniques discussed herein can be utilized, in whole or in part, and variants and/or combinations thereof, for any of the example embodiments discussed herein.
-
FIG. 1 depicts a content consumption monitor (CCM) 100. CCM 100 includes one or more physical and/or virtualized systems that communicate with a service provider 118 and monitor user accesses to information object(s) 112 (e.g., third party content and/or the like). The physical and/or virtualized systems include one or more logically or physically connected servers and/or data storage devices distributed locally or across one or more geographic locations. In some implementations, the CCM 100 may be provided by (or operated by) a cloud computing service and/or a cluster of machines in a datacenter. In some implementations, the CCM 100 may be a distributed application provided by (or operated by) various servers of a content delivery network (CDN) or edge computing network. Other implementations are possible in other embodiments. - Service provider 118 (also referred to as a “publisher,” “B2B publisher,” or the like) comprises one or more physical and/or virtualized computing systems owned and/or operated by a company, enterprise, and/or individual that wants to send information object(s) 114 to an interested group of users, which may include targeted content or the like. This group of users is alternatively referred to as “
contact segment 124.” The physical and/or virtualized systems include one or more logically or physically connected servers and/or data storage devices distributed locally or across one or more geographic locations. Generally, the service provider 118 uses IP/network resources to provide information objects such as electronic documents, webpages, forms, applications (e.g., web apps), data, services, web services, media, and/or content to different user/client devices. As examples, the service provider 118 may provide search engine services; social media/networking services; content (media) streaming services; e-commerce services; blockchain services; communication services; immersive gaming experiences; and/or other like services. The user/client devices that utilize services provided by service provider 118 may be referred to as “subscribers.” Although FIG. 1 shows only a single service provider 118, the service provider 118 may represent multiple service providers 118, each of which may have their own subscribing users. - In one example,
service provider 118 may be a company that sells electric cars. Service provider 118 may have a contact list 120 of email addresses for customers that have attended prior seminars or have registered on the service provider's 118 website. Contact list 120 may also be generated by CCM tags 110 that are described in more detail below. Service provider 118 may also generate contact list 120 from lead lists provided by third-party lead services, retail outlets, and/or other promotions or points of sale, or the like, or any combination thereof. Service provider 118 may want to send email announcements for an upcoming electric car seminar. Service provider 118 would like to increase the number of attendees at the seminar. In another example, service provider 118 may be a platform or service provider that offers a variety of user targeting services to their subscribers such as sales enablement, digital advertising, content/engagement marketing, and marketing automation, among others. - The information objects 112 comprise any data structure including or indicating information on any subject accessed by any user. The information objects 112 may include any type of information object (or collection of information objects). Information objects 112 may include electronic documents, database objects, electronic files, resources, and/or any data structure that includes one or more data elements, each of which may include one or more data values and/or content items.
- In some implementations, the information objects 112 may include webpages provided on (or served) by one or more web servers and/or application servers operated by different service providers, businesses, and/or individuals. For example, information objects 112 may come from different websites operated by online retailers and wholesalers, online newspapers, universities, blogs, municipalities, social media sites, or any other entity that supplies content. Additionally or alternatively, information objects 112 may also include information not accessed directly from websites. For example, users may access registration information at seminars, retail stores, and other events. Information objects 112 may also include content provided by
service provider 118. Additionally, information objects 112 may be associated with one or more topics 102. The topic 102 of an information object 112 may refer to the subject, meaning, and/or theme of that information object 112. - The
CCM 100 may identify or determine one or more topics 102 of an information object 112 using a topic analysis model/technique. Topic analysis (also referred to as “topic detection,” “topic modeling,” or “topic extraction”) refers to ML techniques that organize and understand large collections of text data by assigning tags or categories according to each individual information object's 112 topic or theme. A topic model is a type of statistical model used for discovering topics 102 that occur in a collection of information objects 112 or other collections of text. A topic model may be used to discover hidden semantic structures in the information objects 112 or other collections of text. In one example, a topic classification technique is used, where a topic classification model is trained on a set of training data (e.g., information objects 112 labeled with tags/topics 102) and then tested on a set of test data to determine how well the topic classification model classifies data into different topics 102. Once trained, the topic classification model is used to determine/predict topics 102 in various information objects 112. In another example, a topic modeling technique is used, where a topic modeling model automatically analyzes information objects 112 to determine cluster words for a set of documents. Topic modeling is an unsupervised ML technique that does not require training using training data. Any suitable topic classification, topic modeling, and/or NLP/NLU techniques may be used for the topic analysis such as those discussed herein and/or as discussed in Yoav Goldberg, “Neural Network Methods in Natural Language Processing”, Synthesis Lectures on Human Language Technologies, Lecture #37, Morgan & Claypool (17 Apr. 2017) (hereinafter “[Goldberg]”), which is hereby incorporated by reference in its entirety. - Computers and/or servers associated with
service provider 118, contact segment 124, and the CCM 100 may communicate over the Internet or any other wired or wireless network including local area networks (LANs), wide area networks (WANs), wireless networks, cellular networks, WiFi networks, Personal Area Networks (e.g., Bluetooth® and/or the like), Digital Subscriber Line (DSL) and/or cable networks, and/or the like, and/or any combination thereof. - Some of information objects 112 contain
CCM tags 110 that capture and send network session events 108 (or simply “events 108”) to CCM 100. For example, CCM tags 110 may comprise JavaScript added to webpages of a website (or individual components of a web app or the like). The website downloads the webpages, along with CCM tags 110, to user computers (e.g., computer 230 of FIG. 2). CCM tags 110 (e.g., when executed by a user computer) monitor sessions and send some or all captured session events 108 to CCM 100. - In one example, the CCM tags 110 may intercept or otherwise obtain HTTP messages being sent by and/or sent to a
computer 230, and these HTTP messages may be provided to the CCM 100 as the events 108. In this example, the CCM tags 110 or the CCM 100 may extract or otherwise obtain a network address of the computer 230 from an X-Forwarded-For (XFF) field of the HTTP header, a time and date that the HTTP message was sent from a Date field of the HTTP header, and/or a user agent string contained in a User Agent field of an HTTP header of the HTTP message. The user agent string may indicate the operating system (OS) type/version of the sending device (e.g., a computer 230); system information of the sending device; browser version/type of the sending device; rendering engine version/type of the sending device; a device type of the sending device; as well as other information. In another example, the CCM tags 110 may derive various information from the computer 230 that is not typically included in an HTTP header, such as time zone information, GPS coordinates, screen or display resolution of the computer 230, data from one or more applications operated by the computer 230, and/or other like information. In various implementations, the CCM tags 110 may generate and send events 108 or messages based on the monitored session. For example, the CCM tags 110 may obtain data when various events/triggers are detected, and may send back information (e.g., in additional HTTP messages). Other methods may be used to obtain or derive user information. - In some implementations, the information objects 112 that include
CCM tags 110 may be provided or hosted by a collection of service providers 118 such as, for example, notable business-to-business (B2B) publishers, marketers, agencies, technology providers, research firms, events firms, and/or any other desired entity/org type. This collection of service providers 118 may be referred to as a “data cooperative” or “data co-op.” Additionally or alternatively, events 108 may be collected by one or more other data tracking entities separate from the CCM 100, and provided as one or more datasets to the CCM 100 (e.g., a “bulk” dataset or the like). In one example, the CCM 100 or other data tracking entity may implement one or more event listeners (or event handlers) to detect events 108 and perform some action in response to detecting the events 108 (not shown by FIG. 1) such as any of the actions/processes described herein. An event handler or event listener is a software element/entity that handles received inputs (e.g., events 108). - Here, a “session” may refer to a temporary and interactive information interchange between two or more communicating devices, between a computer and user, and/or between two or more remote devices/systems. Additionally or alternatively, a “session” may refer to a unit of measurement of a user's actions taken within a period of time and/or with regard to completion of a task (these types of sessions may also be referred to as a “visit”). In these implementations, a session may be a time-oriented session based on continuity in user activity or a navigation-oriented session based on continuity in a chain of requested and/or consumed information objects. Time-oriented sessions are based on a set or predefined period of user inactivity (referred to as an “inactivity threshold”). Once this period of inactivity is reached, the session is ended (e.g., the user is assumed to have left a website or stopped using the browser/client entirely).
Additional requests from the same user after the inactivity threshold are considered to be part of a second session. Navigation-oriented sessions are based on a user moving between different information objects, such as by navigating between different webpages of a website using respective hyperlinks. Additionally or alternatively, the sessions may be (or include) predefined sessions such as database sessions, units of work, client or browser sessions, server sessions, remote sessions (e.g., remote desktop sessions, and/or the like), network sessions, web sessions, HTTP sessions, telnet remote login sessions, Session Initiation Protocol (SIP) sessions, Transport Control Protocol (TCP) sessions (e.g., a TCP virtual circuit, a TCP connection, or an established TCP socket), User Datagram Protocol (UDP) sessions, cellular network sessions, and/or other types of sessions.
Events 108 are raised or triggered for one or more operations or actions performed during a session. -
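- The time-oriented sessionization described above can be sketched directly: requests from the same user belong to one session until a gap exceeds the inactivity threshold, after which a new session (a "second session") begins. The 30-minute threshold and request timestamps (in seconds) below are hypothetical.

```python
def sessionize(timestamps, inactivity_threshold=1800):
    """Split a sorted list of request times (seconds) into sessions."""
    sessions, current = [], [timestamps[0]]
    for t in timestamps[1:]:
        if t - current[-1] > inactivity_threshold:
            sessions.append(current)    # gap too long: close the session
            current = [t]
        else:
            current.append(t)
    sessions.append(current)
    return sessions

# Three requests close together, then one two hours later: two sessions.
print([len(s) for s in sessionize([0, 60, 300, 7500])])  # prints "[3, 1]"
```

A navigation-oriented variant would instead close the session when the chain of requested information objects breaks (e.g., the user leaves the website), rather than on elapsed time.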
Individual events 108 are actions or occurrences recognized by a system, device, and/or software element, which may originate asynchronously from an external environment. Eachevent 108 may be or include a piece of information from an underlying framework and/or may represent the availability of data for reading a file or network stream. In various embodiments,individual events 108 identify or indicate information objects 112 and the user accessing the information objects 112. For example, anevent 108 may include a URL or link to aninformation object 112 and may include an identifier (ID) associated with the user that access the indicatedinformation object 112, such as a hashed email address or cookie identifier (ID) associated with the user.Events 108 may also identify an access activity associated with the indicated information objects 112. For example, anevent 108 may indicate that the user viewed a webpage, downloaded an electronic document, registered for a seminar, and how the user completed these actions (e.g., referred from a search engine, a link in an email, typed a URL in a search bar of a web browser, and/or the like). Additionally or alternatively,events 108 may identify various user interactions with information objects 112 such as, for example, topic consumption, scroll velocity, dwell time, and/or other user interactions such as those discussed herein. In one example, thetags 110 may collect anonymized information about a visiting user's network address (e.g., IP address), an anonymized cookie ID, a timestamp of when the user visited or accessed aninformation object 112, and/or geo-location information associated with the user's computing device. In some embodiments, device fingerprinting can be used to track users, while in other embodiments, device fingerprinting may be excluded to preserver user anonymity. 
Additionally or alternatively, events 108 may be or include database session events, work unit events, client or browser session events, server session events, remote session events (e.g., remote desktop session events, and/or the like), network session events, web session events, HTTP session events, telnet remote login session events, SIP session events, TCP session events, UDP session events, cellular network events, and/or other events of other session types. -
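As a rough illustration of the event structure described above, an event 108 can be modeled as a small record carrying a pseudonymized user ID, a link to the information object 112, an event type, and a referrer. The field names below are assumptions made for this sketch, not the CCM's actual schema.

```python
import time
from dataclasses import dataclass, field

@dataclass
class Event:
    # Illustrative fields for an event 108; names are assumptions, not the CCM's schema.
    user_id: str        # hashed email address or anonymized cookie ID
    url: str            # link to the accessed information object 112
    event_type: str     # e.g., "page_view", "doc_download", "seminar_registration"
    referrer: str = ""  # how the user reached the content (search engine, email link, ...)
    timestamp: float = field(default_factory=time.time)

event = Event(
    user_id="9f86d081884c7d65...",  # already-hashed identifier (truncated for display)
    url="https://website1.example/whitepaper.pdf",
    event_type="doc_download",
    referrer="www.searchengine.com",
)
print(event.event_type)  # doc_download
```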
CCM 100 builds user profiles 104 from events 108. User profiles 104 may include anonymous identifiers 105 that associate information objects 112 with particular users. User profiles 104 may also include intent data 106. Intent data 106 includes or indicates insights into users' interests and may include predictions about their potential to take certain actions based on their content consumption. The intent data 106 identifies or indicates topics 102 in information objects 112 accessed by the users. For example, intent data 106 may comprise a user intent vector (e.g., user intent vector 245 of FIG. 2, intent vector 594 of FIG. 5, and/or the like) that identifies or indicates the topics 102 and identifies levels of user interest in the topics 102. - This approach to
intent data 106 collection makes possible a consistent and stable historical baseline for measuring content consumption. This baseline effectively spans the web, delivering data at a scale greater than any single site. In embodiments, the CCM 100 monitors content consumption behavior from a collection of service providers 118 (e.g., the aforementioned data co-op) and applies data science and/or ML techniques to identify changes in activity compared to the historical baselines. As examples, research frequency, depth of engagement, and content relevancy all contribute to measuring an org's interest in topic(s) 102. In some embodiments, the CCM 100 may employ an NLP/NLU engine that reads, deciphers, and understands content across a taxonomy of intent topics 102 that grows on a periodic basis (e.g., monthly, weekly, and/or the like). The NLP/NLU engine may operate or execute the topic analysis models discussed previously. - As mentioned previously,
service provider 118 may want to send an email announcing an electric car seminar to a particular contact segment 124 of users interested in electric cars. Service provider 118 may send information object(s) 114, such as the aforementioned email, to CCM 100, and the CCM 100 identifies topics 102 in information object(s) 114. The CCM 100 compares content topics 102 with the intent data 106, and identifies user profiles 104 that indicate an interest in information object(s) 114. Then, the CCM 100 sends an anonymous contact segment 116 to service provider 118, which includes anonymized or pseudonymized identifiers 105 associated with the identified user profiles 104. In some embodiments, the CCM 100 includes an anonymizer or pseudonymizer, which is the same or similar to anonymizer 122, to anonymize or pseudonymize user identifiers. -
Contact list 120 may include personally identifying information (PII) and/or personal data such as email addresses, names, phone numbers, or some other user identifier(s), or any combination thereof. Additionally or alternatively, the contact list 120 may include sensitive data and/or confidential information. The personal, sensitive, and/or confidential data in contact list 120 are anonymized or pseudonymized or otherwise de-identified by an anonymizer 122. - The
anonymizer 122 may anonymize or pseudonymize any personal, sensitive, and/or confidential data using any number of data anonymization or pseudonymization techniques including, for example, data encryption, substitution, shuffling, number and date variance, and nulling out specific fields or data sets. Data encryption is an anonymization or pseudonymization technique that replaces personal/sensitive/confidential data with encrypted data. A suitable hash algorithm may be used as an anonymization or pseudonymization technique in some embodiments. Anonymization is a type of information sanitization technique that removes personal, sensitive, and/or confidential data from data or datasets so that the person or information described or indicated by the data/datasets remain anonymous. Pseudonymization is a data management and de-identification procedure by which personal, sensitive, and/or confidential data within information objects (e.g., fields and/or records, data elements, documents, and/or the like) is/are replaced by one or more artificial identifiers, or pseudonyms. In most pseudonymization mechanisms, a single pseudonym is provided for each replaced data item or a collection of replaced data items, which makes the data less identifiable while remaining suitable for data analysis and data processing. Although “anonymization” and “pseudonymization” refer to different concepts, these terms may be used interchangeably throughout the present disclosure. - The
service provider 118 compares the anonymized/pseudonymized identifiers (e.g., hashed identifiers) from contact list 120 with the anonymous identifiers 105 in anonymous contact segment 116. Any matching identifiers are identified as contact segment 124. Service provider 118 identifies the unencrypted email addresses in contact list 120 associated with contact segment 124. Service provider 118 sends information object(s) 114 to the addresses (e.g., email addresses) identified for contact segment 124. For example, service provider 118 may send an email announcing the electric car seminar to contact segment 124. - Sending information object(s) 114 to contact segment 124 may generate a substantial lift in the number of positive responses 126. For example, assume service provider 118 wants to send emails announcing early bird specials for the upcoming seminar. The seminar may include ten different tracks, such as electric cars, environmental issues, renewable energy, and/or the like. In the past, service provider 118 may have sent ten different emails for each separate track to everyone in contact list 120. -
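The hash-and-match flow described in the preceding paragraphs can be sketched as follows: the service provider hashes each address in its contact list 120 and keeps the ones whose digests appear in the anonymous contact segment 116, so plaintext PII never has to leave the provider's side. SHA-256 and the sample addresses here are illustrative assumptions.

```python
import hashlib

def sha256_hex(s: str) -> str:
    # Normalize before hashing so the same address always yields the same digest.
    return hashlib.sha256(s.strip().lower().encode()).hexdigest()

# Service provider's contact list 120 (PII stays on the provider's side).
contact_list = {
    "alice@org_y.com": "Alice",
    "bob@org_z.com": "Bob",
    "carol@org_y.com": "Carol",
}

# Anonymous contact segment 116 received from the CCM (hashed identifiers only).
anonymous_segment = {sha256_hex("alice@org_y.com"), sha256_hex("carol@org_y.com")}

# Hash each contact-list address and keep the ones that appear in the segment.
contact_segment = [email for email in contact_list
                   if sha256_hex(email) in anonymous_segment]
print(sorted(contact_segment))  # ['alice@org_y.com', 'carol@org_y.com']
```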
Service provider 118 may now only send the email regarding the electric car track to contacts identified in contact segment 124. The number of positive responses 126 registering for the electric car track of the seminar may substantially increase since content 114 is now directed to users interested in electric cars. - In another example,
CCM 100 may provide local ad campaign or email segmentation. For example, CCM 100 may provide a "yes" or "no" as to whether a particular advertisement should be shown to a particular user. In this example, CCM 100 may use the hashed data without re-identification of users, and the "yes/no" action recommendation may key off of a de-identified hash value. -
CCM 100 may revitalize cold contacts in service provider contact list 120. CCM 100 can identify the users in contact list 120 that are currently accessing other information objects 112 and identify the topics associated with information objects 112. By monitoring accesses to information objects 112, CCM 100 may identify current user interests even though those interests may not align with the content currently provided by service provider 118. Service provider 118 might reengage the cold contacts by providing content 114 more aligned with the most relevant topics identified in information objects 112. -
FIG. 2 is a diagram explaining the content consumption manager in more detail. A user may enter a search query 232 into a computer 230, for example, via a search engine. The computer 230 may include any communication and/or processor circuitry including but not limited to desktop computers, workstations, laptop computers, smartphones, tablet computers, wearable devices, servers, smart appliances, network appliances, and/or the like, or any combination thereof. The user may work for an organization Y (org_Y). For example, the user may have an associated email address: user@org_y.com. - In response to
search query 232, the search engine may display links or other references to information objects 112A and 112B. Website1 may download a webpage to computer 230 that includes a link to information object 112A, which may be a white paper in this example. Website1 may include one or more webpages with CCM tags 110A that capture different events 108 during a network session (e.g., web session) between website1 and computer 230 (or between website1 and the client app operated by computer 230). Website1 or another website may have downloaded a cookie onto a web browser operating on computer 230. The cookie may comprise an identifier X, such as a unique alphanumeric set of characters associated with the web browser on computer 230. - During the session with website1, the user of
computer 230 may click on a link to white paper 112A. In response to the mouse click, CCM tag 110A may download an event 108A to CCM 100. Event 108A may identify the cookie identifier X loaded on the web browser of computer 230. In addition, or alternatively, CCM tag 110A may capture a user name and/or email address entered into one or more webpage fields during the session. CCM tag 110 hashes the email address and includes the hashed email address in event 108A. Any identifier associated with the user is referred to generally as user X or user ID. - CCM tag 110A may also include a link in
event 108A to the white paper downloaded from website1 to computer 230. For example, CCM tag 110A may capture the URL for white paper 112A. CCM tag 110A may also include an event type identifier in event 108A that identifies an action or activity associated with information object 112A. For example, CCM tag 110A may insert an event type identifier into event 108A that indicates the user downloaded an electronic document. - CCM tag 110A may also identify the launching platform for accessing
information object 112B. For example, CCM tag 110B may identify a link www.searchengine.com to the search engine used for accessing website1. - An
event profiler 240 in CCM 100 forwards the URL identified in event 108A to a content analyzer 242. Content analyzer 242 generates a set of topics 236 associated with or suggested by white paper 112A. For example, topics 236 may include electric cars, cars, smart cars, electric batteries, and/or the like. Each topic 236 may have an associated relevancy score indicating the relevancy of the topic in white paper 112A. Content analyzers that identify topics in documents are known to those skilled in the art and are therefore not described in further detail. -
Event profiler 240 forwards the user ID, topics 236, event type, and any other data from event 108A to event processor 244. Event processor 244 may store personal information captured in event 108A in a personal database 248. For example, during the session with website1, the user may have entered an employer company name into a webpage form field. CCM tag 110A may copy the employer company name into event 108A. Alternatively, CCM 100 may identify the company name from a domain name of the user email address. -
Event processor 244 may store other demographic information from event 108A in personal database 248, such as user job title, age, sex, geographic location (postal address), and/or the like. In one example, some of the information in personal database 248 is hashed, such as the user ID and/or any other personally identifiable information. Other information in personal database 248 may be anonymous to any specific user, such as org name and job title. -
Event processor 244 builds a user intent vector 245 from topic vectors (e.g., the set of topics 236 and/or the like). Event processor 244 continuously updates user intent vector 245 based on other received events 108. For example, the search engine may display a second link to website2 in response to search query 232. User X may click on the second link and website2 may download a webpage to computer 230 announcing the seminar on electric cars. - The webpage downloaded by website2 may also include a CCM tag 110B. User X may register for the seminar during the session with website2. CCM tag 110B may generate a
second event 108B that includes the user ID: X, a URL link to the webpage announcing the seminar, and an event type indicating the user registered for the electric car seminar advertised on the webpage. - CCM tag 110B sends
event 108B to CCM 100. Content analyzer 242 generates a second set of topics 236. Event 108B may contain additional personal information associated with user X. Event processor 244 may add the additional personal information to personal database 248. -
Event processor 244 updates user intent vector 245 based on the second set of topics 236 identified for event 108B. Event processor 244 may add new topics to user intent vector 245 or may change the relevancy scores for existing topics. For example, topics identified in both event 108A and event 108B may be assigned higher relevancy scores. Event processor 244 may also adjust relevancy scores based on the associated event type identified in events 108. -
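A minimal sketch of how an event processor might fold successive events into a running intent vector, assuming, as a simplification, that each event contributes its topic relevancies scaled by an event weight. The topic names, relevancy values, and weights below are invented for illustration.

```python
from collections import defaultdict

def update_intent_vector(intent, topics, weight):
    """Fold one event's topic relevancies into a running user intent vector.
    Topics seen in repeated events accumulate higher scores; the weight reflects
    how assertive the event was (a simplifying assumption, not the CCM's exact rule)."""
    for topic, relevancy in topics.items():
        intent[topic] += relevancy * weight
    return intent

intent = defaultdict(float)
# Event 108A: white paper download (more assertive, higher weight).
update_intent_vector(intent, {"electric cars": 0.8, "cars": 0.5}, weight=0.5)
# Event 108B: seminar registration.
update_intent_vector(intent, {"electric cars": 0.9, "seminars": 0.3}, weight=0.7)

# "electric cars" appears in both events, so its score accumulates.
print(round(intent["electric cars"], 2))  # 1.03
```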
Service provider 118 may submit a search query 254 to CCM 100 via a user interface 252 on a computer 255. For example, search query 254 may ask "who is interested in buying electric cars?" A transporter 250 in CCM 100 searches user intent vectors 245 for electric car topics with high relevancy scores. Transporter 250 may identify user intent vector 245 for user X. Transporter 250 identifies user X and other users A, B, and C interested in electric cars in search results 256. - As mentioned above, the user IDs may be hashed and
CCM 100 may not know the actual identities of users X, A, B, and C. CCM 100 may provide a segment of hashed user IDs X, A, B, and C to service provider 118 in response to query 254. -
Service provider 118 may have a contact list 120 of users (see e.g., FIG. 1). Service provider 118 may hash email addresses in contact list 120 and compare the hashed identifiers with the encrypted or hashed user IDs X, A, B, and C. Service provider 118 identifies the unencrypted email address for matching user identifiers. Service provider 118 then sends information related to electric cars to the email addresses of the identified user segment. For example, service provider 118 may send emails containing white papers, advertisements, articles, announcements, seminar notifications, or the like, or any combination thereof. -
CCM 100 may provide other information in response to search query 254. For example, event processor 244 may aggregate user intent vectors 245 for users employed by the same company Y into an org intent vector. The org intent vector for org Y may indicate a strong interest in electric cars. Accordingly, CCM 100 may identify org Y in search results 256. By aggregating user intent vectors 245, CCM 100 can identify the intent of a company or other category without disclosing any specific user personal information (e.g., without revealing a user's online browsing activity). -
CCM 100 continuously receives events 108 for different third party content. Event processor 244 may aggregate events 108 for a particular time period, such as for a current day, for the past week, or for the past 30 days. Event processor 244 then may identify trending topics 258 within that particular time period. For example, event processor 244 may identify the topics with the highest average relevancy values over the last 30 days. -
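The trending-topic computation just described, ranking topics by their highest average relevancy over a 30-day window, might look like the following; the sample events and the tuple layout are assumptions made for the sketch.

```python
from collections import defaultdict
from statistics import mean

# (day_offset, topic, relevancy) tuples for recent events; sample data only.
events = [
    (1, "electric cars", 0.9), (3, "electric cars", 0.7),
    (2, "cloud computing", 0.4), (29, "cloud computing", 0.6),
    (40, "finance", 0.9),  # outside the 30-day window, ignored
]

def trending(events, window_days=30, top_n=2):
    # Collect relevancy values per topic within the window, then rank by average.
    by_topic = defaultdict(list)
    for day, topic, relevancy in events:
        if day <= window_days:
            by_topic[topic].append(relevancy)
    averages = {t: mean(vals) for t, vals in by_topic.items()}
    return sorted(averages, key=averages.get, reverse=True)[:top_n]

print(trending(events))  # ['electric cars', 'cloud computing']
```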
Different filters 259 may be applied to the intent data stored in event database 246. For example, filters 259 may direct event processor 244 to identify users in a particular company Y that are interested in electric cars. In another example, filters 259 may direct event processor 244 to identify companies with fewer than 200 employees that are interested in electric cars. -
Filters 259 may also direct event processor 244 to identify users with a particular job title that are interested in electric cars or identify users in a particular city that are interested in electric cars. CCM 100 may use any demographic information in personal database 248 for filtering query 254. -
CCM 100 monitors content accessed from multiple different third party websites. This allows CCM 100 to better identify the current intent for a wider variety of users, companies, or any other demographics. CCM 100 may use hashed and/or other anonymous identifiers to maintain user privacy. CCM 100 further maintains user anonymity by identifying the intent of generic user segments, such as companies, marketing groups, geographic locations, or any other user demographics. -
FIG. 3 depicts example operations performed by CCM tags 110. In operation 370, a service provider 118 provides a list of form fields 374A, 374B for monitoring on webpages 376A, 376B. In operation 372, CCM tags 110 are generated and loaded in webpages 376A, 376B on the service provider's 118 website. For example, CCM tag 110A is loaded onto a first webpage 376A of the service provider's 118 website and a CCM tag 110B is loaded onto a second webpage 376B of the service provider's 118 website. In one example, CCM tags 110 comprise JavaScript loaded into the webpage document object model (DOM). - The
service provider 118 may download webpages 376A, 376B, along with CCM tags 110, to user computers (e.g., computer 230 of FIG. 2) during sessions. Additionally or alternatively, the CCM tags 110 may be executed when the user computers access and/or load the webpages 376A, 376B (e.g., within a browser, mobile app, or other client application). CCM tag 110A captures the data entered into some of form fields 374A and CCM tag 110B captures data entered into some of form fields 374B. - A user enters information into form fields 374A and 374B during the session. For example, the user may enter an email address into one of form fields 374A during a user registration process or a shopping cart checkout process. CCM tags 110 may capture the email address in
operation 378, validate and hash the email address, and then send the hashed email address to CCM 100 in event 108. - CCM tags 110 may first confirm the email address includes a valid domain syntax and then use a hash algorithm to encode the valid email address string. CCM tags 110 may also capture other anonymous user identifiers, such as a cookie identifier. If no identifiers exist,
CCM tag 110 may create a unique identifier. Other data may be captured as well, such as client app data, data mined from other applications, and/or other data from the user computers. - CCM tags 110 may capture any information entered into fields 374. For example, CCM tags 110 may also capture user demographic data, such as organization (org) name, age, sex, postal address, and/or the like. In one example, CCM tags 110 capture some of the information for service
provider contact list 120. - CCM tags 110 may also identify information object 112 and associated event activities in
operation 378. For example, CCM tag 110A may detect a user downloading the white paper 112A or registering for a seminar (e.g., through an online form or the like hosted by website1 or some other website or web app). CCM tag 110A captures the URL for white paper 112A and generates an event type identifier that identifies the event as a document download. - Depending on the application,
CCM tag 110 in operation 378 sends the captured web session information in event 108 to service provider 118 and/or to CCM 100. For example, event 108 is sent to service provider 118 when CCM tag 110 is used for generating service provider contact list 120. In another example, the event 108 is sent to CCM 100 when CCM tag 110 is used for generating intent data. - CCM tags 110 may capture session information in response to the user leaving webpage 376, exiting one of form fields 374, selecting a submit icon, mousing out of one of form fields 374, mouse clicks, an off focus, and/or any other user action. Note again that
CCM 100 might never receive personally identifiable information (PII) since any PII data in event 108 is hashed by CCM tag 110. -
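The validate-then-hash step a tag might perform before emitting an event can be sketched as below. The regular expression is a deliberately loose stand-in for whatever domain-syntax check the real tag applies; SHA-256 is one suitable hash algorithm, not necessarily the one used.

```python
import hashlib
import re

# Loose syntax check: something@something.something. A real tag's rules may differ.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def hash_if_valid(email: str):
    """Return a SHA-256 digest of the normalized address, or None if the
    syntax is invalid, so plaintext PII never leaves the page."""
    email = email.strip().lower()
    if not EMAIL_RE.match(email):
        return None
    return hashlib.sha256(email.encode()).hexdigest()

assert hash_if_valid("not-an-email") is None
digest = hash_if_valid("User@Org_Y.com")
print(len(digest))  # 64 hex characters; the same input always yields the same digest
```

Normalizing (strip and lowercase) before hashing matters: without it, "User@Org_Y.com" and "user@org_y.com" would produce different digests and never match during segment comparison.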
FIG. 4 is a diagram showing how the CCM generates intent data 106. As mentioned previously, a CCM tag 110 may send a captured raw event 108 to CCM 100. For example, the CCM tag 110 may send event 108 to CCM 100 in response to a user downloading a white paper. In this example, the event 108 may include a timestamp indicating when the white paper was downloaded, an identifier (ID) for event 108, a user ID associated with the user that downloaded the white paper, a URL for the downloaded white paper, and a network address for the launching platform for the content. Event 108 may also include an event type indicating, for example, that the user downloaded an electronic document. -
Event profiler 240 and event processor 244 may generate intent data 106 from one or more events 108. Intent data 106 may be stored in a structured query language (SQL) database or non-SQL database. In one example, intent data 106 is stored in user profile 104A and includes a user ID 452 and associated event data 454 (including event data 454A, 454B, and 454C). - Event data 454A is associated with a user downloading a white paper.
Event profiler 240 identifies a car topic 402 and a fuel efficiency topic 402 in the white paper. Event profiler 240 may assign a 0.5 relevancy value to the car topic and assign a 0.6 relevancy value to the fuel efficiency topic 402. -
Event processor 244 may assign a weight value 464 to event data 454A. Event processor 244 may assign a larger weight value 464 to more assertive events, such as downloading the white paper. Event processor 244 may assign a smaller weight value 464 to less assertive events, such as viewing a webpage. Event processor 244 may assign other weight values 464 for viewing or downloading different types of media, such as text, video, audio, electronic books, online magazines and newspapers, and/or the like. -
CCM 100 may receive a second event 108 for a second piece of content accessed by the same user. CCM 100 generates and stores event data 454B for the second event 108 in user profile 104A. Event profiler 240 may identify a first car topic with a relevancy value of 0.4 and identify a second cloud computing topic with a relevancy value of 0.8 for the content associated with event data 454B. Event processor 244 may assign a weight value of 0.2 to event data 454B. -
CCM 100 may receive a third event 108 for a third piece of content accessed by the same user. CCM 100 generates and stores event data 454C for the third event 108 in user profile 104A. Event profiler 240 identifies a first topic associated with electric cars with a relevancy value of 1.2 and identifies a second topic associated with batteries with a relevancy value of 0.8. Event processor 244 may assign a weight value of 0.4 to event data 454C. - Event data 454 and associated weighting values 464 may provide a better indicator of user interests/intent. For example, a user may complete forms on a service provider website indicating an interest in cloud computing. However,
CCM 100 may receive events 108 for third party content accessed by the same user. Events 108 may indicate the user downloaded a white paper discussing electric cars and registered for a seminar related to electric cars. -
CCM 100 generates intent data 106 based on received events 108. Relevancy values 466 in combination with weighting values 464 may indicate the user is highly interested in electric cars. Even though the user indicated an interest in cloud computing on the service provider website, CCM 100 determined from the third party content that the user was actually more interested in electric cars. -
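Using the FIG. 4 numbers, the weighted combination of relevancy values 466 and weight values 464 can be reproduced in a few lines. The 0.2 and 0.4 weights come from the text; the 0.5 weight assumed here for the white paper download is an illustration, since the text only says downloads get a larger weight.

```python
# Event data 454 from FIG. 4: ({topic: relevancy 466}, weight 464).
# The download weight 0.5 is an assumption; 0.2 and 0.4 come from the text.
events = [
    ({"cars": 0.5, "fuel efficiency": 0.6}, 0.5),    # 454A: white paper download
    ({"cars": 0.4, "cloud computing": 0.8}, 0.2),    # 454B: second content piece
    ({"electric cars": 1.2, "batteries": 0.8}, 0.4), # 454C: third content piece
]

intent = {}
for topics, weight in events:
    for topic, relevancy in topics.items():
        intent[topic] = intent.get(topic, 0.0) + relevancy * weight

# Behavioral signals outweigh the declared interest in cloud computing.
top = max(intent, key=intent.get)
print(top, round(intent[top], 2))  # electric cars 0.48
```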
CCM 100 may store other personal user information from events 108 in user profile 104B. For example, event processor 244 may store third party identifiers 460 and attributes 462 associated with user ID 452. Third party identifiers 460 may include user names or any other identifiers used by third parties for identifying user 452. Attributes 462 may include an org name (e.g., employer company name), org size, country, job title, hashed domain name, and/or hashed email addresses associated with user ID 452. Attributes 462 may be combined from different events 108 received from different websites accessed by the user. CCM 100 may also obtain different demographic data in user profile 104 from third party data sources (whether sourced online or offline). - An aggregator may use user profile 104 to update and/or aggregate intent data for different segments, such as service provider contact lists, companies, job titles, and/or the like. The aggregator may also create snapshots of
intent data 106 for selected time periods. -
Event processor 244 may generate intent data 106 for both known and unknown users. For example, the user may access a webpage and enter an email address into a form field in the webpage. A CCM tag 110 captures and hashes the email address and associates the hashed email address with user ID 452. - The user may not enter an email address into a form field. Alternatively, the
CCM tag 110 may capture an anonymous cookie ID in event 108. Event processor 244 then associates the cookie ID with user identifier 452. The user may clear the cookie or access data on a different computer. Event processor 244 may generate a different user identifier 452 and new intent data 106 for the same user. -
-
CCM 100 may separately analyze intent data 106 for the different anonymous user IDs. If the user ever fills out a form providing an email address, event processor then may re-associate the different intent data 106 with the same user identifier 452. -
FIG. 5 depicts an example of how the CCM 100 generates a user intent vector 594 from the event data described previously in FIG. 4. The user intent vector 594 may be the same or similar to user intent vector 245 of FIG. 2. A user may use computer 530 (which may be the same or similar to the computer 230 of FIG. 2) to access different information objects 582 (including information objects 582A, 582B, and 582C, and also referred to as "content 582"). For example, the user may download an information object 582A (e.g., a white paper associated with storage virtualization), register for a network security seminar on an information object 582B (e.g., a webpage with form fields), and view an information object 582C (e.g., a webpage article related to virtual private networks (VPNs)). As examples, information objects 582A, 582B, and 582C may come from the same website or come from different websites. - The CCM tags 110 capture three
events 108 associated with content 582A, 582B, and 582C. CCM 100 identifies topics 586 in content 582. Topics 586 include virtual storage, network security, and VPNs. CCM 100 assigns relevancy values 590 to topics 586 based on known algorithms. For example, relevancy values 590 may be assigned based on the number of times different associated keywords are identified in content 582. -
CCM 100 assigns weight values 588 to content 582 based on the associated event activity. For example, CCM 100 assigns a relatively high weight value of 0.7 to a more assertive off-line activity, such as registering for the network security seminar. CCM 100 assigns a relatively low weight value of 0.2 to a more passive on-line activity, such as viewing the VPN webpage. -
CCM 100 generates a user intent vector 594 in user profile 104 based on the relevancy values 590. For example, CCM 100 may multiply relevancy values 590 by the associated weight values 588. CCM 100 then may sum together the weighted relevancy values for the same topics to generate user intent vector 594. -
CCM 100 uses intent vector 594 to represent a user, represent content accessed by the user, represent user access activities associated with the content, and effectively represent the intent/interests of the user. In another embodiment, CCM 100 may assign each topic in user intent vector 594 a binary score of 1 or 0. CCM 100 may use other techniques for deriving user intent vector 594. For example, CCM 100 may weigh the relevancy values based on timestamps. -
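Two of the alternative derivations just mentioned, binary topic scores and timestamp-based weighting, can be sketched as follows. The exponential half-life decay and the sample relevancy values are assumptions; the disclosure only says relevancy values may be weighed based on timestamps.

```python
import time

def binary_vector(intent, threshold=0.5):
    # Alternative scoring from the text: each topic becomes 1 or 0.
    return {t: 1 if v >= threshold else 0 for t, v in intent.items()}

def time_decayed(relevancy, event_ts, now, half_life_days=30.0):
    # Assumed exponential decay: an event loses half its weight every half-life.
    age_days = (now - event_ts) / 86400.0
    return relevancy * 0.5 ** (age_days / half_life_days)

intent = {"virtual storage": 0.35, "network security": 0.7, "VPNs": 0.2}
print(binary_vector(intent))  # {'virtual storage': 0, 'network security': 1, 'VPNs': 0}

now = time.time()
old = now - 30 * 86400  # an event from 30 days ago counts half as much
print(round(time_decayed(0.8, old, now), 2))  # 0.4
```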
FIG. 6 depicts an example of how the CCM 100 segments users. CCM 100 may generate user intent vectors 594 as described previously. A service provider 118 may want to email content 698 to a segment of interested users. The service provider submits content 698 to CCM 100. CCM 100 identifies topics 586 and associated relevancy values 600 in content Z 698. -
CCM 100 may use any variety of different algorithms to identify a segment of user intent vectors 594 associated with content 698. For example, relevancy value 600B indicates content 698 is primarily related to network security. CCM 100 may identify any user intent vectors 594 that include a network security topic with a relevancy value above a given threshold value. - In this example, assume the relevancy value threshold for the network security topic is 0.5.
CCM 100 identifies user intent vector 594A as part of the segment of users satisfying the threshold value. Accordingly, CCM 100 sends the service provider of content 698 a contact segment that includes the user ID associated with user intent vector 594A. As mentioned above, the user ID may be a hashed email address, cookie ID, or some other encrypted or unencrypted identifier associated with the user. - In another example,
CCM 100 calculates vector cross products between user intent vectors 594 and content 698. Any user intent vectors 594 that generate a cross product value above a given threshold value are identified by CCM 100 and sent to the service provider 118. -
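Both segmentation rules above, the per-topic threshold and the whole-vector score, can be sketched as below. The text calls the score a "vector cross product"; over a shared topic basis it behaves like a scalar (dot) product, which is what this sketch computes. All names and values are illustrative.

```python
def score(u, v):
    # Scalar product over shared topics; the text's "vector cross product"
    # is treated here as a dot product for ranking purposes.
    return sum(u[t] * v.get(t, 0.0) for t in u)

content_topics = {"network security": 0.9, "VPNs": 0.3}  # topic vector for content Z 698
user_vectors = {
    "user_A": {"network security": 0.6, "virtual storage": 0.4},
    "user_B": {"e-commerce": 0.8},
}

# Rule 1: single-topic relevancy above a threshold (0.5 in the example).
segment = [uid for uid, vec in user_vectors.items()
           if vec.get("network security", 0.0) >= 0.5]
print(segment)  # ['user_A']

# Rule 2: score the whole intent vector against the content's topic vector.
scores = {uid: score(content_topics, vec) for uid, vec in user_vectors.items()}
print(round(scores["user_A"], 2))  # 0.54
```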
FIG. 7 depicts examples of how the CCM 100 aggregates intent data 106. In this example, a service provider 118 operating a computer 702 (which may be the same or similar to computer 230 and computer 530 of FIGS. 2 and 5) submits a search query 704 to CCM 100 asking what companies are interested in electric cars. In this example, CCM 100 associates five different topics 586 with user profiles 104. Topics 586 include storage virtualization, network security, electric cars, e-commerce, and finance. -
CCM 100 generates user intent vectors 594 as described previously in FIG. 6. User intent vectors 594 have associated personal information, such as a job title 707 and an org (e.g., employer company) name 710. As explained above, users may provide personal information, such as employer name and job title, in form fields when accessing a service provider 118 or third party website. - The CCM tags 110 described previously capture and send the job title and employer name information to
CCM 100. CCM 100 stores the job title and employer information in the associated user profile 104. CCM 100 searches user profiles 104 and identifies three user intent vectors 594A, 594B, and 594C with the same employer name 710. CCM 100 determines that user intent vectors 594A and 594B are associated with analyst job titles and that user intent vector 594C is associated with a job title of VP of finance. - In response to, or prior to,
search query 704, CCM 100 generates a company intent vector 712A for company X. CCM 100 may generate company intent vector 712A by summing up the topic relevancy values for all of the user intent vectors 594 associated with company X. - In response to
search query 704, CCM 100 identifies any company intent vectors 712 that include an electric car topic 586 with a relevancy value greater than a given threshold. For example, CCM 100 may identify any companies with relevancy values greater than 4.0. In this example, CCM 100 identifies Org X in search results 706. - In one example, intent is identified for a company at a particular zip code, such as zip code 11201.
CCM 100 may take customer-supplied offline data, such as from a Customer Relationship Management (CRM) database, and identify the users that match the company and zip code 11201 to create a segment. - In another example,
service provider 118 may enter a query 705 asking which companies are interested in a document (DOC 1) related to electric cars. Computer 702 submits query 705 and DOC 1 to CCM 100. CCM 100 generates a topic vector for DOC 1 and compares the DOC 1 topic vector with all known company intent vectors 712. -
CCM 100 may identify an electric car topic in DOC 1 with a high relevancy value and identify company intent vectors 712 with an electric car relevancy value above a given threshold. In another example, CCM 100 may perform a vector cross product between the DOC 1 topics and different company intent vectors 712. CCM 100 may identify the names of any companies with vector cross product values above a given threshold value and display the identified company names in search results 706. -
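Aggregating user intent vectors 594 into company intent vectors 712 and applying the 4.0 electric-car threshold from the example might look like this; the per-user relevancy values below are assumptions.

```python
from collections import defaultdict

# User intent vectors 594 with an attached org name; relevancy values are assumed.
users = [
    ("org_X", {"electric cars": 2.1, "finance": 0.4}),
    ("org_X", {"electric cars": 2.3, "storage virtualization": 0.9}),
    ("org_Z", {"e-commerce": 1.5}),
]

# Sum the topic relevancy values of every user vector belonging to the same org.
company_vectors = defaultdict(lambda: defaultdict(float))
for org, vec in users:
    for topic, score in vec.items():
        company_vectors[org][topic] += score

# Companies whose electric-car relevancy exceeds the 4.0 threshold from the text.
interested = [org for org, vec in company_vectors.items()
              if vec["electric cars"] > 4.0]
print(interested)  # ['org_X']
```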
CCM 100 may assign weight values 708 for different job titles. For example, an analyst may be assigned a weight value of 1.0 and a vice president (VP) may be assigned a weight value of 7.0. Weight values 708 may reflect purchasing authority associated with job titles 707. For example, a VP of finance may have higher authority for purchasing electric cars than an analyst. Weight values 708 may vary based on the relevance of the job title to the particular topic. For example, CCM 100 may assign an analyst a higher weight value 708 for research topics. -
CCM 100 may generate a weighted company intent vector 712B based on weighting values 708. For example, CCM 100 may multiply the relevancy values for user intent vector 594C by weighting value 3.0. The weighted topic relevancy values for the user intent vectors are then combined into weighted company intent vector 712B. -
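The job-title weighting might be sketched as below. The weight table, relevancy values, and function names are assumptions for illustration only, with the 3.0 multiplier taken from the example above.

```python
# Hypothetical sketch of weighted company intent vector 712B: each user intent
# vector is scaled by a job-title weight 708 before summing.

def weighted_company_intent(titled_vectors, job_weights, default_weight=1.0):
    """Sum user intent vectors after scaling each by its job-title weight."""
    company = {}
    for job_title, vec in titled_vectors:
        w = job_weights.get(job_title, default_weight)
        for topic, relevancy in vec.items():
            company[topic] = company.get(topic, 0.0) + w * relevancy
    return company

weights_708 = {"analyst": 1.0, "VP of finance": 3.0}  # example weights (3.0 per the text)
vec_712b = weighted_company_intent(
    [("analyst", {"electric cars": 2.0}),
     ("VP of finance", {"electric cars": 1.5})],
    weights_708,
)
# vec_712b["electric cars"] == 2.0*1.0 + 1.5*3.0 == 6.5
```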
CCM 100 may aggregate together intent vectors for other categories, such as job title. For example, CCM 100 may aggregate together all the user intent vectors 594 with VP of finance job titles into a VP of finance intent vector 714. Intent vector 714 identifies the topics of interest to VPs of finance. -
CCM 100 may also perform searches based on job title or any other category. For example, service provider 118 may enter a query LIST VPs OF FINANCE INTERESTED IN ELECTRIC CARS? The CCM 100 identifies all of the user intent vectors 594 with associated VP of finance job titles 707. CCM 100 then segments the group of user intent vectors 594 with electric car topic relevancy values above a given threshold value. -
CCM 100 may generate composite profiles 716. Composite profiles 716 may contain specific information provided by a particular service provider 118 or entity. For example, a first service provider 118 may identify a user as VP of finance and a second service provider 118 may identify the same user as VP of engineering. Composite profiles 716 may include other service provider 118 provided information, such as company size, company location, and company domain. -
CCM 100 may use a first composite profile 716 when providing user segmentation for the first service provider 118. The first composite profile 716 may identify the user job title as VP of finance. CCM 100 may use a second composite profile 716 when providing user segmentation for the second service provider 118. The second composite profile 716 may identify the job title for the same user as VP of engineering. Composite profiles 716 are used in conjunction with user profiles 104 derived from other third party content. - In yet another example,
CCM 100 may segment users based on event type. For example, CCM 100 may identify all the users that downloaded a particular article, or identify all of the users from a particular company that registered for a particular seminar. 3. CONSUMPTION SCORING -
FIG. 8 depicts an example consumption score generator 800 used in CCM 100. As explained above, CCM 100 may receive multiple events 108 associated with different information objects 112. For example, users may use client apps (e.g., web browsers, or any other application) to access or view information objects 112 from different resources (e.g., on different websites). The information objects 112 may include any webpage, electronic document, article, advertisement, or any other information viewable or audible by a user such as those discussed herein. In this example, information objects 112 may include a webpage article or a document related to network firewalls. -
CCM tag 110 may capture events 108 identifying information objects 112 accessed by a user during a network or application session. For example, events 108 may include various event data such as an identifier (ID) (e.g., a user ID (userId), an application session ID, a network session ID, a device ID, a product ID, electronic product code (EPC), serial number, RFID tag ID, and/or the like), URL, network address (NetAdr), event type (eventType), and a timestamp (TS). The ID field may carry any suitable identifier associated with a user and/or user device, a network session, an application, an app session, an app instance, an app-generated identifier, and/or a CCM tag 110 generated identifier. For example, when a user ID is used, the user ID may be a unique identifier for a specific user on a specific client app and/or a specific user device. Additionally or alternatively, the userId may be or include one or more of a user ID (UID) (e.g., a positive integer assigned to a user by a Unix-like OS), effective user ID (euid), file system user ID (fsuid), saved user ID (suid), real user ID (ruid), a cookie ID, a realm name, domain ID, logon user name, network credentials, social media account name, session ID, and/or any other like identifier associated with a particular user or device. The URL may be links, resource identifiers (e.g., Uniform Resource Identifiers (URIs)), or web addresses of information objects 112 accessed by the user during the session. - The NetAdr field includes any identifier associated with a network node.
As examples, the NetAdr field may include any suitable network address (or combination of network addresses) such as an internet protocol (IP) address in an IP network (e.g., IP version 4 (IPv4), IP version 6 (IPv6), and/or the like), a telephone number in a public switched telephone network (PSTN), a cellular network address (e.g., international mobile subscriber identity (IMSI), mobile subscriber ISDN number (MSISDN), Subscription Permanent Identifier (SUPI), Temporary Mobile Subscriber Identity (TMSI), Globally Unique Temporary Identifier (GUTI), Generic Public Subscription Identifier (GPSI), and/or the like), an internet packet exchange (IPX) address, an X.25 address, an X.21 address, a port number (e.g., when using Transmission Control Protocol (TCP) or User Datagram Protocol (UDP)), a media access control (MAC) address, an Electronic Product Code (EPC) as defined by the EPCglobal Tag Data Standard, a Bluetooth hardware device address (BD_ADDR), a Universal Resource Locator (URL), an email address, and/or the like. The NetAdr may be for a network device used by the user to access a network (e.g., the Internet, an enterprise network, and/or the like) and information objects 112.
- As explained previously, the event type may identify an action or activity associated with information objects 112. In this example, the event type may indicate the user downloaded an electronic document or displayed a webpage. The timestamp (TS) may identify a date and/or time the user accessed information objects 112, and may be included in the TS field in any suitable timestamp format such as those defined by ISO 8601 or the like.
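For illustration, an event 108 carrying the fields above might be modeled as a simple record. The field names and values here are assumptions, not the actual on-the-wire format used by a CCM tag 110:

```python
from dataclasses import dataclass

@dataclass
class Event:
    """One event 108 captured by a CCM tag 110 (field names are illustrative)."""
    event_id: str     # ID field: user/session/device/product identifier
    url: str          # URL of the information object 112 accessed
    net_adr: str      # NetAdr: network address of the accessing device
    event_type: str   # e.g., "page_view" or "download"
    timestamp: str    # ISO 8601 timestamp, per the text

ev = Event("u-123", "https://example.com/firewalls", "203.0.113.7",
           "download", "2023-02-13T10:15:00Z")
```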
- Consumption score generator (CSG) 800 may access a NetAdr-Org database 806 to identify a company/entity and location 808 associated with NetAdr 804 in event 108. In one example, the NetAdr-Org database 806 may be an IP/company database 806 when the NetAdr is a network address and the Orgs are entities such as companies, enterprises, and/or the like. For example, existing services may provide databases 806 that identify the company and company address associated with network addresses. The NetAdr (e.g., IP address) and/or associated org may be referred to generally as a domain. CSG 800 may generate metrics from events 108 for the different companies 808 identified in database 806. - In another example, CCM tags 110 may include domain names in
events 108. For example, a user may enter an email address into a webpage field during a web session. CCM 100 may hash the email address or strip out the email domain address. CCM 100 may use the domain name to identify a particular company and location 808 from database 806. - As also described previously,
event processor 244 may generate relevancy scores 802 that indicate the relevancy of information objects 112 with different topics 102. For example, information objects 112 may include multiple words associated with topics 102. Event processor 244 may calculate relevancy scores 802 for information objects 112 based on the number and position of words associated with a selected topic. -
CSG 800 may calculate metrics from events 108 for particular companies 808. For example, CSG 800 may identify a group of events 108 for a current week that include the same NetAdr 804 associated with a same company and company location 808. CSG 800 may calculate a consumption score 810 for company 808 based on an average relevancy score 802 for the group of events 108. CSG 800 may also adjust the consumption score 810 based on the number of events 108 and the number of unique users generating the events 108. -
CSG 800 generates consumption scores 810 for org 808 for a series of time periods. CSG 800 may identify a surge 812 in consumption scores 810 based on changes in consumption scores 810 over a series of time periods. For example, CSG 800 may identify surge 812 based on changes in content relevancy, number of unique users, number of unique user accesses for a particular information object, a number of events over one or more time periods (e.g., several weeks), a number of particular types of user interactions with a particular information object, and/or any other suitable parameters/criteria. It has been discovered that surge 812 corresponds with a unique period when orgs have heightened interest in a particular topic and are more likely to engage in direct solicitations related to that topic. The surge 812 (also referred to as a "surge score 812" or the like) informs a service provider 118 when target orgs (e.g., org 808) are indicating active demand for the products or services that are offered by the service provider 118. -
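A minimal sketch of this scoring idea follows. The way average relevancy is scaled by event count and unique-user count, the caps, and the surge threshold are all assumptions chosen only to illustrate the shape of the computation, not the actual CSG 800 formula:

```python
# Hypothetical sketch of CSG 800: score one org's events for one time period
# from average relevancy 802, event count, and unique users, then flag a surge.

def consumption_score(events, surge_threshold=80.0):
    """Return (score, is_surge) for one org/topic/time-period group of events."""
    if not events:
        return 0.0, False
    avg_relevancy = sum(e["relevancy"] for e in events) / len(events)
    unique_users = len({e["user_id"] for e in events})
    # Scale relevancy by activity; the caps (10 events, 5 users) are arbitrary.
    activity = min(len(events), 10) / 10 * min(unique_users, 5) / 5
    score = avg_relevancy * activity * 100
    return score, score > surge_threshold

week = [{"user_id": f"u{i}", "relevancy": 0.9} for i in range(10)]  # 10 users, high relevancy
score, is_surge = consumption_score(week)
# score == 0.9 * 1.0 * 1.0 * 100 == 90.0, so is_surge is True
```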
CCM 100 may send consumption scores 810 and/or any surge indicators 812 to service provider 118. Service provider 118 may store a contact list 815 that includes contacts 818 for org ABC. For example, contact list 815 may include email addresses or phone numbers for employees of org ABC. Service provider 118 may obtain contact list 815 from any source, such as from a customer relationship management (CRM) system, commercial contact lists, personal contacts, third-party lead services, retail outlets, promotions or points of sale, or the like, or any combination thereof. - In one example,
CCM 100 may send weekly consumption scores 810 to service provider 118. In another example, service provider 118 may have CCM 100 only send surge notices 812 for companies on list 815 surging for particular topics 102. -
Service provider 118 may send information object 820 related to surge topics to contacts 818. For example, the information object 820 sent by service provider 118 to contacts 818 may include email advertisements, literature, or banner ads related to firewall products/services. Alternatively, service provider 118 may call or send direct mailings regarding firewalls to contacts 818. Since CCM 100 identified surge 812 for a firewall topic at org ABC, contacts 818 at org ABC are more likely to be interested in reading and/or responding to content 820 related to firewalls. Thus, content 820 is more likely to have a higher impact and conversion rate when sent to contacts 818 of org ABC during surge 812. - In another example,
service provider 118 may sell a particular product, such as firewalls. Service provider 118 may have a list of contacts 818 at org ABC known to be involved with purchasing firewall equipment. For example, contacts 818 may include the chief technology officer (CTO) and information technology (IT) manager at org ABC. CCM 100 may send service provider 118 a notification whenever a surge 812 is detected for firewalls at org ABC. Service provider 118 then may automatically send content 820 to specific contacts 818 at org ABC with job titles most likely to be interested in firewalls. -
CCM 100 may also use consumption scores 810 for advertising verification. For example, CCM 100 may compare consumption scores 810 with advertising content 820 sent to companies or individuals. Advertising content 820 with a particular topic sent to companies or individuals with a high consumption score or surge for that same topic may receive higher advertising rates. -
FIG. 9 shows a more detailed example of how the CCM 100 generates consumption scores 810. CCM 100 may receive millions of events 108 from millions of different users associated with thousands of different domains every day. CCM 100 may accumulate the events 108 for different time periods, such as daily, weekly, monthly, or the like. Week time periods are just one example and CCM 100 may accumulate events 108 for any selectable time period. CCM 100 may also store a set of topics 102 for any selectable subject matter. CCM 100 may also dynamically generate some of topics 102 based on the content identified in events 108 as described previously. -
Events 108, as mentioned previously and as shown by FIG. 9, may include an identifier (ID) 950 (e.g., a user ID, session ID, device ID, product ID/code, serial number, and/or the like), URL 952, network address 954, event type 956, and timestamp 958 (which may be collectively referred to as "event data" or the like). Event processor 244 identifies information objects 112 located at URL 952 and selects one of topics 102 for comparing with information objects 112. Event processor 244 may generate an associated relevancy score 802 indicating a relevancy of information objects 112 to selected topic 102. Relevancy score 802 may alternatively be referred to as a "topic score" or the like. -
CSG 800 generates consumption data 960 from events 108. For example, CSG 800 may identify or determine an org 960A (e.g., "Org ABC" in FIG. 9) associated with network address 954. CSG 800 also calculates a relevancy score 960C between information objects 112 and the selected topic 960B. CSG 800 also identifies or determines a location 960D for org 960A and identifies a date 960E and time 960F when event 108 was detected. -
CSG 800 generates consumption metrics 980 from consumption data 960. For example, CSG 800 may calculate a total number of events 970A associated with org 960A (e.g., Org ABC) and location 960D (e.g., location Y) for some or all topics during a first time period (e.g., first week (week 1)), a second time period (e.g., second week (week 2)), and/or a third time period (e.g., third week (week 3)). CSG 800 also calculates the number of unique users 972A generating the events 108 associated with org ABC and topic 960B for the first week, the number of unique users 972B generating the events 108 associated with org ABC and topic 960B for the second week, and/or the number of unique users 972C generating the events 108 associated with org ABC and topic 960B for the third week. For example, CSG 800 may calculate, for topic 960B, a total number of events generated by org ABC for the first week (e.g., topic volume 974A), for the second week (e.g., topic volume 974B), and/or for the third week (e.g., topic volume 974C). CSG 800 may also calculate average topic relevancy scores 976A, 976B, and 976C for the content accessed by org ABC and associated with topic 960B during the respective time periods. CSG 800 may generate consumption metrics 980A-980C for the respective time periods. -
CSG 800 may generate consumption scores 910 (e.g., consumption scores 910A, 910B, and 910C) based on consumption metrics 980A-980C. For example, CSG 800 may generate a first consumption score 910A for week 1 and generate a second consumption score 910B for week 2 based in part on changes between consumption metrics 980A for week 1 and consumption metrics 980B for week 2. CSG 800 may generate a third consumption score 910C for week 3 based in part on changes between consumption metrics 980A-980C for weeks 1-3. A consumption score 910 above a threshold value may be identified as a surge 812. - Additionally or alternatively, the consumption metrics 980 may include metrics such as topic consumption by interactions, topic consumption by unique users, topic relevancy weight, and engagement. Topic consumption by interactions is the number of interactions from an org in a given time period compared to a larger time period of historical data, for example, the number of interactions in a previous three-week period compared to a previous 12-week period of historical data. Topic consumption by unique users refers to the number of unique individuals from an org researching relevant topics in a given time period compared to a larger time period of historical data, for example, the number of individuals from an org researching relevant topics in a previous three-week period compared to a previous 12-week period of historical data. Topic relevancy weight refers to a measure of a content piece's 'denseness' in a topic of interest, such as whether the topic is the focus of the content piece or sparsely mentioned in the content piece. Engagement refers to the depth of an org's engagement with the content, which may be based on an aggregate of engagement of individual users associated with the org. The engagement may be measured based on the user interactions with the information object, such as by measuring dwell time, scroll velocity, scroll depth, and/or any other suitable user interactions such as those discussed herein.
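The per-week consumption metrics described above (event totals, unique users, average relevancy) can be rolled up as in this sketch. The event record layout and grouping key are assumptions for illustration:

```python
from collections import defaultdict

# Hypothetical roll-up of events 108 into weekly consumption metrics 980
# for a single org/topic: event count, unique users, and average relevancy.

def weekly_metrics(events):
    weeks = defaultdict(list)
    for e in events:
        weeks[e["week"]].append(e)
    return {
        wk: {
            "events": len(evs),
            "unique_users": len({e["user_id"] for e in evs}),
            "avg_relevancy": sum(e["relevancy"] for e in evs) / len(evs),
        }
        for wk, evs in weeks.items()
    }

metrics = weekly_metrics([
    {"week": 1, "user_id": "a", "relevancy": 0.5},
    {"week": 1, "user_id": "b", "relevancy": 0.7},
    {"week": 2, "user_id": "a", "relevancy": 0.9},
])
# metrics[1] has 2 events from 2 unique users with average relevancy 0.6
```

A scoring stage would then compare these per-week aggregates across weeks to produce the consumption scores 910.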
-
FIG. 10 depicts a process for identifying a surge in consumption scores. In operation 1001, the CCM 100 identifies all domain events for a given time period. For example, for a current week the CCM 100 may accumulate all of the events for every network address (e.g., IP address, domain, or the like) associated with every topic 102. - The
CCM 100 may use thresholds to select the domains for which to generate consumption scores. For example, for the current week the CCM 100 may count the total number of events for a particular domain (domain level event count (DEC)) and count the total number of events for the domain at a particular location (metro level event count (DMEC)). - The
CCM 100 calculates the consumption score for domains with a number of events more than a threshold (DEC > threshold). The threshold can vary based on the number of domains and the number of events. The CCM 100 may use a second DMEC threshold to determine when to generate separate consumption scores for different domain locations. For example, the CCM 100 may separate subgroups of org ABC events for the cities of Atlanta, New York, and Los Angeles that each have a number of events (DMEC) above the second threshold. - In
operation 1002, the CCM 100 determines an overall relevancy score for all selected domains for each of the topics. For example, for the current week the CCM 100 may calculate an overall average relevancy score for all domain events associated with the firewall topic. - In
operation 1004, the CCM 100 determines a relevancy score for a specific domain. For example, the CCM 100 may identify a group of events 108 having a same network address associated with org ABC. The CCM 100 may calculate an average domain relevancy score for the org ABC events associated with the firewall topic. - In
operation 1006, the CCM 100 generates an initial consumption score based on a comparison of the domain relevancy score with the overall relevancy score. For example, the CCM 100 may assign an initial low consumption score when the domain relevancy score is a certain amount less than the overall relevancy score. The CCM 100 may assign an initial medium consumption score, larger than the low consumption score, when the domain relevancy score is around the same value as the overall relevancy score. The CCM 100 may assign an initial high consumption score, larger than the medium consumption score, when the domain relevancy score is a certain amount greater than the overall relevancy score. This is just one example, and the CCM 100 may use any other type of comparison to determine the initial consumption scores for a domain/topic. - In
operation 1008, the CCM 100 adjusts the consumption score based on a historic baseline of domain events related to the topic. This is alternatively referred to as consumption. For example, the CCM 100 may calculate the number of domain events for org ABC associated with the firewall topic for several previous weeks. - The
CCM 100 may reduce the current week consumption score based on changes in the number of domain events over the previous weeks. For example, the CCM 100 may reduce the initial consumption score when the number of domain events falls in the current week and may not reduce the initial consumption score when the number of domain events rises in the current week. - In
operation 1010, the CCM 100 further adjusts the consumption score based on the number of unique users consuming content associated with the topic. For example, for the current week the CCM 100 may count the number of unique user IDs (unique users) for org ABC events associated with firewalls. The CCM 100 may not reduce the initial consumption score when the number of unique users for firewall events increases from the prior week and may reduce the initial consumption score when the number of unique users drops from the previous week. - In
operation 1012, the CCM 100 identifies or determines surges based on the adjusted weekly consumption score. For example, the CCM 100 may identify a surge when the adjusted consumption score is above a threshold. -
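The DEC/DMEC domain selection of operation 1001 and the thresholding described above can be sketched as follows. The event layout, field names, and threshold values are assumptions for illustration:

```python
from collections import Counter

# Hypothetical sketch of selecting domains (and domain/metro subgroups) with
# enough events to qualify for consumption scoring (DEC and DMEC thresholds).

def domains_to_score(events, dec_threshold=2, dmec_threshold=1):
    dec = Counter(e["domain"] for e in events)                 # domain level event count
    dmec = Counter((e["domain"], e["metro"]) for e in events)  # metro level event count
    domains = {d for d, n in dec.items() if n > dec_threshold}
    metros = {dm for dm, n in dmec.items()
              if dm[0] in domains and n > dmec_threshold}
    return domains, metros

events = [
    {"domain": "orgabc.com", "metro": "New York"},
    {"domain": "orgabc.com", "metro": "New York"},
    {"domain": "orgabc.com", "metro": "Atlanta"},
    {"domain": "tiny.org", "metro": "Austin"},
]
domains, metros = domains_to_score(events)
# domains == {"orgabc.com"}; metros == {("orgabc.com", "New York")}
```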
FIG. 11 depicts in more detail the process for generating an initial consumption score. It should be understood that this is just one example scheme and a variety of other schemes may also be used in other embodiments. - In
operation 1102, the CCM 100 calculates an arithmetic mean (M) and standard deviation (SD) for each topic over all domains. The CCM 100 may calculate M and SD either for all events for all domains that contain the topic, or alternatively for some representative (big enough) subset of the events that contain the topic. The CCM 100 may calculate the overall mean and standard deviation according to the following equations: -
- M = (1/n) * (x_1 + x_2 + . . . + x_n)   (Equation 3.1)
- SD = sqrt((1/n) * ((x_1 − M)^2 + (x_2 − M)^2 + . . . + (x_n − M)^2))   (Equation 3.2)
- Equation 3.1 may be used to determine the mean and Equation 3.2 may be used to determine the standard deviation (SD). In equations 3.1 and 3.2, x_i is a topic relevancy, and n is a total number of events.
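In code, the mean and standard deviation of Equations 3.1 and 3.2 amount to population statistics over the per-event topic relevancies:

```python
import math

# Mean M (Eq. 3.1) and standard deviation SD (Eq. 3.2) of topic relevancy
# values x_i over n events, computed as population statistics.

def topic_mean_sd(relevancies):
    n = len(relevancies)
    m = sum(relevancies) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in relevancies) / n)
    return m, sd

m, sd = topic_mean_sd([1.0, 2.0, 3.0])
# m == 2.0; sd == sqrt(2/3) ≈ 0.816
```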
- In
operation 1104, the CCM 100 calculates a mean (average) domain relevancy for each group of domain and/or domain/metro events for each topic. For example, for the past week the CCM 100 may calculate the average relevancy for org ABC events for firewalls. - In
operation 1106, the CCM 100 compares the domain mean relevancy (DMR) with the overall mean (M) relevancy and overall standard deviation (SD) relevancy for all domains. For example, the CCM 100 may assign at least one of three different levels to the DMR as shown by table 3-1. -
TABLE 3-1
Low: DMR < M − 0.5 * SD (~33% of all values)
Medium: M − 0.5 * SD < DMR < M + 0.5 * SD (~33% of all values)
High: DMR > M + 0.5 * SD (~33% of all values)
- In
operation 1108, the CCM 100 calculates an initial consumption score for the domain/topic based on the above relevancy levels. For example, for the current week the CCM 100 may assign one of the initial consumption scores shown by table 3-2 to the org ABC firewall topic. This is just one example of how the CCM 100 may assign an initial consumption score to a domain/topic. -
TABLE 3-2
Relevancy: Initial Consumption Score
High: 100
Medium: 70
Low: 40
-
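Tables 3-1 and 3-2 combine into a simple mapping from a domain's mean relevancy (DMR) to an initial consumption score. The sketch below uses the example boundaries and scores from the tables; the specific M and SD values are hypothetical:

```python
# Map a domain mean relevancy (DMR) to an initial consumption score using the
# level boundaries of Table 3-1 and the example scores of Table 3-2.

def initial_consumption_score(dmr, m, sd):
    if dmr < m - 0.5 * sd:
        return 40    # Low
    if dmr > m + 0.5 * sd:
        return 100   # High
    return 70        # Medium

# With overall mean M = 2.0 and SD = 0.8, the Low/High boundaries are 1.6 / 2.4:
scores = [initial_consumption_score(d, 2.0, 0.8) for d in (1.0, 2.0, 3.0)]
# scores == [40, 70, 100]
```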
FIG. 12 depicts one example of how the CCM 100 may adjust the initial consumption score. These are also just examples and the CCM 100 may use other schemes for calculating a final consumption score in other embodiments. In operation 1201, the CCM 100 assigns an initial consumption score to the domain/location/topic as described previously in FIG. 11. - The
CCM 100 may calculate a number of events for the domain/location/topic for a current week. The number of events is alternatively referred to as consumption. The CCM 100 may also calculate the number of domain/location/topic events for previous weeks and adjust the initial consumption score based on the comparison of current week consumption with consumption for previous weeks. - In
operation 1202, the CCM 100 determines if consumption for the current week is above historic baseline consumption for previous consecutive weeks. For example, the CCM 100 may determine if the number of domain/location/topic events for the current week is higher than an average number of domain/location/topic events for at least the previous two weeks. If so, the CCM 100 may not reduce the initial consumption value derived in FIG. 11. - If the current consumption is not higher than the average consumption in
operation 1202, the CCM 100 in operation 1204 determines if the current consumption is above a historic baseline for the previous week. For example, the CCM 100 may determine if the number of domain/location/topic events for the current week is higher than the average number of domain/location/topic events for the previous week. If so, the CCM 100 in operation 1206 reduces the initial consumption score by a first amount. - If the current consumption is not above the previous week consumption in
operation 1204, the CCM 100 in operation 1208 determines if the current consumption is above the historic consumption baseline but with interruption. For example, the CCM 100 may determine if the number of domain/location/topic events has fallen and then risen over recent weeks. If so, the CCM 100 in operation 1210 reduces the initial consumption score by a second amount. - If the current consumption is not above the historic interrupted baseline in
operation 1208, the CCM 100 in operation 1212 determines if the consumption is below the historic consumption baseline. For example, the CCM 100 may determine if the current number of domain/location/topic events is lower than the previous week. If so, the CCM 100 in operation 1214 reduces the initial consumption score by a third amount. - If the current consumption is not below the historic baseline in
operation 1212, the CCM 100 in operation 1216 determines if the consumption is for a first-time domain. For example, the CCM 100 may determine the consumption score is being calculated for a new company or for a company that did not previously have enough events to qualify for calculating a consumption score. If so, the CCM 100 in operation 1218 may reduce the initial consumption score by a fourth amount. - In one example, the
CCM 100 may reduce the initial consumption score by the following amounts. The CCM 100 may use any values and factors to adjust the consumption score in other embodiments. -
- Consumption above historic baseline past week (operation 1204). —20 (first amount).
- Consumption above historic baseline for multiple weeks with interruption (operation 1208)—30 (second amount).
- Consumption below historic baseline (operation 1212). —40 (third amount).
- First time domain (domain/metro) observed (operation 1216). —30 (fourth amount).
- As explained above, the
CCM 100 may also adjust the initial consumption score based on the number of unique users. The CCM tags 110 in FIG. 8 may include cookies placed in web browsers that have unique identifiers. The cookies may assign the unique identifiers to the events captured on the web browser. Therefore, each unique identifier may generally represent a web browser for a unique user. The CCM 100 may identify the number of unique identifiers for the domain/location/topic as the number of unique users. The number of unique users may provide an indication of the number of different domain users interested in the topic. - In
operation 1220, the CCM 100 compares the number of unique users for the domain/location/topic for the current week with the number of unique users for the previous week. The CCM 100 may not reduce the consumption score if the number of unique users increases over the previous week. When the number of unique users decreases, the CCM 100 in operation 1222 may further reduce the consumption score by a fifth amount. For example, the CCM 100 may reduce the consumption score by 10. - The
CCM 100 may normalize the consumption score for slower event days, such as weekends. Again, the CCM 100 may use different time periods for generating the consumption scores, such as each month, week, day, hour, and/or the like. The consumption scores above a threshold are identified as a surge or spike and may represent a velocity or acceleration in the interest of a company or individual in a particular topic. The surge may indicate the company or individual is more likely to engage with a service provider 118 who presents content similar to the surge topic. The surge helps service providers 118 identify the orgs in active research mode for the service providers' 118 products/services so the service providers 118 can proactively coordinate sales and marketing activities around orgs with active intent, and/or obtain or deliver better results with highly targeted campaigns that focus on orgs demonstrating intent around a certain topic. - One advantage of domain-based surge detection is that a surge can be identified for an org without using personally identifiable information (PII), sensitive data, or confidential data of the org personnel (e.g., company employees). The
CCM 100 derives the surge data based on an org's network address without using PII, sensitive data, or confidential data associated with the users generating the events 108. - In another example, the user may provide PII, sensitive data, and/or confidential data during network/web sessions. For example, the user may agree to enter their email address into a form prior to accessing content. As described previously, the
CCM 100 may anonymize (e.g., hash, or the like) the PII, sensitive data, or confidential data and include the anonymized data either with org consumption scores or with individual consumption scores. -
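Pulling the FIG. 12 adjustments together, the cascade might look like the following sketch. The reduction amounts come from the example above, while the baseline definition (mean of prior weeks) and the exact control flow are simplifying assumptions:

```python
# Hypothetical sketch of the FIG. 12 consumption-score adjustments.
# weekly_counts lists domain/location/topic event counts oldest-to-newest,
# ending with the current week.

def adjust_consumption_score(initial, weekly_counts, first_time=False):
    if first_time:
        return max(initial - 30, 0)          # fourth amount: first-time domain
    current, history = weekly_counts[-1], weekly_counts[:-1]
    baseline = sum(history) / len(history)
    if all(current > c for c in history):
        reduction = 0                        # above baseline for consecutive weeks
    elif current > history[-1]:
        reduction = 20                       # first amount: above past week only
    elif current > baseline:
        reduction = 30                       # second amount: above baseline with interruption
    else:
        reduction = 40                       # third amount: below baseline
    return max(initial - reduction, 0)

# Rising every week leaves the score untouched; otherwise reductions apply.
# adjust_consumption_score(100, [5, 6, 10]) == 100
# adjust_consumption_score(100, [8, 6, 7]) == 80
```

A further fifth-amount reduction for a drop in unique users (operation 1222) could be chained after this function in the same way.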
FIG. 13 shows an example process for mapping domain consumption data to individuals. In operation 1301, the CCM 100 identifies or determines a surging topic for an org (e.g., org ABC at location Y) as described previously. For example, the CCM 100 may identify a surge 812 for org ABC in New York for firewalls. - In
operation 1302, the CCM 100 identifies or determines users associated with org ABC. As mentioned above, some org ABC personnel may have entered personal, sensitive, or confidential data, such as their office location and/or job titles, into fields of webpages during events 108. In another example, a service provider 118 or other party may obtain contact information for employees of org ABC from CRM customer profiles or third party lists. - Either way, the
CCM 100 or service provider 118 may obtain a list of employees/users associated with org ABC at location Y. The list may also include job titles and locations for some of the employees/users. The CCM 100 or service provider 118 may compare the surge topic with the employee job titles. For example, the CCM 100 or service provider may determine that the surging firewall topic is mostly relevant to users with a job title such as engineer, chief technical officer (CTO), or information technology (IT). - In
operation 1304, the CCM 100 or service provider 118 maps the surging topic (e.g., firewall in this example) to profiles of the identified personnel of org ABC. In another example, the CCM 100 or service provider 118 may not be as discretionary and may map the firewall surge to any user associated with org ABC. The CCM 100 or service provider then may direct content associated with the surging topic to the identified users. For example, the service provider may direct banner ads or emails for firewall seminars, products, and/or services to the identified users. -
- The example embodiments described herein provide improvements to the functioning of computing devices and computing networks by providing specific mechanisms of collecting
network session events 108 from user devices (e.g., computers of FIGS. 2 and 14, and platform 3900 of FIG. 39), accessing information objects 112, 114, determining the amount of traffic individual websites receive from user devices at or related to a specific domain name or network addresses at specific periods of time, and identifying spikes (surges 812). The collected data can be used to analyze the cause of the surge (e.g., relevant topics in specific information objects 112, 114), which provides a specific improvement over prior systems, resulting in improved network/traffic monitoring capabilities and resource consumption efficiencies. The embodiments discussed herein allow for the discovery of information from extremely large amounts of data that was not previously possible in conventional computing architectures. - Identifying spikes (e.g., surges) in traffic in this way allows content providers to better serve their content to specific users. Serving content to numerous users (e.g., responding to network requests for content and the like) without targeting can be computationally intensive and can consume large amounts of computing and network resources, at least from the perspective of content providers, service providers, and network operators. The improved network/traffic monitoring and resource efficiencies provided by the present claims is a technological improvement in that content providers, service providers, and network operators can reduce network and computational resource overhead associated with serving content to users by reducing the overall amount of content served to users by focusing on the relevant content. Additionally, the content providers, service providers, and network operators could use the improved network/traffic monitoring to better adapt the allocation of resources to serve users at peak times in order to smooth out their resource consumption over time.
-
FIG. 14 depicts how CCM 100 may calculate consumption scores based on user engagement. A computer 1400 may operate a client app 1404 (e.g., a browser, desktop/mobile app, and/or the like) to access information objects 112, for example, by sending appropriate HTTP messages or the like, and in response, server-side application(s) may dynamically generate and provide code, scripts, markup documents, and/or other information object(s) 112 to the client app 1404 to render and display information objects 112 within the client app 1404. As alluded to previously, information objects 112 may be a webpage or web app comprising a graphical user interface (GUI) including graphical control elements (GCEs) for accessing and/or interacting with a service provider (e.g., a service provider 118). The server-side applications may be developed with any suitable server-side programming languages or technologies, such as PHP; Java™-based technologies such as Java Servlets, JavaServer Pages (JSP), JavaServer Faces (JSF), and/or the like; ASP.NET; Ruby or Ruby on Rails; a platform-specific and/or proprietary development tool and/or programming language; and/or any other like technology that renders HyperText Markup Language (HTML). The computer 1400 may be a laptop, smartphone, tablet, and/or any other device such as any of those discussed herein. In this example, a user may open the client app 1404 on a screen 1402 of computer 1400. -
CCM tag 110 may operate within client app 1404 and monitor user web sessions. As explained previously, CCM tag 110 may generate events 108 for the web/network session that include various event data 950-958 such as an ID 950 (e.g., a user ID, session ID, app ID, and/or the like), a URL 952 for accessed information objects 112, a network address 954 of a user/user device that accessed the information objects 112, an event type 956 that identifies an action or activity associated with the accessed information objects 112, and a timestamp 958 of the events 108. For example, CCM tag 110 may add an event type identifier into event 108 indicating the user downloaded an information object 112. In some embodiments, the events 108 may also include an engagement metrics (EM) field 1410 to carry engagement metrics (the data field/data element that carries engagement metrics, and the engagement metrics themselves, may be referred to herein as "engagement metrics 1410" or "EM 1410"). - In one example,
CCM tag 110 may generate a set of impressions, alternatively referred to as engagement metrics 1410, indicating actions taken by the user while consuming information objects 112 (e.g., user interactions). For example, engagement metrics 1410 may indicate how long the user dwelled on information objects 112, how the user scrolled through information objects 112, and/or the like. Engagement metrics 1410 may indicate a level of engagement or interest a user has in information objects 112. For example, the user may spend more time on the webpage and scroll through the webpage at a slower speed when the user is more interested in the information objects 112. - In embodiments, the
CCM 100 calculates an engagement score 1412 for information objects 112 based on engagement metrics 1410. CCM 100 may use engagement score 1412 to adjust a relevancy score 802 for information objects 112. For example, CCM 100 may calculate a larger engagement score 1412 when the user spends a larger amount of time carefully paging through information objects 112. CCM 100 then may increase relevancy score 802 of information objects 112 based on the larger engagement score 1412. CSG 800 may adjust consumption scores 910, 810 based on the increased relevancy score 802 to more accurately identify domain surge topics. For example, a larger engagement score 1412 may produce a larger relevancy score 802, which in turn produces a larger consumption score 910, 810. As mentioned previously, the CCM 100 can send consumption scores 810 to service provider 118, and the service provider 118 may store the contact list 815 that includes contacts 818 for org ABC. The service provider 118 can then send an information object 820 related to surge topics 102A to the indicated contacts 818. -
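One way this adjustment can be realized is sketched below. This is a minimal illustration, not the claimed implementation: the function name, the neutral engagement level of 0.5, and the linear scaling factor are all assumptions.

```python
# Hypothetical sketch: adjust a relevancy score (e.g., relevancy score 802)
# using an engagement score (e.g., engagement score 1412) in the range [0, 1].
def adjust_relevancy(relevancy_score: float, engagement_score: float) -> float:
    # An engagement score of 0.5 is treated as neutral (an assumption):
    # scores above it boost relevancy, scores below it dampen it.
    factor = 1.0 + (engagement_score - 0.5)
    return max(0.0, relevancy_score * factor)
```

For example, an engagement score of 0.75 would scale a relevancy score of 0.8 up to 1.0, which in turn would feed a larger consumption score.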
FIG. 15 depicts an example process for calculating the engagement score for content. In operation 1520, the CCM 100 identifies or determines engagement metrics 1410 for information objects 112. In embodiments, the CCM 100 may receive events 108 that include content engagement metrics 1410 for one or more information objects 112. The engagement metrics 1410 for information objects 112 may be content impressions or the like. As examples, the engagement metrics 1410 may indicate any user interaction with information objects 112, including key presses, action selections, timer values and/or timer expiration indicators, tab selections that switch to different pages, page movements, mouse page scrolls, mouse clicks, mouse movements, scroll bar page scrolls, keyboard page movements, touch screen page scrolls, eye tracking data (e.g., gaze locations, gaze times, gaze regions of interest, eye movement frequency, speed, orientations, and/or the like), touch data (e.g., touch gestures and/or the like), and/or any other content movement or content display indicator(s). - In
operation 1522, the CCM 100 identifies or determines engagement levels based on the engagement metrics 1410. In one example at operation 1522, the CCM 100 identifies/determines a content dwell time. The dwell time may indicate how long the user actively views a page of content. In one example, tag 110 may stop a dwell time counter when the user changes page tabs or becomes inactive on a page. Tag 110 may start the dwell time counter again when the user starts scrolling with a mouse or starts tabbing. Additionally or alternatively at operation 1522, the CCM 100 identifies/determines, from the events 108, a scroll depth for the content. For example, the CCM 100 may determine how much of a page the user scrolled through or reviewed. In one example, the CCM tag 110 or CCM 100 may convert a pixel count on the screen into a percentage of the page. Additionally or alternatively at operation 1522, the CCM 100 identifies/determines an up/down scroll speed. For example, dragging a scroll bar may correspond with a fast scroll speed and indicate the user has less interest in the content. Using a mouse wheel to scroll through content may correspond with a slower scroll speed and indicate the user is more interested in the content. Additionally or alternatively at operation 1522, the CCM 100 identifies/determines various other aspects/levels of the engagement based on some or all of the engagement metrics 1410, such as any of those discussed herein. In some embodiments, the CCM 100 may assign higher values to engagement metrics 1410 (e.g., impressions) that indicate a higher user interest and assign lower values to engagement metrics that indicate lower user interest. For example, the CCM 100 may assign a larger value in operation 1522 when the user spends more time actively dwelling on a page and may assign a smaller value when the user spends less time actively dwelling on a page. - In
operation 1524, the CCM 100 calculates the content engagement score 1412 based on the values derived in operations 1520-1522. For example, the CCM 100 may add together and normalize the different values derived in operations 1520-1522. Other operations may be performed on these values in other embodiments. - In operation 1526, the
CCM 100 adjusts relevancy values (e.g., relevancy scores 802) described previously in FIGS. 1-14 based on the content engagement score 1412. For example, the CCM 100 may increase the relevancy values (e.g., relevancy scores 802) when the information object(s) 112 has/have a high engagement score and decrease the relevancy values (e.g., relevancy scores 802) for lower engagement scores. -
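The add-and-normalize step of operation 1524 might look like the following sketch. The metric caps (300 seconds of dwell, 5000 px/s scroll speed) and the equal weighting of the three metrics are illustrative assumptions, not values taken from the specification.

```python
def engagement_score(dwell_s: float, scroll_depth: float, scroll_speed_px_s: float,
                     max_dwell_s: float = 300.0, max_speed: float = 5000.0) -> float:
    """Combine per-metric values (operations 1520-1522) into one score in [0, 1]."""
    dwell = min(dwell_s / max_dwell_s, 1.0)            # longer active dwell -> more interest
    depth = min(max(scroll_depth, 0.0), 1.0)           # fraction of the page reviewed
    slowness = 1.0 - min(scroll_speed_px_s / max_speed, 1.0)  # slower scrolling -> more interest
    return (dwell + depth + slowness) / 3.0            # add together and normalize
```

Because each term is clamped to [0, 1] before averaging, devices with very different raw pixel counts or dwell ranges yield comparable relative results, in line with the normalization described for different device types.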
CCM 100 or CCM tag 110 in FIG. 14 may adjust the values assigned in operations 1520-1524 based on the type of device 1400 used for viewing the content. For example, the dwell times, scroll depths, and scroll speeds may vary between smartphones, tablets, laptops, and desktop computers. CCM 100 or tag 110 may normalize or scale the engagement metric values so different devices provide similar relative user engagement results. - Providing more accurate intent data and consumption scores in the ways discussed herein allows
service providers 118 to conserve computational and network resources by providing a means for better targeting users so that unwanted and seemingly random content is not distributed to users that do not want such content. This is a technological improvement in that it conserves network and computational resources of service providers 118 and/or other organizations (orgs) that distribute this content by reducing the amount of content generated and sent to end-user devices. End-user devices may reduce network and computational resource consumption by reducing or eliminating the need for using such resources to obtain (download) and view unwanted content. Additionally, end-user devices may reduce network and computational resource consumption by reducing or eliminating the need to implement spam filters and reducing the amount of data to be processed when analyzing and/or deleting such content. - Furthermore, unlike conventional targeting technologies, the embodiments herein provide user targeting based on surges in interest in particular content, which allows
service providers 118 to tailor the timing of when to send content to individual users to maximize engagement, which may include tailoring the content based on the determined locations. This allows content providers to spread out the content distribution over time. Spreading out content distribution reduces congestion and overload conditions at various nodes within a network, and therefore, the embodiments herein also reduce the computational burdens and network resource consumption on the content providers 118, content distribution platforms, and Internet Service Providers (ISPs), at least when compared to existing/conventional mass/bulk distribution technologies. -
FIG. 16 shows an example network address classification system (NACS) 1600 (also referred to as "network classifier 1600"). The NACS 1600 identifies different types of entities associated with various network addresses 954. To do so, the NACS 1600 leverages the fact that network addresses 954 may be associated with different physical locations. For example, a first network device 1614A may have an associated network address 954A (NA1) and may be associated with a private home location 1610A; a second network device 1614B may have network address 954B (NA2) and may be associated with a public org location 1610B; and a third network device 1614C may have network address 954C (NA3) and may be associated with a private org location 1610C. As examples, the network devices 1614 may be routers, mesh networking devices, switches, hubs, network appliances, gateway appliances, a computing device (e.g., laptop, tablet, smartphone, and/or the like) acting as a WiFi or mobile hotspot, and/or any other networking device used to connect one or more other devices to a network. - For explanatory purposes,
private home location 1610A may refer to any location associated with a relatively small group of people, such as a private residence. In at least one example, content (information objects 112) accessed by users at private home location 1610A may not necessarily be associated with a company. For example, persons living at private home location 1610A may work for companies and may view work-related content from private home location 1610A. However, it may be unlikely that the majority of content accessed by users at private home location 1610A is associated with the same company. - Public org location 1610B may be associated with any entity, establishment, building, event, location, and/or the like that caters to multiple users that are not necessarily employed by, or otherwise associated with, the same company, entity, establishment, and/or the like. For example, public org location 1610B may be a coffee shop run by a company that sells coffee to the general public. Content accessed by the different users at coffee shop location 1610B may not necessarily be associated with the coffee company that operates the coffee shop. For example, users entering
coffee shop location 1610B may work for a variety of different companies and may view a variety of different content unrelated to the coffee company. - Private org location 1610C may be associated with any entity, establishment, building, event, location, and/or the like where multiple users work, are employed, or are otherwise associated with the same business, entity, or establishment. For example, private org location 1610C may be the corporate offices of the coffee company that runs coffee shop location 1610B. In another example, private org location 1610C may be the corporate offices of an entertainment or casino company that operates an amusement park and/or casino at public org location 1610B.
- In other examples, the entities associated with locations 1610B and 1610C are unrelated. For example, the company at private org location 1610C may not have retail stores or facilities. In at least one example, users at private org location 1610C may mostly work for the same company and may mostly view content related to their jobs at the same company.
- As described previously,
tags 110 monitor information objects 112 accessed by computing devices 230 at the different network address (NA) locations 1610. Tags 110 generate events 108 that identify different parameters of the content accessed by the users at NA locations 1610. As mentioned previously, events 108 may include a user ID, URL, network address, event type, and timestamp. In embodiments, the events 108 may also include a device type and a time offset. - As shown, the
network classifier 1600 includes a network feature generator 1602 and a network entity classifier 1606. The network feature generator 1602 identifies the source network addresses 954 in network messages (e.g., IP messages, HTTP messages, and/or the like) sent from tags 110 to CCM 100, and determines, generates, and/or identifies different machine learning (ML) features 1604 (or simply "features 1604") related to the events 108 generated at/by the different network address locations 1610. For example, feature generator 1602 may determine the average amount of content each user accesses at the different locations 1610, the average amount of time users access content at the different locations 1610, and when users access content at the different locations 1610. Feature generator 1602 may also determine what types of computing devices 230 are used for accessing content at the different locations 1610. Other features 1604 may be extracted from the event data in other embodiments. - The
entity classifier 1606 uses features 1604 to determine types of establishments associated with respective locations 1610. For example, features 1604 may indicate a relatively small number of users access content at address location 1610A. The entity classifier 1606 may accordingly identify network address 954A as a home location. - Additionally or alternatively, the
entity classifier 1606 may determine from features 1604 that a relatively large number of users access content consistently throughout the day and on weekends at location 1610B. The entity classifier 1606 may also determine from features 1604 that most of the users at location 1610B use smartphones to access content. Network entity classifier 1606 may determine network address 954B is associated with a public org location. -
Network entity classifier 1606 may determine from features 1604 that users at NA location 1610C mostly access content during business hours, Monday through Friday. Network entity classifier 1606 may also determine from features 1604 that most of the users at location 1610C use personal computers or laptop computers to access content. Network entity classifier 1606 may determine network address 954C is associated with a private org location. -
Network entity classifier 1606 may generate a NetAdr entity map 1608 that CCM 100 uses to more efficiently and effectively generate consumption scores and identify surges for different companies. For example, CCM 100 may distinguish between multiple network addresses owned by the same company that include both public org locations and private org locations. In another example, CCM 100 may identify multiple different companies operating within a shared office space location. - In some embodiments, the
network classifier 1600 may be part of or otherwise operate in the CCM 100. In other embodiments, the network classifier 1600 may be a separate network function or network element from the CCM 100 that provides the CCM 100 with entity/org predictions based on network session events 118 and/or other collected or obtained data. - In either embodiment, the
CCM 100 may generate different consumption scores 910 (see e.g., FIGS. 8 and 9) for the different network address locations 1610 or may only provide consumption scores 910 for network addresses associated with private org locations 1610C. In another example, some service providers 118 may be more interested in consumption scores 910 for a certain demographic of users that patronize retail locations 1610B of particular businesses. CCM 100 may generate consumption scores 910 and related surge data 412 for the events 108B associated with public org locations 1610B for those businesses. Thus, CCM 100 can filter out events that are less likely to indicate the current interests of associated businesses, business customers, or any other definable entity or demographic. - In embodiments, the
NetAdr classification system 1600 generates network entity map 1608 without using personal identification information (PII), sensitive data, and/or confidential data. Events 108 may include a user identifier 950 (see e.g., FIG. 9). However, classification system 1600 can classify network address locations 1610 based only on event timestamps and/or device types. Thus, CCM 100 can generate consumption scores for particular company network addresses without using PII data. Service providers 118 are then free to use their own contact lists to send content to particular companies based on the associated company consumption scores. -
FIG. 17 shows in more detail how NACS 1600 identifies the types of establishments associated with network address locations. As described previously, events 108 may include a user identifier (ID) 950 such as a unique cookie ID or the like, a URL 952 identifying content accessed by a user associated with user ID 950, a network address 954, and a timestamp 958. Events 108 may also include device type 1759 and time offset 1761 fields that include device type data and a time offset, respectively. - Network address 954 (e.g., an IP address or some other network address) may be the network address of the
network device 1614 at the physical location where tags 110 generate events 108. Tags 110 may send network messages to CCM 100 on a periodic basis (e.g., every 15 seconds or the like) or in response to a trigger (e.g., when a network session event or user interaction takes place) via network device 1614. The messages contain events 108 and include a source network address for network device 1614 that CCM 100 uses to send acknowledgement messages back to tags 110. -
Tags 110 may discover device type 1759 of the computing device 230 that the user uses to access information objects 112. For example, tags 110 may identify computing device 230 as a personal computer, laptop, tablet, or smartphone based on the client app screen resolution, the type of client app (e.g., web browser) used for viewing information objects 112, or the type of user agent used by the client app (e.g., web browser). -
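A heuristic of this kind might be sketched as below. The screen-width breakpoints, user-agent keywords, and returned labels are illustrative assumptions, not the tag's actual rules.

```python
def classify_device(user_agent: str, screen_width_px: int) -> str:
    """Guess a device type (e.g., device type 1759) from client hints a tag can observe."""
    ua = user_agent.lower()
    if "mobile" in ua or screen_width_px < 768:      # assumed phone breakpoint
        return "smartphone"
    if "tablet" in ua or "ipad" in ua or screen_width_px < 1024:
        return "tablet"
    return "personal_computer"
```

In practice, user-agent strings are inconsistent across browsers, so a production tag would likely combine several signals rather than rely on any single rule like this.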
Tags 110 may also add a time offset 1761 corresponding with the time zone associated with events 108. Classification system 1600 can adjust all timestamps 958 from all network address locations to correspond to a same universal time. - The
feature generator 1602 may produce a feature dataset 1712 including a variety of different features 1604 (e.g., features F1 to FZ in FIG. 17, where Z is a number) for each network address 954 based on any combination of parameters in events 108. As described previously, feature generator 1602 may generate some features 1604 based on timestamps 958 and/or device type 1759. In one example, feature generator 1602 may generate a new feature dataset 1712 each day, or over some other selectable time period. Several features 1604 have been described previously and additional features 1604 are described infra in more detail. -
Entity classifier 1606 uses a network classification model 1718 to identify types of establishments associated with network addresses 954. In one example, classification system 1600 uses a logistic regression model 1718 (also referred to as a "logit model 1718") as follows: -
N^{-1} \log L(\theta \mid x) = N^{-1} \sum_{i=1}^{N} \log \Pr(y_i \mid x_i; \theta) [Equation 6.1] - In equation 6.1, N is the number of observations; L is the likelihood function; θ is the vector of parameters/coefficients used to calculate the probability; Pr is probability; y_i is the class (0 or 1) of the ith observation; and x_i is a vector of features representing a network address and/or network device (e.g., features 1604) (x_i may be referred to as a "feature vector"). Additionally, the notation/symbol "|" indicates a conditional probability; for example, "Pr(y_i | x_i; θ)" is the probability of y_i given feature vector x_i and parameters θ. In some embodiments, the
logistic regression model 1718 is maximized using one or more suitable optimization techniques, such as gradient descent (applied to the negative log-likelihood) and/or the like. Other logistic regression models may be used in other embodiments, as may other ML models for identifying different behavior patterns, such as any of the classification or other supervised learning ML models/algorithms discussed herein. - The
classification system 1600 trains model 1718 with training data 1716. In one example implementation, a first set of training data 1716A may include features 1604 for network addresses 954 from known private org locations. For example, training data 1716A may be produced from events 108 generated from the known corporate headquarters or known business offices of companies. In one example implementation, a second set of training data 1716B may include features 1604 for network addresses from known public org locations or known non-org locations. For example, training data 1716B may be generated from coffee shops, retail stores, amusement parks, internet service providers, private homes, or any other publicly accessible Internet location. - In one example,
model 1718 uses training data 1716 to identify features 1604 associated with private org locations. However, model 1718 may be trained to identify any other type of physical location based on network address (e.g., IP address or the like), such as public org locations, private home locations, geographic locations, GPS coordinates, contextual locations, and/or any other business or user demographic. -
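The average log-likelihood of Equation 6.1 can be maximized with plain gradient ascent, as in this minimal sketch. The learning rate, epoch count, and bias handling are illustrative assumptions rather than the patent's training procedure.

```python
import math

def train_logit(X, y, lr=0.5, epochs=2000):
    """Fit theta by gradient ascent on N^-1 * sum_i log Pr(y_i | x_i; theta)."""
    theta = [0.0] * (len(X[0]) + 1)                  # last entry is a bias term
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for xi, yi in zip(X, y):
            z = sum(t * x for t, x in zip(theta, xi)) + theta[-1]
            p = 1.0 / (1.0 + math.exp(-z))           # Pr(y=1 | x; theta)
            for j, x in enumerate(xi):
                grad[j] += (yi - p) * x              # gradient of the log-likelihood
            grad[-1] += yi - p
        theta = [t + lr * g / len(X) for t, g in zip(theta, grad)]
    return theta

def predict(theta, xi):
    """Return Pr(y=1 | xi; theta) for one feature vector."""
    z = sum(t * x for t, x in zip(theta, xi)) + theta[-1]
    return 1.0 / (1.0 + math.exp(-z))
```

On a toy dataset where class 1 corresponds to private-org-like feature vectors, the trained model would emit probabilities that can then be thresholded into the entity map described next.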
Classification system 1600 feeds features 1604 for a particular network address 954 into trained model 1718. Model 1718 generates prediction values 1720 that indicate the probability of the associated network address being a private org location. For example, classification system 1600 may identify any network address 954 with a prediction score 1720 over 0.45 as a private org location. Conversely, classification system 1600 may identify any network address 954 with a prediction score 1720 less than some other threshold as a public org location or a private home location. Classification system 1600 generates network entity map 1608 in FIG. 16 from prediction values 1720. For example, the network entity map 1608 may include a row for each network address 954 and a column marked if the associated network address is identified as a private org location. -
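Turning prediction values into an entity map can be sketched as a simple threshold pass. The 0.45 cutoff echoes the example above, while the dictionary representation and the labels are assumptions for illustration.

```python
def build_entity_map(prediction_values: dict, threshold: float = 0.45) -> dict:
    """Map each network address to an entity label based on its prediction score."""
    return {
        addr: ("private_org" if score > threshold else "other")
        for addr, score in prediction_values.items()
    }
```

A table-like structure (one row per network address, one column flagging private org locations) would carry the same information; a dictionary is just a compact equivalent.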
FIG. 18 shows how classification system 1600 generates consumption scores 1810. As described previously, classification system 1600 identifies the types of orgs/establishments associated with different network addresses. In this example, classification system 1600 classifies network addresses 1822 as private org locations (NA-pOrg). The classified network addresses are stored in NetAdr entity map 1608. - As explained previously, a domain name service may provide a
database 806 that identifies companies and company addresses associated with different network addresses. The network address and/or associated entity may be referred to generally as a domain. Additionally, database 806 may include multiple different network addresses associated with the same org. Some of these network addresses may be associated with public org locations that do not necessarily identify the intent or interests of the org. -
CCM 100 may receive a group of events having the same network address 1824. To generate more accurate consumption scores 1810, CSG 800 may compare the network address 1824 associated with the group of events 108 with NetAdr entity map 1608. Map 1608 indicates in output 1826 if the network address 1824 is associated with a private org location or some other entity location. If network address 1824 is not associated with a private org location, CSG 800 may not generate a consumption score 1810. If output 1826 indicates network address 1824 is associated with a private org location (NA-pOrg), CSG 800 may generate a consumption score 1810 for the identified company and location 1808. - In addition, the
consumption score 1810 may also be calculated in a similar manner as discussed previously with respect to consumption score 810. For example, the CSG 800 calculates a consumption score 1810 from events 108 that include the network address 1824 verified as associated with a private org location. As explained previously with respect to FIGS. 8-9, the CSG 800 may generate consumption score 1810 for a topic 102 based on an average topic relevancy score 976 for the group of events 108. CSG 800 may adjust consumption score 1810 based on the total number of events 970, number of unique users 972, and topic volume 974 as described previously in FIG. 9. This example depicts the entity classification system 1600 and the CCM 100 as being separate entities; however, in other implementations, the entity classification system 1600 may be part of (or within) the CCM 100. -
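The gating of score generation on the entity map can be sketched as follows. The event shape, the map labels, and the scoring callback are hypothetical names introduced only for this illustration.

```python
def consumption_score_for(events, entity_map, score_fn):
    """Only score an event group whose network address maps to a private org location."""
    network_address = events[0]["network_address"]   # assumed shared by the whole group
    if entity_map.get(network_address) != "private_org":
        return None                                  # skip public org / home locations
    return score_fn(events)                          # e.g., relevancy-based scoring
```

Returning `None` for non-private-org addresses mirrors the behavior described above, where no consumption score is generated for filtered-out locations.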
Entity classification system 1600 may continuously update network entity map 1608 and CSG 800 may continuously confirm which received network addresses 1824 are associated with private org locations. CSG 800 may stop generating consumption scores 910 for any network addresses 1824 that are no longer associated with private org locations. By filtering out events from public org locations and non-org locations, CCM 100 may more accurately identify topics of interest and surges for particular types of orgs (e.g., businesses or the like). - As mentioned above,
CCM 100 may send consumption scores 1810 and/or any surge information 1812 for an org associated with network address 1824 to service provider 118. Service provider 118 may store a contact list 1815 including contacts 1818 for org XYZ. Service provider 118 may send information object 1820 related to topic 102A to contacts 1818 when consumption data 1810 identifies a surge 1812. In another example, CCM 100 may tag the profiles of users associated with the identified business/entity 1808. CCM 100 then may accumulate all of the user intent vectors associated with the same company as described previously. -
FIG. 19 shows examples of features generated by feature generator 1602 from events 108, and FIG. 20 shows the associated operations performed by feature generator 1602. As described previously, any number of features/metrics can be generated from events 108 and then used by the classification system 1600 to classify network addresses. Feature generator 1602 may generate features 1604 over any programmable time period, such as daily. - Referring to
FIGS. 19 and 20, in operation 2001 feature generator 1602 receives events 108 that include associated network addresses 954A-954D. Feature generator 1602 may generate features 1604A-1604E using all of the events 108 received during that day that include the same network address. Feature generator 1602 may generate some features 1604 as mean values, average values, ratios, percentages, normalized values, and/or the like. The actual values shown in feature dataset 1712 are just examples and may vary based on the specific calculations used by feature generator 1602. - In
operation 2002, feature generator 1602 calculates a feature 1604A that identifies a mean total number of events generated at each network address 954 for a desired time period (e.g., during each day). For example, feature generator 1602 may calculate the mean total events generated by each user from the network address per day. Feature 1604A may help distinguish network addresses associated with orgs (e.g., businesses, enterprises, and/or the like) from other network addresses associated with individuals or other non-org entities. - In
operation 2003, feature generator 1602 generates a feature 1604B that identifies a ratio of events 108 generated during operating time periods (e.g., working hours) vs. events 108 generated during non-operating time periods (e.g., non-working hours). For example, feature generator 1602 may calculate the mean number of events generated for each user for a certain time period (e.g., between 8 am-6 pm) compared with all other hours. Feature 1604B may help distinguish network addresses associated with private org locations, where users generally access content during business hours, from network addresses associated with other public org locations, where users may access content any time of the day. - In
operation 2004, feature generator 1602 generates a feature 1604C that identifies a percentage of events generated on weekends and/or other non-operational times/dates. Feature 1604C also helps distinguish network addresses associated with private org locations, where users generally access content during work days, from other public org locations and private home locations, where users may access a higher percentage of content during the weekends. - In
operation 2005, feature generator 1602 generates a feature 1604D that identifies the amount of time users actively access information objects from the network address. Feature generator 1602 may identify the first time a particular user accesses information objects at the network address during the day and identify the last time the particular user accesses content at the same network address during that day. Feature 1604D may help distinguish private org locations, where users generally access different content/information objects throughout the day at the same org location, vs. public org locations, where users may only access content/information objects for a short amount of time while purchasing a product, such as coffee at a coffee shop. -
Feature generator 1602 may extend the active time 1604D as long as the user accesses some content within some time period. In another example, feature generator 1602 may terminate active time periods when the user does not access content for some amount of time. Feature generator 1602 then may identify the longest or average active time periods for each user and then calculate an average active time for all users that access content at the network address 954. Many users at public org locations, such as a coffee shop, may have zero-duration events since the user may only generate one event at that network address. - In
operation 2006, feature generator 1602 generates a feature 1604E that identifies the percentage of information objects accessed by different device types. In one example, the feature generator 1602 generates a feature 1604E that identifies the percentage of information objects accessed by users with mobile devices, such as cell phones, tablets, laptops, wearables, and/or the like. In this example, the feature 1604E may help distinguish private org locations, where users mostly use personal computers or laptops, from public org locations, where users may more frequently access content with cell phones. - In operation 2007,
feature generator 1602 generates a feature 1604E that identifies a percentage of time users are active at a particular network address vs. other network addresses. This may also help distinguish private org locations, where users generally spend more time accessing content, from public org locations, where users may spend less time accessing content. In another example, feature generator 1602 may identify the average number of users that have accessed the same network address over a week. A public org location may have a larger number of users access the network address over a week. - Example features used in the ML model according to the various embodiments discussed herein may include, but are not limited to, any combination of the features in Table 6.1.
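Two representative features from Table 6.1, the percent of activity during business hours and the normalized Shannon entropy of profile domains, might be computed along these lines. The event layout and sample data are assumed simplifications, not part of the embodiment:

```python
import math
from datetime import datetime

# Illustrative events at one IP: (timestamp, profile_domain) tuples.
events = [
    (datetime(2023, 5, 1, 10, 0), "acme.com"),   # Monday 10am (business hours)
    (datetime(2023, 5, 1, 20, 0), "acme.com"),   # Monday 8pm
    (datetime(2023, 5, 7, 12, 0), "other.org"),  # Sunday noon
]

def ip_p_during_business(events):
    """Share of events during business hours, defined as 8am-6pm Mon-Fri."""
    in_hours = sum(
        1 for ts, _ in events
        if ts.weekday() < 5 and 8 <= ts.hour < 18
    )
    return in_hours / len(events)

def normalized_entropy(events):
    """Shannon entropy of profile domains at the IP, divided by the maximum
    possible entropy; NaN when only a single domain is present."""
    counts = {}
    for _, dom in events:
        counts[dom] = counts.get(dom, 0) + 1
    total = sum(counts.values())
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    h_max = math.log2(len(counts)) if len(counts) > 1 else 0.0
    return h / h_max if h_max else float("nan")
```

As the table notes, an always-on IP would push ip_p_during_business toward 0.30, while a business active only on weekdays would sit nearer 0.42.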
-
TABLE 6.1
- ip_p_during_business: The percent of an IP's activity that happens during business hours, with "business hours" defined as 8am-6pm M-F. For example, an IP that is active 24/7 may have a value of 0.30, while a business active 24 hours a day during M-F may have a value of 0.42.
- mean_profile_p_during_business_global: The average percentage of activity during business hours of the profiles that have visited this network address. This feature differs from ip_p_during_business because it aggregates the global behavior of the profiles seen at the IP rather than only their behavior at the IP.
- mean_dow_active_global: An average over the profiles at an IP of how many days of the week they are active globally (i.e., across all IPs). For example, if there are two profiles at an IP, one active 7 days (even if not at this IP for all 7 days) and another active for only 2 days, the value may be 4.5.
- mean_dow_active_at_ip: An average over the profiles at an IP of how many days of the week each profile is active only at the specific IP. Even if a user was active 7 days globally but only 1 day at this IP, only that 1 day is counted.
- mean_percent_weekday_at_ip: An average over the profiles of what percentage of their activity happened at the specific network address during the week. For example, if all of a profile's traffic was on Wednesday and Friday, their individual percent weekday would be 1. This feature is the mean of this metric for all profiles at a network address.
- mean_avg_start_hour_global: Averages across profiles at an IP the hour, in local time, of the profile's average first activity globally.
- mean_avg_end_hour_global: Averages across profiles at an IP the hour, in local time, of the profile's average last activity globally.
- mean_avg_start_hour_at_ip: Averages across profiles at an IP the hour, in local time, of the profile's average first activity only at the specific IP.
- mean_avg_end_hour_at_ip: Averages across profiles at an IP the hour, in local time, of the profile's average last activity only at the specific IP.
- mean_avg_duration_at_ip: For each profile at the IP, takes the average "duration" of activity, where "duration" is defined as the last timestamp minus the first timestamp; a profile with a single event therefore has a duration of 0. The duration of each day for each profile is averaged, then the average of all profiles is taken to provide the value for this feature.
- mean_avg_duration_ratio: The ratio of duration_at_ip to duration_global, averaged per profile and then averaged across all profiles.
- mean_pages_visited_ratio: The ratio of pages viewed at this IP over the pages viewed globally per profile, averaged across all profiles.
- mean_dow_active_ratio: The ratio of days of the week active at this IP over the days of the week active globally, averaged across all profiles.
- mean_avg_start_hour_diff: The difference between when a profile starts at the IP and globally, averaged for each profile over the entire period and then averaged across all profiles.
- mean_profile_p_during_business_ratio: Average ratio of the percentage of profile activity that happens at the IP vs. globally.
- mean_avg_end_hour_diff: The difference between when a profile ends at the IP and globally, averaged for each profile over the entire period and then averaged across all profiles.
- mean_p_sunday_evts_at_ip: Average over profiles at the IP of what percentage of their hours are on Sunday.
- mean_p_monday_evts_at_ip: Average over profiles at the IP of what percentage of their hours are on Monday.
- mean_p_tuesday_evts_at_ip: Average over profiles at the IP of what percentage of their hours are on Tuesday.
- mean_p_wednesday_evts_at_ip: Average over profiles at the IP of what percentage of their hours are on Wednesday.
- mean_p_thursday_evts_at_ip: Average over profiles at the IP of what percentage of their hours are on Thursday.
- mean_p_friday_evts_at_ip: Average over profiles at the IP of what percentage of their hours are on Friday.
- mean_p_saturday_evts_at_ip: Average over profiles at the IP of what percentage of their hours are on Saturday.
- mean_avg_daily_pages_visited: The average number of pages a profile visits at the IP per day, averaged across all profiles at the IP.
- percent_mobile: Percentage of traffic from the IP that has the device type of mobile (note: only non-null values are used for this calculation). Here, mobile device types may include smartphones, handheld gaming consoles, portable music players, personal navigation devices, digital cameras, and/or the like.
- percent_tablet: Percentage of traffic from the IP that has the device type of tablet (note: only non-null values are used for this calculation).
- percent_wearable: Percentage of traffic from the IP that has the device type of wearable (note: only non-null values are used for this calculation). Wearable device types may include smart watches, head-mounted devices (e.g., smart glasses), activity/fitness trackers, and/or the like.
- percent_laptop: Percentage of traffic from the IP that has the device type of laptop (note: only non-null values are used for this calculation). Laptop device types may include traditional laptops, laptop/tablet hybrids (e.g., 2-in-1s), netbooks, and/or the like.
- percent_desktop: Percentage of traffic from the IP that has the device type of desktop (note: only non-null values are used for this calculation). Desktop device types may include consumer-oriented personal computers, workstations, video game consoles, and/or the like.
- normalized_entropy: The Shannon entropy of profile_atr_domain for the network address, representing how much confusion there is in the profile_atr_domains for the IP. The Shannon entropy is divided by the maximum possible entropy, yielding a value between [0.0, 1.0], which is the normalized entropy. Note that this value can be NaN when the Shannon entropy is 0, with the interpretation that the normalized entropy should be 0.
- profile_events_ratio: Compares the number of events generated by each profile at an IP, on average. In one example, an IP might have short-lived users who generate an average of two events, so this feature would have a value of 0.5. In another example, an IP might have many business users who generate on average 10 events, resulting in this feature taking a value of 0.1. Note that the range for this feature is (0, 1], unlike more intuitive reciprocals, which range over [1, infinity).
- ua_events_ratio: Similar to profile_events_ratio, except it uses the number of unique user agents instead of profiles.
- log10_mean_ips_visited: The log10 transform of the average number of network addresses visited by each profile at this IP.
- log10_mean_pages_visited_global: The log10 transform of the average number of pages visited/viewed globally by profiles that have been at this network address.
- log10_mean_pages_visited_at_ip: The log10 transform of the average number of pages visited/viewed at this network address by profiles that have been at this network address.
- log10_mean_avg_daily_ips_visited: The same as log10_mean_ips_visited, except it first averages over the daily IPs visited per profile. - The features described by Table 6.1 may correspond to the
features 1604 of the feature dataset 1712. Feature generator 1602 may identify any other feature 1604 that indicates how users may access content at different network address locations. As explained previously, NACS 1600 uses feature dataset 1712 to then identify the different types of establishments associated with different network addresses. - Beyond predicting whether or not a network address behaves like a business,
NACS 1600 can make other inferences about the type of physical location (e.g., hotel, coffee shop, hospital, and/or the like) or underlying application or process (e.g., mobile network operator, university, botnet, proxy) the network address supports. For instance, NACS 1600 may infer additional firmographic attributes, such as industry, company size, and/or the like. -
NACS 1600 may also predict other organization characteristics associated with network addresses 954. In the example above, NACS 1600 generated prediction values 820 that indicate the probability of network address 954 being associated with an org location (IS-BIZ). -
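The IS-BIZ example above is a binary prediction; the org-type prediction discussed next outputs one value per class. A minimal sketch, with a toy softmax scorer standing in for a trained classification model (the raw scores and scoring scheme are assumptions for illustration):

```python
import math

# Org type codes as listed in Table 6.2.
ORG_TYPES = ["ENT", "SMB", "COW", "RES", "EDU", "HOTL",
             "AIRP", "MIL", "MNO", "ISP", "B&P", "SHUB"]

def softmax(scores):
    """Convert raw per-class scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_org_types(class_scores):
    """Map raw per-class scores to one prediction value per org type."""
    return dict(zip(ORG_TYPES, softmax(class_scores)))

# Toy raw scores, one per org type (assumed, not real model output).
preds = predict_org_types([2.0, 1.1, 0.2, 3.0, 0.5, 0.1,
                           0.1, 0.0, 0.4, 0.9, 0.3, 0.2])
# The resulting probabilities form a single multiclass prediction vector.
```

A binary IS-BIZ model instead collapses these classes into two buckets (business vs. non-business), which is the distinction between binary and multiclass classification noted in the text.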
FIG. 21 depicts example org characteristics or org features (FORG) generated by the NACS 1600. Referring to FIGS. 16 and 21, the NACS 1600 may predict organization types 2140 associated with an external-facing network address 954. Examples of different org types 2140 are described by Table 6.2. -
TABLE 6.2
- Enterprise (ENT): A network address of an enterprise, such as a corporate office or the like, where an enterprise may be defined as an org with a predefined number of employees, such as equal to or more than 100 employees.
- Small-medium (SMB): A network address of a small-medium org location, where a small-medium business org is defined as an org with a smaller number of employees than an enterprise, such as less than 100 employees.
- Co-Workspaces (COW): A network address of a co-working environment shared by multiple organization entities. The individual orgs in a co-working environment may be small-medium business orgs or enterprise-level orgs.
- Residential (RES): A network address of a home or residence.
- Educational (EDU): A network address associated with an educational institution, such as schools, colleges, or universities.
- Hotel (HOTL): A network address of a hotel.
- Airport (AIRP): A network address of an airport.
- Military (MIL): A network address associated with a military installation, such as a military base, command center, air station, and/or the like.
- Mobile Network Operators (MNO): A network address of a network to support mobile device internet connectivity.
- Internet Service Provider (ISP): A network address of an internet service provider or other network to support device internet connectivity.
- BOT and PROXY (B&P): A network address of a network that supports non-human Internet traffic. For example, bots may access websites when crawling for content, and proxies may fetch data on behalf of users when navigating on webpages. This category may also include Internet of Things (IoT) devices and/or autonomous sensors.
- Social Hubs (SHUB): A network address of a public place where social gatherings are likely to take place, e.g., café, bar, restaurant, park, or other public/social org location. - NetAdr classification model 1618 may generate single prediction values 2144 for a group of
FORG 2140, such as predicting whether a network address is located at either an IS-BIZ or an IS-SMB. As another example, a multiclass classification model 1618 may generate separate prediction values 2144 for each different organization type 2140 in FORG vector 2142. In ML, multiclass or multinomial classification refers to classifying instances into one of three or more classes, and binary classification refers to classifying instances into two classes. - As mentioned previously,
CCM 100 may selectively process events 108 with network addresses associated with particular organization characteristics 2140. Network address classification substantially improves the performance of computer systems by allowing CCM 100 to filter out and reduce associated computer processing for events 108 associated with certain organization characteristics 2140. CCM 100 can also more accurately calculate consumption scores and detect surge events based on the FORG 2140 associated with events 108. - The embodiments discussed herein allow the
CCM 100 to generate more accurate intent data than existing/conventional solutions by distinguishing the locations or location types of various events 108, such as by distinguishing company events 108 from general public or user/customer events 108. The CCM 100 uses processing resources more efficiently by generating consumption scores for different types of locations, such as by generating certain consumption scores only for business-related intent data and/or other types of consumption scores for events sent from public locations. The CCM 100 may also provide more secure network analytics by generating consumption scores for network addresses without using PII, sensitive data, and/or confidential data, thereby improving information security for end-users. - The more accurate intent data and consumption scores allow
service providers 118 to conserve computational and network resources by providing a means for better targeting users, so that unwanted and seemingly random content is not distributed to users that do not want such content. This is a technological improvement in that it conserves network and computational resources of service providers 118 that distribute this content by reducing the amount of content generated and sent to end-user devices. Network resources may be reduced and/or conserved at end-user devices by reducing or eliminating the need to use resources to receive unwanted content, and computational resources may be reduced and/or conserved at end-user devices by reducing or eliminating the need to implement spam filters and/or reducing the amount of data to be processed when analyzing and/or deleting such content. - In some cases, it may be difficult to identify an org's intent (e.g., company purchasing intent) based on relatively brief user resource accesses (e.g., visits to a webpage, file downloads, and/or the like), relatively little user interaction with a webpage or web app, and/or when a webpage or web app contains relatively little content. However, a pattern of users visiting multiple resources (e.g., vendor sites) associated with the same or similar topics during the same or similar time periods may be used to identify a more urgent topic and/or predict org intent. In embodiments, a classifier (e.g.,
resource classifier 2240 of FIG. 22) may adjust relevancy scores 802 based on different resource (e.g., website) classifications and produce surge signals 812 that better indicate org interest in purchasing or otherwise consuming a particular product, service, or resource. -
FIG. 22 shows an example of how CCM 100 calculates consumption scores based on resource (e.g., website, platform, or the like) classifications. In this example, a computer 2200 may operate a client app 2204 (e.g., a browser, desktop/mobile app, and/or the like) to access information objects 112, for example, by sending appropriate HTTP messages or the like; in response, server-side application(s) may dynamically generate and provide code, scripts, markup documents, and/or other information object(s) 112 to the client app 2204 to render and display information objects 112 within the client app 2204 on screen 2202. Computer 2200, screen 2202, and client app 2204 may be the same as or similar to computer 1400, screen 1402, and client app 1404 discussed previously. Additionally or alternatively, the resource classifier 2240 may be the same as or similar to the NACS 1600 discussed previously. - As explained previously,
CCM tag 110 may generate events 108 for the network/web session that include various event data 950-958, such as an ID 950 (e.g., a user ID, session ID, app ID, and/or the like), a URL 952 for information objects 112, a network address 954, an event type 956, a timestamp 958, and engagement metrics (EM) 1410 indicating various user interactions with information object(s) 112. The EM 1410 may indicate a level of engagement or interest the user has in information object(s) 112. For example, a user may spend more time on a webpage and scroll through the webpage at a slower speed when the user is more interested in the information object(s) 112. - The
events 108 are provided to the event processor 244 in the same/similar manner as discussed previously. In this example, the event processor 244 includes and/or operates a resource classifier 2240 to classify information objects 2242 according to their type or class, and/or according to some other parameters/criteria. The CCM 100 (e.g., event processor 244 and/or CSG 800) may adjust relevancy scores 802 and/or the consumption scores 810 according to the classification of information objects 2242. - For example,
service provider 118, such as a news reporting/aggregation org, a social media/networking platform, or the like; and a second information object 2242B may be a website associated with a vendor, such as a manufacturer or retailer that sells products or services.CCM 100 may adjustrelevancy score 802 and resultingconsumption scores 810 based on information object(s) 112 being located on publisher information object 2242A or located on vendor information object 2242B. For example, it has been discovered that a user may be closer to making a purchase decision when viewing content on a vendor website 2242B compared to viewing similar content on a publisher website 2242A. Accordingly,CCM 100 may increaserelevancy score 802 associated with information object(s) 112 located on a vendor website 2242B or otherwiseweight relevancy score 802 for information object(s) 112 located on a vendor website 2242B more than information object(s) 112 located on aservice provider 118 website 2242A. -
CCM 100 may use the increased relevancy score 802 to calculate consumption scores 810 as described previously. The classification-based consumption scores 810 may be used to determine surges 812, as described with respect to FIG. 9, that more accurately indicate when orgs are ready to purchase or otherwise consume products, services, and/or resources associated with topics 102. - For purposes of the present disclosure, a service provider website 2242A may refer to any website that focuses more on providing informational content compared to content primarily directed to selling products or services. For example, the
service provider 118 may be a news service or blog that displays news articles and commentary, a service org or marketer that publishes content, a social media platform that publishes third-party and/or social media users' content, and/or the like. For purposes of the present disclosure, a vendor website 2242B may contain content primarily directed toward selling products or services, and may include resources/websites operated by manufacturers, retailers, distributors, wholesalers, and/or any other intermediary. - The example explanations below refer to service provider websites and vendor websites. However, it should be understood that the schemes described below may be used to classify any type of website that may have an associated structure, content, or type of user engagement. It should also be understood that the classification schemes described below may be used for classifying any group of content, including different content located on the same website or content located, for example, on servers or cloud systems.
-
FIG. 23 shows an example of resource classifier 2240 operation. In this embodiment, the resource classifier 2240 generates one or more graphs 2340 for one or more information objects 2344 (e.g., web resources such as websites, individual web pages, and/or the like) accessed by users or things. In one example, the resource classifier 2240 generates one graph 2340 for a corresponding information object 2344. The resource classifier 2240 may use any suitable graph drawing algorithm to generate the graph(s) 2340 such as, for example, a force-based graph algorithm, a spectral layout algorithm, and/or the like, such as those discussed in Tarawneh et al., "A General Introduction To Graph Visualization Techniques", Visualization of Large and Unstructured Data Sets: Applications in Geospatial Planning, Modeling and Engineering-Proceedings of IRTG 1131 Workshop 2011, Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, pp. 151-164 (2012), and/or Frishman, "Graph Drawing Algorithms in Information Visualization," Diss. Comp. Sci. Dep., Technion-Israel Institute of Technology (Jan. 2009), available at: http://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-info.cgi/2009/PHD/PHD-2009-02, each of which is hereby incorporated by reference in its entirety. - The
graph 2340 in the context of the present disclosure refers to a data structure or data type that comprises a number (or set) of nodes 2348 (also referred to as "vertices 2348", "points 2348", or "objects 2348"), which are connected by a number (or set) of edges 2346, arcs, or lines. A graph 2340 may be undirected or directed. In this embodiment, the graph 2340 may be an undirected graph, wherein the edges 2346 have no orientation and/or pairs of nodes 2348 are unordered. In other embodiments, the graph 2340 may be a directed graph in which edges 2346 have an orientation, or where the pairs of vertices 2348 are ordered. An edge 2346 has two or more vertices 2348 to which it is attached, called endpoints or nodes 2348. Edges 2346 may be directed or undirected; undirected edges 2346 may be referred to as "lines" and directed edges 2346 may be referred to as "arcs" or "arrows." - In the example of
FIG. 23, the graph 2340 includes multiple nodes 2348, where each node 2348 is associated with a content item or other element on, or accessible through, an information object 2344. In one example, the information object 2344 is a website and each node 2348 is a webpage belonging to the website. In another example, the information object 2344 is a webpage and each node 2348 is a data element that contains a data item, a content item, and/or one or more attributes (if any) (e.g., as indicated by an opening tag, closing tag, and any content therebetween). Additionally or alternatively, one or more of the nodes 2348 may be a component of a web app 2344. In another example, the graph 2340 may be a tree data structure, such as a Document Object Model (DOM) data structure of an information object 2344, or one or more elements that make up the information object 2344. The DOM is a data representation of the objects that comprise the structure and content of an information object 2344 (e.g., a webpage or web app, XML document, and/or the like). The DOM is an object-oriented representation of the information object 2344, which can be modified with a scripting language such as JavaScript or the like. The scripting language may utilize a DOM API (e.g., the HTML DOM API or the like) to access and/or manipulate the DOM. In another example, the information object 2344 is a scripting language document (e.g., JavaScript) and each node 2348 is a data element and/or object, including any attributes, properties, data/content, and/or the like. In another example, the information object 2344 is an archive file or a file path/directory, and each node 2348 is a file contained inside the archive file or file path/directory, including the content of each file (if any). Any of the aforementioned examples could be combined with any other example, and/or any other information object 2344 may be used/analyzed in other embodiments. - As an example, each node 2348 in the
graph 2340 may represent individual web resources (e.g., referred to as "webpages 2348" or "web resources 2348") on a website 2344, and the edges 2346 between the individual nodes 2348 may represent links or other like relationships between the different nodes 2348 (also referred to as "sublinks 2346" or "links 2346"). In this example, a first home page 2348A on website 2344 may include sublinks to webpages 2348B-2348H. Webpage 2348G may include second-level sublinks 2346 to additional webpages. Webpage 2348D may include a second-level sublink 2346 to webpage 2348I. -
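The link structure just described can be modeled as a small undirected graph, with each page's sublevel found by breadth-first search (the fewest sublinks separating it from the home page). The adjacency representation and page labels are illustrative assumptions:

```python
from collections import deque

# Home page 2348A links to first-level pages B-H; page D links one level deeper.
links = {
    "2348A": {"2348B", "2348C", "2348D", "2348E", "2348F", "2348G", "2348H"},
    "2348D": {"2348I"},
}

def sublayer_depths(links, root):
    """Fewest-sublink distance from the root page to every reachable page."""
    adj = {}
    for src, dsts in links.items():
        for dst in dsts:
            adj.setdefault(src, set()).add(dst)
            adj.setdefault(dst, set()).add(src)  # sublinks treated as undirected
    depth = {root: 0}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for nbr in adj.get(page, ()):
            if nbr not in depth:
                depth[nbr] = depth[page] + 1
                queue.append(nbr)
    return depth

depths = sublayer_depths(links, "2348A")
# A wide first level with only one deeper sublevel (a shallow tree) would,
# under the heuristic described in the text, suggest a vendor website.
```

The per-page sublink counts and the maximum depth recovered here are the kinds of structural metrics the classifier can feed into its website-type prediction.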
Resource classifier 2240 may classify information object 2344 based on the structure of graph 2340. Continuing with the previous example, home page 2348A in graph 2340 may include sublinks 2346 to many sub-webpages 2348B-2348H. Graph 2340 may also include only a few webpage sublevels below home page 2348A. For example, nodes 2348B-2348H are located on a first sub-level below home page 2348A, and only one additional webpage sublevel exists, which includes webpage 2348I. - In some embodiments, a website 2344 with a
home page 2348A with a relatively large number of sublinks 2346 to a large number of first-level subpages 2348B-2348H is more likely to represent a vendor website 2344. For example, a vendor website may include multiple products or services all accessed through the home page. Further, a vendor website 2344 may have a relatively small number of lower-level sublinks 2346 and associated webpage sublevels (shallow depth). In this example, resource classifier 2240 may predict that website 2344 is associated with a vendor. - In another example,
home page 2348A may include relatively few sublinks 2346 to other webpages 2348. Further, there may be many more sublayers of webpages 2348 linked to other webpages. In other words, graph 2340 may have a deeper tree structure. In this example, resource classifier 2240 may predict that website 2344 is associated with a service provider 118. - Based on the structure of
graph 2340 in FIG. 23, resource classifier 2240 may predict that website 2344 is a vendor website. A company accessing a vendor website may indicate more urgency in a company intent to purchase a product associated with the website. Accordingly, resource classifier 2240 may increase the relevancy scores 802 produced from information object(s) 112 accessed from vendor website 2344. - This is just one example of how
resource classifier 2240 may classify websites 2344 based on an associated webpage structure. In other embodiments, the resource classifier 2240 may classify websites 2344 based on one or more machine learning (ML) features 2350 (or simply "features 2350") extracted from information objects 2344 (e.g., extracted from HTML in webpages of a website at URLs 952 identified in events 108). - In embodiments, the
resource classifier 2240 first determines if a graph 2340 already exists for the information object 2344 associated with URL 952 in event 108. If a graph 2340 already exists, resource classifier 2240 may check a timestamp 958 in event 108 against a timestamp assigned to graph 2340 to determine if the graph 2340 should be updated (e.g., the timestamp assigned to graph 2340 is earlier in time than the timestamp 958, or vice versa). If a graph 2340 has not been created for information object 2344, or the graph 2340 needs or should be updated, resource classifier 2240 obtains the information object and analyzes the elements of the obtained information object (e.g., by downloading the HTML for the webpages on website 2344). - In embodiments, the
resource classifier 2240 extracts or otherwise generates one or more ML features 2350 for each node 2348 and generates an associated graph 2340 based on those features 2350. For example, as a first feature 2350, the resource classifier 2240 determines the number of sublinks 2350A for each node 2348 contained in the graph 2340 based on the data elements and/or other aspects of the information object 2344 (e.g., tags or other data elements in HTML documents). As a second feature 2350, the resource classifier 2240 identifies/determines the (sub)layer locations 2350B of respective nodes 2348 within graph 2340. For example, resource classifier 2240 may identify the fewest number of sublinks 2346 separating a node 2348 from the homepage node 2348A. - After identifying
sublayer locations 2350B for each node 2348, the resource classifier 2240 may derive graph 2340 identifying the relationships between each node 2348. While shown graphically in FIG. 23, graph 2340 may also or alternatively be generated in a table format that identifies the relationships between different nodes 2348 and provides additional graph metrics, such as the number of node layers, the number of nodes on each node layer, the number of links for each node layer, and/or other like information/aspects. - As mentioned previously, the number of
sublinks 2350A and/or the association of links 2346 with other nodes 2348 may indicate the structure and associated type or class of information object 2344. In one embodiment, a deeper tree structure with more lower-level nodes 2348 linked to other lower-level nodes 2348 may indicate a service provider website 2344. Additionally or alternatively, a shallower tree structure with fewer node levels or fewer links at higher node levels may indicate a vendor website 2344. - As a
third feature 2350, the resource classifier 2240 may generate a topic profile 2350C for each node 2348. For example, event processor 244 may use content analyzer 242 in FIG. 2 to identify a set of topics 102 contained in an information object (e.g., webpage). The topic profile 2350C may provide an aggregated view of the content of a particular node 2348. - As a
fourth feature 2350, the resource classifier 2240 may also generate topic similarity values 2350D indicating the similarity of topics 102 of a particular node 2348 with topics 102 of other linked nodes 2348 on a higher graph level, the same graph level, or lower graph levels, or the similarity with topics 102 for unlinked nodes 2348 on the same or other graph levels. -
topics 2350C than nodes 2348 on a vendor website 2344. In another example, similar topics for nodes 2348 on a same graph level or nodes on a same branch ofgraph 2340 may more likely represent a vendor website. - The
resource classifier 2240 may identify topic similarities 2350D by identifying the topics on a first webpage, such as home webpage 2348A. The resource classifier 2240 then compares the home page topics with the content on a second webpage. Content analyzer 242 in FIG. 2 then generates a set of relevancy scores indicating the relevancy or similarity of the second webpage to the home page. Of course, resource classifier 2240 may use other natural language processing (NLP) and/or natural language understanding (NLU) schemes to identify topic similarities between different nodes 2348. The resource classifier 2240 may generate topic similarities 2350D between any linked nodes 2348, nodes 2348 associated with the same or different graph levels, or any other node relationship. - As a
fifth feature 2350, the resource classifier 2240 may generate impressions 2350E for each node 2348. As described previously with respect to FIGS. 14 and 15, CCM 100 may generate consumption scores 810 and identify company surges 812 based on user EM 1410. The impressions 2350E may indicate a level of engagement or interest the user has in the webpage 2348. For example, impressions 2350E may indicate how long the user dwelled on a particular webpage 2348, how the user scrolled through content in the webpage 2348, touch data when touch interfaces are used, gaze times and/or gaze locations when eye tracking technologies are used, and/or the like. The user may spend more time on a webpage and scroll at a slower speed when more interested in the webpage information object(s) 112. Longer gaze times at certain regions of interest may also indicate user interest in a certain information object or content. - The
resource classifier 2240 may use impressions 2350E to classify web resources 2344. For example, users on a news website 2344 may on average spend more time reading articles on individual webpages 2348 and may scroll multiple times through relatively long articles. Users on a vendor website 2344 may on average spend less time viewing different products and scroll less on relatively short webpages 2348. A user may also access a news website more frequently, such as every day or several times a day, but may access vendor websites 2344 much less frequently, such as only when interested in purchasing a particular product or service. In addition, users may spend more time on more webpages of a news-related website when there is a particular news story of interest that may be distributed over several service provider news stories. This additional engagement on the news website could be mistakenly identified as a company surge, when actually the additional engagement is due to a non-purchasing-related news topic. On the other hand, users from a same company viewing multiple vendor websites within a relatively short time period, and/or viewing the vendor websites with additional engagement, may represent an increased company urgency to purchase a particular product. Accordingly, the resource classifier 2240 may take these different behavior patterns into account when classifying different information objects 2344. It should be noted that other types/classes of information objects 2344 may be identified/determined, and the resource classifier 2240 may accommodate or account for different user behaviors for those types/classes of information objects 2344 when performing various classification operations. - The
resource classifier 1640, or another module/element in event processor 244, may generate engagement scores 812 (“surge scores 812”) for each node 2348 of the information object 2344 as described previously with respect to FIGS. 14 and 15. The resource classifier 1640 may then classify the information object 2344 as a particular type/class (e.g., service provider) based at least partially on nodes 2348 having higher engagement scores where users on average spend more time on the webpages 2348 and visit the webpages 2348 more frequently. The resource classifier 1640 may classify web resources 2344 as a particular type/class (e.g., a vendor website) based at least partially on webpages 2348 having lower engagement scores where users spend less time on the webpage and visit the webpage less frequently, or have more isolated engagement score increases. In addition, the resource classifier 1640 may classify a web resource 2344 as a vendor website when the users view content associated with pricing. - The
resource classifier 1640 may generate an average engagement score 812 for the nodes 2348 of the same information object 2344 and use this average engagement score 812 as the engagement score 812 for that information object 2344. Additionally or alternatively, the resource classifier 1640 may increase the relevancy score 802 when the amount and pattern of engagement scores 812 indicate a vendor website 2344 and may reduce the relevancy score 802 when the amount and pattern of engagement scores 812 indicate a service provider website 2344. - Different types of information objects may contain different amounts of content. For example, individual webpages 2348 on a service provider website 2344 may generally contain more text (deeper content) than individual webpages 2348 on a vendor website (shallower content). In embodiments, the
resource classifier 1640 may calculate, as a sixth feature 2350, the amounts of content 2350F for individual nodes 2348 in information objects 2344. For example, the resource classifier 1640 may count the number of words, paragraphs, documents, pictures, videos, images, and/or the like contained in individual webpages 2348. In some embodiments, different weights or scaling factors may be applied to different types of content when determining the sixth feature 2350. - In some embodiments, the
resource classifier 1640 may calculate an average amount of content 2350F in nodes 2348 on the same website 2344. For example, a greater-than-average content amount (e.g., above some threshold amount or within some threshold range) may more likely represent a service provider website 2344 and a less-than-average amount of content 2350F (e.g., below some threshold amount) may more likely represent a vendor website 2344. In these cases, the resource classifier 1640 may increase the relevancy score 802 when the average amount of content 2350F indicates a vendor website 2344 and may reduce the relevancy score 802 when the average amount of content 2350F indicates a service provider website 2344. - Different types of information objects may contain different types of content. For example, service provider websites 2344 may contain more advertisements than vendor websites 2344. In another example, vendor sites may have a “contact us” webpage, product webpages, purchase webpages, and/or the like. A “contact us” link in a service provider website may be hidden in several levels of webpages compared with a vendor website where the “contact us” link may be located on the home page. A vendor website may also have a more prominent hiring/careers webpage. In these embodiments, the
resource classifier 1640 may identify/determine, as a seventh feature 2350, different types and locations of content 2350G in the information object's source code (e.g., webpage HTML). In one example, the resource classifier 1640 may identify inline frames (iframes) in the webpage HTML. The HTML inline frame element (<iframe>) represents a nested browsing context, embedding another HTML page into a current HTML page. An iframe may be an HTML document embedded inside another HTML document and is often used to insert content from another source, such as an advertisement. - Additionally or alternatively, other types of
content 2350G may be associated with particular types of information objects 2344. For example, vendor websites may include more webpages associated with employment opportunities or include webpages identifying the management team of the company. In another example, both service provider webpages and vendor webpages may include links to employment opportunities. However, vendor websites may more frequently locate a prominent link from the homepage to employment opportunities, while service provider websites may more frequently embed links to the employment opportunities among many other links to service provider news content. The total number of links from a vendor homepage may be less and a “Careers” page link may be, for example, 1 out of 10 total links. A service provider homepage may have many more links and include the careers opportunity link nested within them. - The
resource classifier 1640 may also classify web resources 2344 based on these other content type features 2350G and/or content location features 2350G. The content type features 2350G may be or indicate the type of content embedded in web resources 2344 and/or otherwise rendered within web resources 2344 such as, for example, text, images, graphics, audio, video, animations, and/or the like. The content type features 2350G may also include or account for styles employed by the web resources 2344 (e.g., various color schemes, fonts, and/or the like as indicated by a Cascading Style Sheet (CSS) or other style sheet language documents) and/or various user interface elements employed by the web resources 2344. The content location features 2350G may include, indicate, or refer to the position and/or orientation of content items within a web resource 2344 with respect to some reference or with respect to some other content item (e.g., based on the CSS position property or the like). In some embodiments, the resource classifier 1640 may also identify “infinite scroll” techniques or “virtual page views” as features 2350G that allow web resource visitors to continually scroll through (up/down) a page and, at the end of the content, produce a new article to continue reading within the same page without clicking a link. Examples of such websites include Facebook.com, Forbes.com, BusinessInsider.com, and the like. - The
resource classifier 1640 may also classify web resources 2344 based on content update frequency features 2350H. For example, a service provider web resource 2344 may update and/or replace content, such as news articles, more frequently than a vendor website replaces webpage content for products or services. In embodiments, the resource classifier 1640 identifies topics on the web resources 2344, 2348 over some period of time (e.g., every day, week, or month), and generates an update value/feature 2350H indicating the frequency of topic changes on the web resources 2344, 2348 over the period of time. In some implementations, a higher update value 2350H may indicate service provider resources 2344, 2348 and a lower update value 2350H may indicate vendor resources 2344, 2348. - The
resource classifier 1640 may use any combination of features 2350 to classify information objects 2344. Additionally, the resource classifier 1640 may weight some features 2350 higher than other features 2350. For example, the resource classifier 1640 may assign a higher vendor score to a website 2344 identified with a shallow graph structure 2340 compared with a website 2344 identified with relatively shallow content 2350F. - In embodiments, the
resource classifier 1640 generates a classification value for an information object 2344 based on the combination of features 2350 and associated weights (if any). The resource classifier 1640 then adjusts the relevancy score 802 based on the classification value. In one example, the resource classifier 1640 may increase the relevancy score 802 or consumption score 810 more for a larger vendor classification value and may decrease the relevancy score 802 or consumption score 810 more for a larger service provider classification value. -
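The weighted combination and relevancy adjustment described above can be sketched as follows. The feature names, weight values, and the sign convention (positive values suggest a vendor resource, negative values a service provider resource) are hypothetical illustrations, not values from the specification.

```python
# Illustrative sketch of a classification value computed as a weighted
# sum of resource features. Feature names and weights are hypothetical;
# positive values are treated as vendor-like, negative as
# service-provider-like.

def classification_value(features, weights):
    """Weighted sum of feature scores, keyed by feature name.
    Features without an explicit weight default to 1.0."""
    return sum(weights.get(name, 1.0) * score for name, score in features.items())

def adjust_relevancy(relevancy, value, step=0.1):
    """Raise the relevancy score for vendor-like (positive)
    classification values and lower it for service-provider-like
    (negative) values."""
    return relevancy + step * value
```

For example, features `{"shallow_graph": 1.0, "deep_content": -0.5}` with weights `{"shallow_graph": 2.0}` yield a classification value of 1.5, which nudges the relevancy score upward.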
FIG. 24 shows an example process 2400 for identifying surge scores 812 based on resource classifications. Process 2400 begins at operation 2402 where the resource classifier 1640 receives an event 108 (e.g., from tags 110) that includes various event data such as an ID, URL, event type, engagement metrics, and/or any other information identifying content, activity, user interaction, and/or the like, associated with an information object 112. In some embodiments, the resource classifier 1640 first may determine if a graph 2340 already exists for the information object 112 associated with the URL included in the event 108. If an up-to-date graph 2340 exists, the resource classifier 1640 may have already classified the information object 112. If so, the resource classifier 1640 may adjust any derived relevancy scores 802 based on the resource classification. Otherwise, the resource classifier 1640 may proceed to operation 2404 to determine the structure of the information object 112. - At
operation 2404, the resource classifier 1640 determines the structure of the information object 112 by, for example, analyzing the information object 112 to identify the various nodes 2348 making up the information object 112. Additionally or alternatively, operation 2404 may include generating a graph 2340 for the information object 112. In one example, the resource classifier 1640 crawls through the information object 112, identifies and/or determines each node making up the information object 112, and identifies/determines the links/relationships between each of the nodes 2348. In one example, when the information object 112 is a website, the resource classifier 1640 starts the crawling beginning at a home page of the website associated with the received event. Additionally or alternatively in this example, the resource classifier 1640 identifies links on the home page to other webpages. The resource classifier 1640 then identifies links in the HTML of the lower level pages to other pages to generate a website graph or tree structure 2340 as shown in FIG. 23. In another example, the generated tree structure 2340 may be similar to a DOM or the like. - At
operation 2406, theresource classifier 1640 extracts various features from/for each node 2348 as described previously. For example, when theinformation object 112 is a website, theresource classifier 1640 may identify the number of sublinks, layers of webpages, topics, engagement metrics (e.g., impressions, and/or the like), amounts and types of content, number of updates, and/or the like associated with each webpage. - At
operation 2408, the resource classifier 1640 classifies the information object 112 based on the identified/determined structure (see e.g., operation 2404) and the extracted/generated features 2350 (see e.g., operation 2406). In one example, the resource classifier 1640 may use any combination of the features 2350 discussed previously to generate a classification value for the information object 112. As explained previously, the resource classifier 1640 may also weight different node features 2350 differently. For example, the resource classifier 1640 may assign a larger weight to a website graph structure indicating a service provider website and assign a lower weight to a particular type of content associated with service provider websites. Based on all of the weighted features 2350, the resource classifier 1640 may generate the classification value predicting the type of information object 112. - At
operation 2410, the resource classifier 1640 adjusts the relevancy score 802 for org topics based on the classification value. For example, the resource classifier 1640 may increase the relevancy score 802 more for a larger vendor classification value and may reduce the relevancy score more for a larger service provider classification value. Other implementations are possible in other embodiments. - The
CCM 100 may use the information object structure and features 2350 described previously to improve topic predictions for information objects 2344, 112 or for individual nodes 2348. For example, when an information object 2344, 112 is a website, the CCM 100 may identify a most influential page 2348 of the website 2344, which may be a page 2348 with the most links, the most content, the most user visits, or having some other aspects/features greater or different than other pages 2348 of the website 2344. Webpages 2348 that are a closer distance to the most influential webpage 2348 (e.g., with a fewer number of links or hops from the most influential webpage 2348) may be identified as more influential than webpages 2348 that are at a further distance from the most influential webpage 2348. For example, a webpage 2348 separated from most of the other webpages 2348 and with few sublinks may be identified as less influential in the website 2344 than webpages 2348 with more connections to other webpages 2348. In this example, the CCM 100 may increase the topic prediction values for more influential webpages 2348 or webpages 2348 directly connected to the most influential webpages 2348 and/or reduce the topic prediction values for less influential webpages 2348. - In some embodiments, the
resource classifier 1640 may modify relevancy scores 802 based on the org associated with the website 2344. For example, the resource classifier 1640 may increase the relevancy score 802 for an identified vendor website 2344 and/or the resource classifier 1640 may increase the relevancy score 802 even more for websites 2344 operated by the org requesting the consumption score 810. - In various embodiments, the
CCM 100 and/or the resource classifier 1640 may use the structure of graph 2340 to train topic models. For example, during ML model training, the topic model may generate topic relevancy ratings (e.g., relevancy scores 802) for different information objects 2344, 112 (e.g., individual webpages 2348 of a website 2344). In some cases, the ML model may not accurately identify the topics on a first webpage 2348 but may accurately identify the topics on other closely linked webpages 2348. During training and testing, model performance may be rated not only on the accuracy of identifying topics 102 on one particular webpage 2348 but also based on the accuracy of identifying related topics 102 on other closely linked pages 2348. - Instead of generating surge scores for
individual topics 102, in some embodiments, the CCM 100 may generate a surge score 812 for a selected bundle of topics 102. In these embodiments, the CCM 100 may take the average consumption scores 810 for the bundle of topics 102 to generate one org consumption score 810. The org topic bundle may provide a more general relationship indicator for when and how to contact an org. For example, an entity may respond to a specific topic surge 812 by making a phone call or sending emails regarding a specific product to org personnel (e.g., company employees or the like). The entity may respond to the bundle topic with less aggressive and more general topic information. - The topic bundles can also aggregate views across industries or for any customizable domain level. For example, the
CCM 100 may determine the surging topics for a group of orgs. A surge 812 identified for the group of orgs may direct another org to increase development or production in the identified topic or topic bundle. - In some embodiments, the
CCM 100 may use different data sources and events 108 to identify information about the same user. As alluded to previously, the data sources may include Dun & Bradstreet®, Equifax®, profile data from monitored websites or social media websites, and/or any other third-party sources. The user information may include the user phone numbers, job titles, company names and addresses, email addresses, and/or the like. However, in some cases, some of the data may be outdated or incorrect. For example, the different data sources may identify three different job titles for the same user. - In embodiments, the
CCM 100 may generate a truth set that ranks the reliability of the data sources. For example, if three data sources provide the same piece of information for a same user, such as job title, each data source may be ranked higher for that particular piece of information. If two of the data sources have the same piece of user information and the third data source has a different piece of user information, CCM 100 may rank the third data source lower for that piece of user information. - Thus, the truth set ranks all of the data sources based on the amount of data in agreement with the other data sources. For example, the first data source may have a high ranking for job title and a low ranking for user phone number. The second data source may have a high ranking for email addresses but a low ranking for job titles.
CCM 100 may use the highest ranked data sources for each of the different types of user data to populate the user profiles 104B as described previously in FIG. 4. -
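The agreement-based ranking described above can be sketched as follows. The record layout and the scoring rule (one point per agreeing peer source) are assumptions made for illustration; the specification does not prescribe a particular data structure.

```python
from collections import Counter

# Illustrative sketch: rank data sources per field by how many *other*
# sources report the same value for the same user. The {source: {field:
# value}} layout is a hypothetical shape, not the spec's format.

def rank_sources(records):
    """records: {source_name: {field: value}} for one user.
    Returns {field: {source_name: count of other sources in agreement}}."""
    fields = {field for rec in records.values() for field in rec}
    rankings = {}
    for field in fields:
        counts = Counter(rec[field] for rec in records.values() if field in rec)
        rankings[field] = {
            src: counts[rec[field]] - 1  # subtract the source itself
            for src, rec in records.items() if field in rec
        }
    return rankings
```

With three sources where two report "engineer" and one reports "manager" for job title, the two agreeing sources each score 1 and the outlier scores 0, so the outlier would be ranked lower for that field.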
CCM 100 may also compare the derived truth set with other behavioral data generated for the same user. For example, as described previously,CCM 100 may generate a user profile 104A and user intent vector 594 based on theevents 108 associated with the user. - Based on the identified user intent vector 594 and user behavioral profile,
CCM 100 may identify the user as an engineer. For example, the highest relevancy topics for the user may correlate with intent vectors 594 for other users identified as engineers. Similarly, software engineers may be more likely to access data from certain types of data sources, such as Stackoverflow.com. CCM 100 may rank the data sources and generate the truth set based on the similarity of user activities and accessed data sources. -
FIG. 25 shows an example of howevent processor 244 convertsraw events 108 into hostname events. As explained previously, CCM tags 110 may captureevents 108 identifying information objects 112 accessed by users during web/network or application sessions.Events 108 may include an ID (user ID, and/or the like) 950,URL 952,network address 954,event type 956, time stamp (TS) 958,engagement metrics 1410, and/or other like information. CCM tags 110 may captureevents 108 from a group of information objects 112 and store theevents 108 in araw events database 2502.Raw events database 2502 can also receiveevents 108 from any other collection system. In one example, a bulk set ofevents 108 can be sourced from another collection entity/service, and loaded into theraw events database 2502. In another example, theraw events database 2502 may be owned/operated by another entity/service, andraw events 108 can be obtained from theraw events database 2502 using suitable APIs and/or the like. -
Event processor 244 in CCM 100 operates an entity predictor 2504 and a hostname extractor 2506 that together operate as a consumption event transform. The entity predictor 2504 predicts or otherwise determines an entity 2512 associated with network address 954, such as by predicting/determining an org name for entity 2512. For example, entity predictor 2504 may access a NetAdr-Org database 806 (see e.g., FIG. 8) that stores org names for associated network addresses 954 (e.g., IP addresses or the like). Entity predictor 2504 may also predict/determine entity 2512 for network addresses 954 from user profile data 104 (see e.g., FIG. 4). For example, users may identify their associated orgs during web/network sessions in a manner discussed previously. CCM 100 may store the identified org names in user profile data 104 or in an org profile and then map the org name identified in the user profiles to network address 954. - An org may be associated with
hostname 2510. In one example, a company called “Acme Co.” may sell firewall software and/or network appliances, and operate a website associated with hostname 2510. The website associated with hostname 2510 may include information about firewall products/services sold or otherwise provided by Acme Co. The company, entity, organization, person, and/or the like associated with hostname 2510 and the associated website is referred to herein as a “first party 2511” and a resource (e.g., website, webpage, and/or the like) associated with hostname 2510 is referred to herein as a “hostname resource 2510.” -
Hostname extractor 2506 extracts a hostname 2510 from URL 952. For example, URL 952 in an event 108 may include: “http://www.acme.com/about.us.” In this example, the hostname extractor 2506 may identify hostname 2510 for URL 952 as the domain name “acme.com.” Then, the event processor 244 generates an enriched set of hostname events 2508 that replace URLs 952 with extracted hostnames 2510 and replace the network addresses 954 with predicted/determined entity/org names 2512. -
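The extraction step can be sketched with Python's standard `urllib`. Stripping a leading "www." is a simplification; a production extractor might instead consult a public-suffix list to find the registrable domain.

```python
from urllib.parse import urlparse

def extract_hostname(url):
    """Extract a hostname 2510 from an event URL 952.

    Drops a leading 'www.' so 'http://www.acme.com/about.us' maps to
    'acme.com'. Simplified sketch: a real implementation might use a
    public-suffix list rather than this prefix check."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host
```

For example, `extract_hostname("http://www.acme.com/about.us")` returns `"acme.com"`, while a subdomain such as `iot.acme.com` is preserved as-is.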
FIG. 26 shows an example resource interest detector (RID) 2600 that generates different resource interest features (RIFs) 2622 fromhostname events 2508. The RID 2600 includes aRIF generator 2620 that generates one ormore RIFs 2622. TheRIFs 2622 may be machine learning (ML) features. In one example implementation, theevent processor 244 inCCM 100 generatesdifferent RIFs 2622 fromhostname events 2508. In this example implementation, theevent processor 244 operates theRIF generator 2620 to generateRIFs 2622 similarly to how consumption scores 810 are generated as discussed previously for a particular topic and org. However, in this example implementation, theevent processor 244 generatesRIFs 2622 based on events generated byentity 2512 while accessing one ormore hostname resources 2510. - In embodiments, the
feature generator 2620 aggregateshostname events 2508 based onentity 2512 andhostname 2510 to generate (or compute)specific RIFs 2622. For example, a set ofhostname events 2508 may includeentity 2512 for Org X and may include ahostname 2510 for hostname resource 2510 (e.g., Acme.com). Thesehostname events 2508 represent interactions ofentity 2512/Org X (entity) with the Acme.com website. - In various embodiments, the
RIFs 2622 are metrics specifically engineered to capture entity 2512 interest in a hostname resource 2510. In these embodiments, the feature generator 2620 generates an event count feature 2622A (Fec) which may be an event count ratio, a unique user feature 2622B (Fuu) which may be a unique user count ratio, and an engagement score feature 2622C (Fes) which may be an engagement score ratio. The feature generator 2620 generates these RIFs 2622 from individual events 2508 accumulated over a predetermined time period (e.g., each day, week, month, hour, and/or any other time period) from the respective hostname resources 2510. Different RIFs 2622 may be generated for different hostname resources 2510 in other embodiments. In alternative embodiments, the RIFs 2622 may be other metrics that are specifically designed/engineered for other purposes/use cases. -
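The aggregation of hostname events by entity and hostname described above might look like the following sketch; the event pair shape is a hypothetical simplification of the enriched hostname events 2508.

```python
from collections import Counter

# Illustrative sketch: count hostname events 2508 per (entity, hostname)
# pair. The (entity, hostname) tuple shape is assumed for illustration;
# real events carry additional fields (IDs, timestamps, metrics).

def aggregate_events(events):
    """Return a Counter mapping (entity, hostname) -> event count."""
    return Counter(events)
```

For example, two events for `("Org X", "acme.com")` and one for `("Org X", "news.example")` would yield a count of 2 for the {Acme.com, Org X} combination.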
FIG. 27 shows an example process for generating event count feature (Fec) 2622A byfeature generator 2620. This process begins atoperation 2701 wherefeature generator 2620 determines (e.g., counts) the total number of events aparticular entity 2512 generates from all web resources over a predetermined period of time (e.g., a day, week, month, hour, and/or any other time period). For example, over one period of time (e.g., one day) employees of Org X (entity 2512) may access a variety of different websites and generate a total of 4350 events. - At
operation 2702, thefeature generator 2620 counts the number ofevents entity 2512 generates from ahostname resource 2510. Continuing with the previous example, over the same day employees of Org X may access the Acme.com website a total of 340 times. In other words, there may be 340events 2508 that include the hostname/entity combination {Acme.com, Org X}. Atoperation 2703, thefeature generator 2620 determines/calculates a relationship of hostname related events derived inoperation 2702 to the total number of events derived inoperation 2701. Continuing with the previous example,feature generator 2620 may calculate an event count feature (Fec) or event count ratio as: -
Fec = 340/4350 ≈ 0.078
- In some embodiments, the
feature generator 2620 considers additional normalization methods inoperation 2703 to control for global variance in counts of events collected by CCM tags 110. -
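Using the numbers from the example above (340 hostname events out of 4350 total), the event count feature reduces to a simple ratio; the zero guard and the omission of the additional normalization mentioned in the text are simplifications.

```python
# Sketch of the event count feature Fec described above: the fraction of
# an entity's events generated on the hostname resource over the period.
# Additional normalization (for global variance in tag collection) is
# omitted for simplicity.

def event_count_feature(hostname_events, total_events):
    """Fec = hostname_events / total_events, or 0.0 if no events."""
    if total_events == 0:
        return 0.0
    return hostname_events / total_events
```

With the example figures, `event_count_feature(340, 4350)` is roughly 0.078.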
FIG. 28 shows an example process for generating a unique user feature (Fuu) by thefeature generator 2620. This process begins atoperation 2801 where thefeature generator 2620 determines (e.g., counts) the total number of unique users forentity 2512 that accessed any information object over a predetermined period of time (e.g., a day, week, month, hour, and/or any other time period). For example, thefeature generator 2620 may count the total number ofunique user IDs 950 associated with Org X that generatedevents 2508 from any web resource. Inoperation 2802, thefeature generator 2620 determines (e.g., counts) the number of unique users fromentity 2512 that generated events fromhostname resource 2510. For example,feature generator 2620 may count the number ofunique user IDs 950 inevents 2508 that include hostname Acme.com and entity Org X. In operation 2803,feature generator 2620 determines (e.g., calculates) the relationship of unique users forentity 2512 that accessedhostname resource 2510 to the total number of unique users forentity 2512 that accessed content on any resource. For example,feature generator 2620 divides the number of unique users counted inoperation 2802 by the number of unique users counted inoperation 2801.Feature generator 2620 might consider additional normalization methods in operation 2803 to control for global variance in unique users in events collected by CCM tags 110. -
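The unique-user ratio of FIG. 28 can be sketched as below. The `(user_id, entity, hostname)` tuple layout is an assumed shape for illustration, and the additional normalization mentioned in the text is omitted.

```python
# Sketch of the unique user feature Fuu: unique users of an entity seen
# on the hostname resource, divided by the entity's unique users across
# all resources over the period. Event tuple shape is hypothetical.

def unique_user_feature(events, entity, hostname):
    """events: iterable of (user_id, entity, hostname) tuples."""
    all_users = {uid for uid, ent, _ in events if ent == entity}
    host_users = {uid for uid, ent, host in events
                  if ent == entity and host == hostname}
    return len(host_users) / len(all_users) if all_users else 0.0
```

For example, if two unique Org X users generated events anywhere and one of them visited acme.com, Fuu for {acme.com, Org X} is 0.5.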
FIG. 29 shows an example process for generating an engagement score feature (Fes) by thefeature generator 2620. This process begins atoperation 2901 where thefeature generator 2620 generates engagement scores for the content accessed byentity 2512 over a predetermined period of time (e.g., a day, week, month, hour, and/or any other time period). As explained previously inFIG. 15 ,event generator 240 may receiveevents 108 that includeengagement metrics 1410 such as content impressions and/or the like. Theengagement metrics 1410 may identify user interactions with information objects 112 including tab selections that switch to different pages, page movements, mouse page scrolls, mouse clicks, mouse movements, scroll bar page scrolls, keyboard page movements, touch screen page scrolls, gaze locations, touch coordinates and/or touch pressure data, and/or any other content movement or content manipulation indicator(s). As alluded to previously, theevent processor 244 may assign higher engagement scores toengagement metrics 1410 that indicate a higher user interest and assign lower engagement scores toengagement metrics 1410 that indicate lower user interest. For example,event processor 244 may assign a larger engagement score when the user spends more time actively dwelling on a page and may assign a smaller engagement score when the user spends less time actively dwelling on a page. Inoperation 2901,feature generator 2620 may add up, average, or perform some other calculation or apply one or more functions to all of the engagement scores generated from all information objects 112 accessed byentity 2512 over the predefined time period. - At
operation 2902, thefeature generator 2620 determines engagement scores generated by anentity 2512 from information objects 112 on or at ahostname resource 2510. In one example, thefeature generator 2620 may add up, or average, or apply some other suitable function to all engagement scores generated from information objects 112 accessed onhostname resource 2510 byentity 2512 over the predefined time period. Inoperation 2903, thefeature generator 2620 calculates the ratio of hostname related engagement scores to all engagement scores generated by theentity 2512. In some embodiments, thefeature generator 2620 might consider additional normalization methods inoperation 2903 to control for global variance of engagement scores in events collected by CCM tags 110. RIFs Fec, Fuu, and Fes indicate the interest ofentity 2512 inhostname resource 2510. For example,RIFs 2622 may indicate the interest of Org X in the Acme.com website. -
FIG. 30 shows a graph 3000 of RIFs 2622. In this example, the feature generator 2620 calculates the event count feature 2622A, unique user feature 2622B, and engagement score feature 2622C each day for a series of days. The feature generator 2620 (e.g., operated by event processor 244 or some other suitable processor circuitry) may use RIFs 2622 generated over a series of baseline days 3062 as a baseline for comparing with RIFs generated over subsequent target days 3064. For example, feature generator 2620 (e.g., event processor 244) may calculate baseline distributions 3066 for the RIFs 2622 generated over the baseline days 3062. - The feature generator 2620 (e.g., operated by
event processor 244 or some other suitable processor circuitry) may identify threshold regions 3068 and 3070 for each baseline distribution 3066. For example, threshold regions 3068 may be the lowest 10% ofRIFs 2622 in baseline distributions 3066 and threshold regions 3070 may be the highest 10% ofRIFs 2622 in baseline distributions 3066. Other threshold levels could be selected for baseline distributions 3066 in other embodiments. - In this example, the feature generator 2620 (e.g., operated by
event processor 244 or some other suitable processor circuitry) comparesRIFs 2622 for each time period (e.g., each day) duringcurrent target period 3064 with associated baseline distributions 3066. The feature generator 2620 (e.g., operated byevent processor 244 or some other suitable processor circuitry) may generate a notification when aRIF 2622 for any oftarget days 3064 is located within one of threshold regions 3068 or 3070. - A
RIF 2622 within threshold region 3068 or 3070 may indicate a change in the interest of entity 2512 in hostname resource 2510. For example, feature generator 2620 may calculate an event count feature 2622A (Fec) for day 17. Event count feature 2622A may lie within threshold region 3068A of baseline distribution 3066A. This may indicate entity 2512 reduced access to hostname resource 2510 relative to other websites and may have lost interest in hostname resource 2510. - In another example,
feature generator 2620 may calculate unique user feature 2622B (Fuu) for day 19. Unique user feature 2622B may lie within threshold region 3070B of baseline distribution 3066B. This indicates the number of unique users for entity 2512 accessing hostname resource 2510 has increased relative to all other websites. This may indicate an increased interest of entity 2512 in hostname resource 2510. - The feature generator 2620 (e.g., operated by
event processor 244 or some other suitable processor circuitry) may generate a resource interest score SRI by calculating the sum of all three RIFs 2622A, 2622B, and 2622C. For example, event processor 244 may multiply each RIF 2622 by a scaling factor β and then add the three scaled RIFs 2622A, 2622B, and 2622C together. The resource interest score SRI may indicate an overall level of interest of entity 2512 in hostname resource 2510 relative to all other web resources. -
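The β-scaled sum above, together with the baseline-threshold check from FIG. 30, can be sketched as follows. The β defaults and the 10% order-statistic cutoffs are illustrative choices; the specification does not fix either.

```python
# Sketch of the resource interest score S_RI and the baseline comparison
# of FIG. 30. The beta values and 10% cutoffs are illustrative.

def resource_interest_score(fec, fuu, fes, betas=(1.0, 1.0, 1.0)):
    """S_RI: sum of the three RIFs, each scaled by a factor beta."""
    return betas[0] * fec + betas[1] * fuu + betas[2] * fes

def flag_against_baseline(value, baseline, pct=0.10):
    """Flag a target-day RIF in the lowest/highest `pct` of the baseline
    distribution using simple order-statistic cutoffs."""
    values = sorted(baseline)
    n = len(values)
    low = values[max(int(n * pct) - 1, 0)]
    high = values[min(int(n * (1 - pct)), n - 1)]
    if value <= low:
        return "reduced interest"
    if value >= high:
        return "increased interest"
    return None
```

A target-day value below the low cutoff maps to the "reduced interest" case (threshold region 3068), one above the high cutoff to "increased interest" (threshold region 3070), and anything in between is unflagged.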
FIG. 31A shows an example of how the feature generator 2620 (e.g., operated by event processor 244 or some other suitable processor circuitry) calculates a resource cluster interest score (SRCI). The SRCI indicates a level of interest of an entity 2512 in a cluster 3132 of resources 3130 selected by a first party 2511. For example, the first party 2511, such as the org “Acme,” may manage multiple hostname resources 3132 (e.g., websites or webpages) that contribute to its marketing and customer outreach efforts, such as www.acme.com, www.acme.co.uk, iot.acme.com, and the like. A resource cluster 3132 may be associated with the various resources (e.g., websites) the first party 2511 uses to reach its potential customers. -
First party 2511 may provide a resource cluster weighting vector W R, where W R = [resource weight 1, resource weight 2, . . . , resource weight n] (where n is a number). The resource cluster weighting vector W R comprises a set of weights 3134 for applying to web resource interest scores SRI associated with the same hostname resources 2510. Each weight of the set of weights 3134 may be applied to a corresponding resource in the set of resources 3130. For example, Acme may own and manage multiple websites 2510, such as www.acme.com, www.acme.co.uk, iot.acme.com, and the like. Acme may provide a resource cluster weighting vector W R that might assign larger weights 3134 to global hostname resources, such as www.acme.com, in comparison to other hostname resources 2510. Resource cluster weighting vector W R may be assembled manually or the feature generator 2620 (e.g., operated by event processor 244 or some other suitable processor circuitry) may derive weightings 3134 by crawling content on hostname resources 2510. For example, the feature generator 2620 may assign larger resource weightings 3134 to hostname resources 2510 containing more content similar to a defined topic cluster 3126 (see e.g., FIG. 31B). - Feature generator 2620 (e.g., event processor 244) calculates resource cluster interest score SRCI by computing the magnitude of the vector that is the result of the entrywise product of resource interest score vector
SRI and resource cluster weighting vector WR. For example, the resource cluster interest score SRCI may be computed according to equation 11.1: -
SRCI = ∥SRI ∘ WR∥ [Equation 11.1] - The resource cluster interest score SRCI represents an average interest level of
entity 2512 in resource cluster 3132. SRCI is based on content accessed by entity 2512 from hostname resources 2510. -
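The computation in Equation 11.1 (entrywise product followed by vector magnitude) can be sketched as follows; the interest scores and weights below are illustrative values, not taken from the specification.

```python
import math

def resource_cluster_interest(s_ri, w_r):
    """Compute SRCI = ||SRI o WR||: the magnitude of the entrywise
    (Hadamard) product of the resource interest score vector SRI and
    the resource cluster weighting vector WR."""
    hadamard = [s * w for s, w in zip(s_ri, w_r)]
    return math.sqrt(sum(h * h for h in hadamard))

# Illustrative scores for www.acme.com, www.acme.co.uk, and
# iot.acme.com, with a larger weight on the global hostname resource.
s_ri = [0.8, 0.3, 0.5]
w_r = [1.0, 0.5, 0.5]
print(resource_cluster_interest(s_ri, w_r))
```

A larger SRCI indicates stronger interest by entity 2512 concentrated on the more heavily weighted hostname resources in the cluster.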
FIG. 31B shows an example of how the feature generator 2620 (e.g., operated by event processor 244 or some other suitable processor circuitry) calculates a topic cluster interest score STCI. The topic cluster interest score STCI indicates a level of interest entity 2512 has in a cluster of topics selected by a first party 2511. For example, the first party 2511, such as Acme, may sell firewalls and may subscribe to one or more topic clusters 3126. A different topic cluster 3126 may be associated with each of the subjects of interest to Acme, such as virtualization, servers, security, and/or the like. - The feature generator 2620 (e.g., operated by
event processor 244 or some other suitable processor circuitry) generates consumption scores 810 for entity 2512 for each of the topics 3125 in the subscribed topic cluster 3126 as described previously with respect to FIG. 9. The cluster of topic consumption scores 810 is referred to as a topic interest score vector STI. Consumption scores 810 are generated from all content accessed by entity 2512, including content accessed on hostname resources 2510 and content accessed on other third party websites. - The
first party 2511 may provide a topic cluster weighting vector Wt, where Wt = [topic weight 1, topic weight 2, . . . , topic weight n] (where n is a number). The topic cluster weighting vector Wt may be a set of weights 3128 for applying to associated topic consumption scores 810. Each weight of the set of weights 3128 may be applied to a corresponding topic or consumption score 810 in the topic interest score vector STI. For example, Acme may sell firewalls, and therefore, Acme may provide a topic cluster weighting vector Wt that assigns larger weights 3128 to firewall-related topics 3125 compared to other topics 3125. Topic cluster weighting vector Wt may be assembled manually, or event processor 244 may derive weightings 3128 by crawling content on hostname resources 2510. For example, event processor 244 may assign larger topic weightings 3128 to topics 3125 more frequently identified on hostname resources 2510. -
Event processor 244 calculates topic cluster interest score STCI = ∥STI ∘ Wt∥ by computing the magnitude of the vector that is the result of the entrywise product of topic interest score vector STI and topic cluster weighting vector Wt. Topic cluster interest score STCI represents an average interest level of entity 2512 in topic cluster 3126. STCI is based on all content accessed by entity 2512, including content from hostname resource 2510 and any other third party websites. -
FIG. 32 shows an example of how an event processor 244 combines resource cluster interest score SRCI with topic cluster interest score STCI to generate a first party weighted intent score. In some examples, the weighted intent score is denoted SBI. As explained previously, first party 2511 refers to the org associated with the hostname resources 2510 in resource cluster 3132. For example, the company Acme may be a first party 2511 that operates the following hostname resources 2510: Acme.com, Acme.co.uk, and IoT.Acme.com. The SBI may help determine if entity 2512 is interested in the products or services sold or otherwise provided by first party 2511 (i.e., Acme in this example). - In the example of
FIG. 32, the event processor 244 operates an entity predictor 2504 and hostname extractor 2506 that convert raw events 108 into hostname events 2508 in the same or similar manner as discussed previously with respect to FIG. 25. Hostname events 2508 identify a hostname 2510 for a URL in a raw event 108 and identify an entity 2512 for a network address (e.g., IP address) in the raw event 108. The event processor 244 operates a RIF generator 2620 to generate RIFs 2622 from hostname events 2508 that indicate an interest level of entity 2512 in hostname resources 2510. Event processor 244 also operates an interest score generator (ISG) 3272 that calculates resource interest scores SRI by adding together, or otherwise combining, RIFs 2622 (e.g., Fee, Fuu, and Fes) for the same time periods for the same hostname resources 2510. -
First party 2511 may define a resource cluster 3132 associated with a group of resources (e.g., websites) 2510 owned and/or managed by the first party 2511. In some embodiments, the first party 2511 provides resource cluster weighting vector WR. Additionally or alternatively, the event processor 244 automatically generates resource cluster weighting vector WR by crawling hostname resources 2510 and finding content similar to a predefined topic cluster 3126. The event processor 244 also operates resource cluster interest score generator (RCISG) 3273 to calculate resource cluster interest score SRCI by computing the magnitude of the vector that is the result of the entrywise product of web resource interest score vector SRI and resource cluster weighting vector WR. In one example, resource cluster interest score SRCI = ∥SRI ∘ WR∥. - As also explained previously,
first party 2511 associated with hostname resources 2510 may subscribe to a topic cluster 3126 associated with a particular subject (e.g., firewalls in this example). The event processor 244 operates the CSG 800 to generate a set of consumption scores 810 for the topic cluster 3126, which may be referred to as topic interest score vector STI. First party 2511 may also provide topic cluster weighting vector Wt, or event processor 244 may automatically generate Wt by crawling hostname resources 2510. The event processor 244 also operates a topic cluster interest score generator (TCISG) 3274 to calculate topic cluster interest score STCI by computing the magnitude of the vector that is the result of the entrywise product of topic interest score vector STI and topic cluster weighting vector Wt. In one example, topic cluster interest score STCI = ∥STI ∘ Wt∥. - The
event processor 244 also operates a weighted intent score generator (WISG) 3276 to generate weighted intent score (SBI) based on a combination of resource cluster interest score SRCI and topic cluster interest score STCI. In one example, the weighted intent score may be calculated according to equation 11.4: -
SBI = STCI/αTCI + SRCI/αRCI [Equation 11.4] - In Equation 11.4, STCI is the topic cluster interest score, SRCI is the resource cluster interest score, αTCI is a topic cluster interest threshold, and αRCI is a resource cluster interest threshold. During a surge in weighted intent score SBI, one or both of scores SRCI and STCI may exceed the associated thresholds αRCI and αTCI, respectively. For example,
event processor 244 may identify a surge for entity 2512 when topic cluster interest score STCI exceeds topic cluster interest threshold αTCI or resource cluster interest score SRCI exceeds resource cluster interest threshold αRCI. Thresholds αTCI and αRCI may be derived based on baseline distributions as described previously with respect to FIG. 30 or may be based on any other a priori data. -
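The surge test described above can be sketched as follows, under the assumption that the weighted intent score sums the threshold-normalized cluster scores, so that SBI exceeds 1 whenever either score exceeds its own threshold; the exact form of the combination and all numeric values here are illustrative assumptions.

```python
def weighted_intent_score(s_tci, s_rci, alpha_tci, alpha_rci):
    """Hypothetical combination of topic cluster interest score STCI
    and resource cluster interest score SRCI: each score is normalized
    by its threshold, so the sum exceeds 1 whenever either score
    exceeds its own threshold."""
    return s_tci / alpha_tci + s_rci / alpha_rci

def is_surge(s_bi, threshold=1.0):
    # A weighted intent score above the threshold indicates a surge.
    return s_bi > threshold

# Illustrative values: the topic score exceeds its threshold, so a
# surge is flagged even though the resource score does not.
print(is_surge(weighted_intent_score(0.9, 0.4, alpha_tci=0.7, alpha_rci=0.8)))
```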
FIG. 33 shows a graph 3300 for weighted intent score SBI. The Y axis represents topic cluster interest score STCI and the X axis represents resource cluster interest score SRCI. Graph 3300 shows how weighted intent score SBI ties the interest of entity 2512 in a topic cluster 3126 (see e.g., FIG. 31B) with the interest of entity 2512 in hostname resource cluster 3132 (see e.g., FIG. 31A). - Any value of SBI exceeding a
threshold 3380 may indicate a surge by entity 2512. For example, a value of weighted intent score SBI within region 3382 may indicate a surge in the interest of entity 2512 in topic cluster 3126 and/or hostname resource cluster 3132. In one example, a weighted intent score SBI greater than a threshold value of 1 may indicate a surge by entity 2512. Of course, threshold 3380 depends on the weightings and normalizations applied to the weighted intent score parameters. - In some embodiments, the
CCM 100 may send a notification to the first party 2511 associated with hostname resources 2510 identifying the surge by entity 2512. First party 2511 (e.g., Acme in the previous examples) may then send information to, or call, employees of entity 2512 (e.g., Org X in the previous examples). For example, the first party 2511 (e.g., Acme) may (or may direct suitable personnel to) call or send email advertisements, literature, direct mailings, or banner ads for related products to employees of entity 2512 (e.g., Org X). - In some implementations, the weighted intent score SBI, topic cluster interest score STCI, and/or resource cluster interest score SRCI can be used to measure account-based advertising performance. For example, the
event processor 244 may compare weighted intent score SBI with advertising content sent to specific companies or employees of companies. Event processor 244 may measure the increase in visits to hostname resource 2510, such as Acme.com, tying the targeted companies to the companies visiting Acme.com. An increase in weighted intent score SBI for companies that have received advertising suggests that a particular ad campaign may be outperforming another and therefore should receive increased investment. - The embodiments discussed herein allow the
CCM 100 to generate more accurate intent data than existing/conventional solutions by classifying resources and enhancing consumption scores and surge signals based on improved resource classifications. The CCM 100 uses processing resources more efficiently by generating consumption scores based on the improved classifications. The CCM 100 may also provide more secure network analytics by generating consumption scores without using PII, sensitive data, and/or confidential data, thereby improving information security for end-users. - The more accurate intent data, consumption scores, and/or surge signals allow
service providers 118 to conserve computational and network resources by providing a means for better targeting users so that unwanted and seemingly random content is not distributed to users who do not want such content. This is a technological improvement in that it conserves network and computational resources of service providers' 118 computing systems/platforms used to distribute this content by reducing the amount of content generated and sent to end-user devices. Network resources may be reduced and/or conserved at end-user devices by reducing or eliminating the need to use resources to receive unwanted content, and computational resources may be reduced and/or conserved at end-user devices by reducing or eliminating the need to implement spam filters and/or by reducing the amount of data to be processed when analyzing and/or deleting such content. - In various embodiments, the
resource classifier 2240 may generate vectors that represent the different features of resources (e.g., webpages, websites, and/or other InObs), and uses a suitable machine learning (ML) model to classify the different resources based on the feature vectors (the feature vectors may be referred to herein as “resource embeddings,” “webpage embeddings,” or the like). The feature vectors provide more accurate resource classifications than existing classification techniques while using fewer computing resources for classification tasks. -
FIG. 34 shows an example structure for a network 3400 that includes multiple resources 3401, including resource 3401-0, resource 3401-1, resource 3401-2, resource 3401-3, and resource 3401-4 (alternatively referred to as W0, W1, W2, W3, and W4, respectively). In one example, individual resources 3401 are associated with different types of orgs, host/serve different content, and/or have other aspects and/or properties. Individual resources 3401 may be classified (or assigned to one or more classes) based on one or more aspects and/or properties of the individual resources 3401. - One or
more resources 3401 may include a collection of resources 3402, alternatively referred to as nodes. Each resource 3401 may include a root node 3402A and a set of other lower-tiered nodes 3402B. Each resource 3402 has a specific identifier or address, alternatively referred to as a link 3404. One or more resources 3401 may reference or link to other resources 3402 belonging to a same resource 3401 and/or other resources 3402 belonging to another resource 3401. In one example, each resource 3401 may be a website and each resource 3402 may be a webpage that is part of a website. In this example, webpages 3402 may include URLs 3404A that link to other webpages 3402 within the same website 3401 and/or may include URLs 3404B that link to webpages 3402 on other resources 3401. - In the example of
FIG. 34, W0 and W1 are vendor websites, W2 is a marketer website, W3 is a news website, and W4 is any other class of website. As explained previously, vendor websites W0 and W1 may contain content primarily directed toward selling or promoting products or services and may include websites operated by manufacturers, retailers, or any other intermediary. Marketer websites W2 may be operated by organizations that provide content directed to marketing or promoting different products, such as an online trade magazine. News websites W3 may be operated by news services or blogs that contain news articles and commentary on a wide variety of different subjects. Website W4 may be any other class of website. For example, website W4 may be a website operated by an individual or operated by an entity not primarily focused on selling products or services. - Still referring to
FIG. 34, across resources 3401, the relationships (e.g., links 3404A) between webpages 3402 on the same resources 3401 and relationships (e.g., links 3404B) between webpages 3402 on other resources 3401 are referred to generally as structural semantics. In one example, the resource classifier 1640 uses links 3404 to capture the structural semantics across all resources 3401. - As explained previously, vendor websites W0 and W1 may have different structural semantics than marketer website W2 or news website W3. For example, vendor website W0 may have a different tree structure of
links 3404A from root node 3402A to lower nodes 3402B compared with marketer website W2 or news website W3. Vendor websites W0 and W1 also may have more links from root node 3402A to lower-level resources 3402B. Vendor website W0 also may have relatively fewer links 3404B to other resources 3401, compared with marketer website W2 or news website W3. In this example, there are no external links 3404B connecting the two vendor websites W0 and W1 together. However, marketer website W2 and news website W3 may discuss products or services sold on vendor websites W0 and W1, and therefore, may include more external links 3404B to these resources 3401. Thus, marketer website W2 and news website W3 may have the unique quality of including more links 3404B to webpages 3402 on vendor websites W0 and W1. - In some implementations, the
resource classifier 1640 uses these relationships to capture the structural semantics across all InObs 112 of a set of InObs 112. In one example, an analyzer (e.g., resource analyzer 3612 of FIG. 36) systematically browses individual InObs 112 to identify what is conceptually equivalent to a language for a particular network 3400. The analyzer may start from a particular node 3402 in an InOb (website) 3401 and identify paths to other nodes. For example, the analyzer may identify the following path [2, 1, 3, 5, 8] formed by links 3404 in resources 3402 referencing other resources 3402. - In
FIG. 34, node 2 of website W1 is linked through a hyperlink 3404A to node 1 in website W1, node 1 in website W1 is linked through another hyperlink 3404A to node 3 in website W1, node 3 in website W1 is linked through another hyperlink 3404B to node 5 in website W2, and node 5 in website W2 is linked through another hyperlink 3404A to node 8 in website W2, and/or the like. - The generated path [2, 1, 3, 5, 8] is conceptually equivalent or similar to a sentence of words, effectively representing an instance of a natural language structure for
network 3400 or set of InObs 112. Suitable word embedding techniques in NLP, such as Word2Vec (see e.g., Mikolov et al., Efficient Estimation of Word Representations in Vector Space, arXiv preprint arXiv:1301.3781 (16 Jan. 2013) (“[Mikolov]”), which is hereby incorporated by reference in its entirety), are used to convert individual words found across numerous examples of sentences within a corpus of documents into low-dimensional vectors, capturing the semantic structure of their proximity to other words, as exists in human language. Similarly, website/network (graph) embedding techniques, such as Large-scale Information Network Embedding (LINE), graph neural network (GNN) techniques such as DeepWalk (see e.g., Perozzi et al., DeepWalk: Online Learning of Social Representations, arXiv:1403.6652v2 (27 Jun. 2014), https://arxiv.org/pdf/1403.6652.pdf (“[Perozzi]”), which is hereby incorporated by reference in its entirety), GraphSAGE (see e.g., Hamilton et al., Inductive Representation Learning on Large Graphs, arXiv:1706.02216v4 (10 Sep. 2018) (“[Hamilton]”), which is hereby incorporated by reference in its entirety), and/or the like, can be used to convert sequences of InObs 112 found across a collection of InObs 112 (e.g., a collection of referenced websites) into low-dimensional vectors, capturing the semantic structure of their relationship to other pages. - The
resource classifier 1640 uses suitable NLP/NLU technique(s) to convert the different paths, such as path [2, 1, 3, 5, 8] for node 2, into structural semantic vector(s) 3406B (also referred to as “embeddings”). The resource classifier may generate structural semantic vectors for each InOb 112 and feed these vectors into a suitable ML model to classify the InObs 112. In one example, the resource classifier 1640 may generate structural semantic vectors 3406B for each resource 3402 in the same resource 3401. The resource classifier 1640 then combines the structural semantic vectors 3406B for the same resource 3401 together via a summation to generate a resource structural semantic vector 3406A. In this example, the resource classifier 1640 feeds resource vectors 3406A into a logistic regression model (and/or some other suitable ML model) that then classifies the resource 3401 as a particular type of resource (e.g., as a vendor, marketer, or news provider in this example). -
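The path-sampling step above can be illustrated with a minimal random-walk sampler over a hypothetical link graph; each resulting walk plays the role of a "sentence" that a DeepWalk/Word2Vec-style trainer would then embed. The node IDs and graph below are assumptions for illustration only.

```python
import random

# Hypothetical link graph: each node ID maps to the node IDs it links
# to (intra- and inter-resource links 3404).
LINKS = {2: [1], 1: [3], 3: [5], 5: [8], 8: []}

def random_walk(start, length, graph, rng):
    """Follow links at random from `start`, producing a path of node
    IDs that is treated like a sentence of words for embedding."""
    path = [start]
    while len(path) < length:
        neighbors = graph.get(path[-1], [])
        if not neighbors:
            break  # dead end: page with no outgoing links
        path.append(rng.choice(neighbors))
    return path

walk = random_walk(2, 5, LINKS, random.Random(0))
print(walk)  # → [2, 1, 3, 5, 8], the example path from FIG. 34
```

Many such walks, sampled from many starting nodes, would form the "corpus" from which the structural semantic vectors 3406B are learned.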
FIG. 35 shows one particular resource 3401 in more detail. As mentioned above, the resource classifier 1640 may classify resource 3401 based on structural semantic features. The resource classifier 1640 may also generate and use additional features of webpages 3402 to classify resource 3401. Features generated by the resource classifier 1640 may include, but are not limited to, the features described in Table F1. -
TABLE F1

Feature | Feature Name | Description
F1 | Structural Semantics | Structural semantics F1 may be generated based on the structural relationships between information objects, such as webpages 3402, provided by references/links such as hyperlinks 3404.
F2 | Content Semantics | Content semantics F2 may capture the language and metadata semantics of content contained within information objects such as webpages 3402.
F3 | Topic Semantics | Topic features include identified topics contained in information objects such as webpages 3402. Semantic features may include semantic relationships between two or more words or topics.
F4 | Content Interaction Behavior | Content interaction behavior is alternatively referred to as content consumption or content use.
F5 | Entity Type | The entity type feature identifies types or locations of industries, companies, organizations, bot-based applications, or users accessing the webpage.
F6 | Lexical Semantics | Lexical semantics refers to the grammatical structure of information objects 112, and the relationships between individual words in a particular context.

- Content semantics (feature F2) capture the language and metadata semantics of content contained within webpages 3402. For example, a trained NLP/NLU ML model may predict topics associated with the InObs, such as sports, religion, politics, fashion, or travel. Of course, any other topic taxonomy may be considered to predict topics from webpage content. In addition, the
resource classifier 1640 can also identify content metadata, such as the breadth of content, number of pages of content, number of words in webpage content, number of topics in webpage content, number of changes in webpage content, and/or the like. Content semantics F2 also may include any other HTML elements that may be associated with different types of resources, such as iframes, document object models (DOMs), and/or the like. - Similar to structural semantic features (e.g., feature F1), vendor, marketing, and
news resources 3401 may have different content semantics (feature F2). For example, a news website W3 may include content with more topics compared with a vendor website W0 that may be limited to a small set of topics related to its products or services. Content on news website W3 also may change more frequently compared to vendor website W0. For example, content on news website W3 may change daily, and content on vendor website W0 related to products or services may change weekly or monthly. - Topic semantics (feature F3) may involve identifying topics and generating associated topic vectors as described above in
FIG. 2. For example, CCM 100 may identify different business-related topics (e.g., B2B topics) in each webpage 3402, such as, for example, network security, servers, virtual private networks, and/or any other topic(s). - Content interaction behavior (feature F4) identifies patterns of user interaction/consumption on webpages 3402. For example, news site W3 in
FIG. 34 may receive more continuous user interaction/consumption throughout the day and over the entire week and weekend. Marketer website W2 (e.g., trade publications) and vendor sites W0 and W1 may have more volatile user consumption, mostly restricted to work hours during the work week. Types of user consumption reflected in feature F4 may include, but are not limited to, time of day, day of week, total amount of content consumed/viewed by the user, device type, percentages of different device types used for accessing InObs 112, duration of time users spend on an InOb 112 and total engagement a user has on the InOb 112, the number of distinct user profiles accessing the InOb 112 vs. the total number of events for the InOb 112, dwell time, scroll depth, scroll velocity, variance in content consumption over time, tab selections that switch to different InObs 112, page movements, mouse page scrolls, mouse clicks, mouse movements, scroll bar page scrolls, keyboard page movements, touch screen page scrolls, eye tracking data (e.g., gaze locations, gaze times, gaze regions of interest, eye movement frequency, speed, orientations, and/or the like), touch data (e.g., touch gestures, and/or the like), and/or the like. Identifying different event types associated with these different user content interaction behaviors (consumption) and associated engagement scores is described in more detail herein. For example, the resource classifier 1640 may generate the content interaction feature F4 based on the event types and engagement metrics identified in events 108 associated with each webpage 3402. - In one example for Feature F5, the entity type feature identifies types or locations of industries, companies, organizations, bot-based applications, or users accessing a
particular InOb 112. For example, the CCM 100 may identify each user event 108 as associated with a particular enterprise, institution, mobile network operator, bots/crawlers and/or other applications, and the like. Details on how to identify types of orgs and/or locations from which InObs 112 are accessed are described in U.S. application Ser. No. 17/153,673, filed Jan. 20, 2021, which is hereby incorporated by reference in its entirety. - Lexical semantics (feature F6) may be derived from an initial NLP/NLU analysis of the
InObs 112 to identify lexical aspects of the InObs 112. As examples, these lexical aspects may include hyponyms (specific lexical items of a generic lexical item, or hypernym), meronyms (a logical arrangement of text and words that denotes a constituent part of or member of something), polysemy (a relationship between the meanings of words or phrases that, although slightly different, share a common core), synonyms (words that have the same sense or nearly the same meaning as another), antonyms (words that have close to opposite meanings), homonyms (two words that sound the same and are spelled alike but have a different meaning), and/or the like. - Structural semantics (feature F1), content semantics (feature F2), topic semantics (feature F3), and/or lexical semantics (feature F6) may be collectively referred to as “information object semantic features”, “website semantic features”, or “resource semantic features.” Content interaction behavior (feature F4), entity type (feature F5), and any other user interactions with webpages may be collectively referred to as “behavioral features.”
- In one example, the
resource classifier 1640 generates one or more feature vectors F1-F5 for each resource 3402. The resource classifier 1640 then combines all of the same resource feature vectors to generate an overall resource feature vector 3406. For example, the resource classifier 1640 may add together the structural semantics feature vectors F1 generated for each of the individual resources 3402 in a resource 3401. The resource classifier 1640 then divides the sum by the number of resources 3402 to generate an average structural semantics feature vector F1 for resource 3401. - The
resource classifier 1640 performs the same or similar averaging for each of the other features F2-F5 to form a combined feature vector 3406. The resource classifier 1640 feeds combined feature vector 3406 into an ML model that classifies resource 3401 as either a vendor, marketer, or news site. Again, this is just one example, and any combination of features F1-F5, or any other features, can be used to classify resource 3401. -
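The per-feature averaging described above can be sketched as follows; the page-level vectors are illustrative values, and the downstream ML classifier is omitted.

```python
def average_vectors(vectors):
    """Entrywise average of equal-length page-level feature vectors,
    producing one resource-level feature vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Hypothetical structural semantics vectors (feature F1) for three
# webpages 3402 of one resource 3401.
f1_pages = [
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]
print(average_vectors(f1_pages))
```

The same averaging would be repeated for each of features F2-F5, with the per-feature averages contributing to the combined feature vector 3406.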
FIG. 36 shows an example of how the resource classifier 1640 generates feature vectors 3608. In this example, the feature vectors 3608 are vectors generated for features F1-F5. As explained previously, CCM 100 obtains InOb 3610 from a plurality of resources 3401 (e.g., millions or billions of resources 3401 in some implementations). InOb 3610 may include the markup (e.g., HTML, XML, and/or the like), script, program code, and/or other content from each webpage 3402. Additionally or alternatively, the InOb 3610 may include any text, video, audio, or any other data included with the markup, script, program code, and/or other content. - One or multiple resource analyzers (RAs) 3612 may start at random webpages 3402 within different resources and proceed/walk different paths through other webpages 3402. The
RAs 3612 may be applications/engines that run/execute automated tasks (e.g., scripts or the like). The RAs 3612 may sometimes be referred to as “crawlers,” “bots,” and/or the like. The RAs 3612 identify the different paths through the different resources as explained previously with respect to FIG. 34 and/or using a suitable graph search/analysis algorithm. The paths are used for generating the structural semantics of each webpage 3402. InOb 3610 for each webpage 3402 is parsed to identify the different content semantics. Independent of the features generated from web crawling, content consumption events associated with each webpage are also processed to identify the behavioral features of each webpage 3402. -
Vectors 3608 are then generated for each of the identified features F1-F5. In this example, vector 3608_1 represents the structural semantics feature F1 for webpage 3402_1, vector 3608_2 represents the content semantics feature F2 for webpage 3402_1, vector 3608_3 represents the topic feature F3 for webpage 3402_1, vector 3608_4 represents the content interaction feature F4 for webpage 3402_1, and vector 3608_5 represents the entity type feature F5 for webpage 3402_1. -
TABLE F2

Vector | Feature | Value
Vector 3608_1 | structural semantics feature F1 | [0, 1, 1, 0]
Vector 3608_2 | content semantics feature F2 | [1, 1, 1, 0]
Vector 3608_3 | topic feature F3 | [0, 0, 0, 0]
Vector 3608_4 | content interaction feature F4 | [1, 1, 0, 1]
Vector 3608_5 | entity type feature F5 | [0, 0, 1, 0]

- For example,
resource analyzer 3612 fetches HTML for a webpage 3402_1. RA 3612 finds a link 3404_1 to a next lower webpage 3402_2. RA 3612 then parses the HTML for webpage 3402_2 for any other links. In this example, RA 3612 identifies a link 3404_4 to a next lower level webpage 3402_5. RA 3612 then parses HTML for webpage 3402_5 for any other links. In this example, there are no additional links in webpage 3402_5. -
RA 3612 then parses the HTML in webpage 3402_1 for any additional links. In this example, RA 3612 identifies a next link 3404_2 to another lower level webpage 3402_3. RA 3612 parses the HTML in webpage 3402_3 and determines there are no additional links. -
RA 3612 further parses the HTML in webpage 3402_1 and identifies a third link 3404_3 to webpage 3402_4. RA 3612 parses the HTML in webpage 3402_4 and identifies an external link 3404_5 to a webpage located on a different resource. RA 3612 then parses the HTML on the webpage located on the other resource for other links as described above. -
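The link-following procedure described above can be sketched as a bounded breadth-first traversal; the in-memory link graph, page identifiers, and hop limit are hypothetical stand-ins for the HTML fetching and parsing that RA 3612 actually performs.

```python
from collections import deque

def crawl(start, graph, max_hops):
    """Breadth-first traversal with a visited set (already-seen pages
    are not revisited) and a hop limit (a path is abandoned once it
    reaches the threshold number of hops from the start)."""
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        page, hops = queue.popleft()
        if hops >= max_hops:
            continue  # threshold number of hops reached on this path
        for linked in graph.get(page, []):
            if linked not in visited:
                visited.add(linked)
                queue.append((linked, hops + 1))
    return visited

# Hypothetical link structure: page 1 links to pages 2, 3, and 4;
# page 2 links to page 5; page 4 links to an external page 9.
graph = {1: [2, 3, 4], 2: [5], 4: [9]}
print(sorted(crawl(1, graph, max_hops=2)))  # → [1, 2, 3, 4, 5, 9]
```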
RA 3612 continues crawling webpages until detecting a convergence of the same webpages on the same resources. Otherwise, RA 3612 may stop crawling through a web path if no new webpages or resources are detected after some threshold number of hops. RA 3612 then may crawl through the next link in webpage 3402_1. When all links in webpage 3402_1 are crawled, RA 3612 may start crawling the remaining links in the next webpage 3402_2. - As explained above, the different paths identified by
RA 3612 through webpage 3402_1, such as path [2, 1, 3, 5, 8] described above with respect to FIG. 34, are converted by an unsupervised learning model, such as DeepWalk [Perozzi], LINE (see e.g., Tang et al., LINE: Large-scale Information Network Embedding, WWW (2015), which is hereby incorporated by reference in its entirety), or GraphSAGE [Hamilton], into structural semantic vector 3608_1. -
resource 3401, the number of links to other webpages withinresource 3401, the number of links to other webpages outside ofresource 3401, and/or the like. Structural semantic vector 3608_1 may capture first order proximity identifying direct relationships of webpage 3402_1 with other webpages. Vector 3608_1 also may capture second order proximity identifying indirect relationships of resource 3402_1 withother resources 3401, 3402 throughintermediate resources 3401, 3402. - A natural language processor analyzes
InOb 3610 to generate a vector 3608_2 for content semantic feature F2. The natural language machine learning algorithm may identify subjects, the number of words, the number of topics, and/or the like in the text of resource 3402_1. The natural language processor converts the identified topics, sentence structure, word count, and/or the like into content semantic vector 3608_2. A content semantic vector 3608_2 is generated for each webpage 3402 in resource 3401. - Content semantic vectors 3608_2 for different resources 3402 can be compared to identify resource similarities and differences, which may provide further insight into resource classification. For example, a cosine similarity operation may be performed for different content semantic vectors 3608_2 to determine the similarity of topics for webpages 3402 on the
same resources 3401 or to determine the similarities between topics ondifferent resources 3401. - One example machine learning algorithm for converting text from a webpage into content semantic vector 3608_2 is Word2Vec described in [Mikolov], which is herein incorporated by reference in its entirety. Converting text into a multidimensional vector space is known to those skilled in the art and is therefore not described in further detail.
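The cosine similarity comparison described above can be sketched in a few lines; the vector values below are hypothetical stand-ins for content semantic vectors 3608_2, not values from the disclosure:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two content semantic vectors.

    Returns a value near 1.0 when the vectors point in similar
    directions (similar topics) and near 0.0 when they are unrelated.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical content semantic vectors for two webpages
page_a = [0.9, 0.1, 0.4]
page_b = [0.8, 0.2, 0.5]
sim = cosine_similarity(page_a, page_b)  # close to 1.0 -> similar topics
```

The same comparison applies whether the two webpages belong to the same resource 3401 or to different resources 3401.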
- The
resource classifier 1640 may generate a vector 3608_3 for topic feature F3. As described above, content analyzer 242 in FIG. 2 generates vectors of topic 236 (or "topic vectors 236") for different InObs (e.g., webpages). The resource classifier 1640 may use a same or similar content analyzer as content analyzer 242 to generate B2B topic vector 3608_3 for webpage 3402_1. Each value in B2B topic vector 3608_3 may indicate the probability or relevancy score of an associated business-related topic within InOb 3610. In one example, content semantic vector 3608_2 may represent a more general language structure in InOb 3610 and B2B topic vector 3608_3 may represent a more specific set of business-related topics in InOb 3610. - In some embodiments, the
resource classifier 1640 generates a vector 3608_4 for content interaction feature F4. Vector 3608_4 identifies different user interactions with webpage 3402_1. The resource classifier 1640 may generate vector 3608_4 by analyzing the events 108 associated with webpage 3402_1. For example, each event 108 described above may include an event type 456 and an engagement metric 610 identifying scroll activity, time duration on the webpage, time of day, day of week the webpage was accessed, variance in consumption, and/or the like. Each value in vector 3608_4 may represent a percentage or average value for an associated one of the event types 456 for a specified time period. - For example, the resource classifier 1640 may identify all of the events 108 for a specified time period associated with webpage 3402_1. The resource classifier 1640 may generate content interaction vector 3608_4 by identifying all of the same event types in the set of events 108. The resource classifier 1640 then may identify the percentage of events 108 associated with each of the different event types. The resource classifier 1640 uses each identified percentage as a different value in content interaction vector 3608_4. - For example, a first value in content interaction vector 3608_4 may indicate the percentage of events generated for webpage 3402_1 during normal work hours and a second value in content interaction vector 3608_4 may indicate the percentage or ratio of events generated for webpage 3402_1 during non-work hours. Other values in content interaction vector 3608_4 may identify any other user engagement or change of user engagement with webpage 3402_1.
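The percentage-based construction of content interaction vector 3608_4 can be sketched as follows; the event records and type names (e.g., "work_hours") are hypothetical illustrations, not the actual event types 456:

```python
from collections import Counter

def interaction_vector(events, event_types):
    """Fraction of a webpage's events falling into each event type.

    events: list of dicts, each with a "type" key.
    event_types: ordered list of type names defining the vector layout.
    """
    counts = Counter(e["type"] for e in events)
    total = len(events) or 1  # avoid division by zero for empty event sets
    return [counts.get(t, 0) / total for t in event_types]

# Hypothetical events captured for one webpage over a specified period
events = [
    {"type": "work_hours"}, {"type": "work_hours"},
    {"type": "work_hours"}, {"type": "off_hours"},
]
vec = interaction_vector(events, ["work_hours", "off_hours"])  # [0.75, 0.25]
```

Each position in the returned list corresponds to one event type, matching the description of a first value for work-hours events and a second value for non-work-hours events.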
- The
resource classifier 1640 generates a vector 3608_5 for entity type feature F5. Vector 3608_5 identifies different types of users interacting with webpage 3402_1. The resource classifier 1640 may generate vector 3608_5 by analyzing all of the events 108 associated with webpage 3402_1. For example, each event 108 may include an associated IP address. As mentioned above, CCM 100 may identify the IP address as being associated with an enterprise, small-medium business (SMB), educational entity, mobile network operator, hotel, and/or the like. - The resource classifier 1640 identifies the events 108 associated with webpage 3402_1 for a specified time period. The resource classifier 1640 then identifies the percentage of the events associated with each of the different entity types. For example, the resource classifier 1640 may generate an entity type vector 3608_5=[0.23, 0.20, 0.30, 0.17, 0.10], where the values respectively represent [% enterprise, % small-medium business, % education, % mobile network operators, % hotels]. - As mentioned above with reference to FIG. 35, the resource classifier 1640 calculates the average of feature vectors 3608_1, 3608_2, 3608_3, 3608_4, and 3608_5 generated for all of the webpages 3402 associated with the same resource 3401 to generate an overall resource feature vector 3406. Each of the different features F1-F5 provides additional information for more accurate site classifications. -
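The averaging step can be sketched as an element-wise mean over the per-page feature vectors; the three per-page vectors below are hypothetical:

```python
def resource_feature_vector(page_vectors):
    """Element-wise mean of per-page feature vectors.

    Averaging the vectors generated for every webpage of a resource
    yields a single overall feature vector for that resource.
    """
    n = len(page_vectors)
    return [sum(col) / n for col in zip(*page_vectors)]

# Hypothetical entity-type vectors for three webpages of one resource
pages = [[0.2, 0.8], [0.4, 0.6], [0.6, 0.4]]
site_vec = resource_feature_vector(pages)  # approximately [0.4, 0.6]
```

In practice the same averaging would be applied separately to each of features F1-F5 before classification.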
FIG. 37 depicts an example of how the resource classifier 1640 classifies an InOb based on structural semantic features F1. However, it should be understood that the resource classifier 1640 may classify InObs based on any combination of features F1-F6 described previously and/or any other features. - The
resource classifier 1640 may receive a set of training data 3720 that includes the URLs 3722 and associated structural semantic (SS) vectors 3724 for a set of known webpages. The resource classifier 1640 (or RA 3612) may analyze (e.g., crawl through) a set of resources/nodes (URLs 3722) on resources 3721 with known classifications 3726. For example, a known news website 3721A may include three webpages with URL1, 2, and 3. The resource classifier 1640 may crawl each URL and generate an associated SS vector 3724. URL1, 2, and 3 are each assigned the known news classification 3726A. The resource classifier 1640 also generates SS vectors 3724 for URL4 associated with another known news website 3721B, URL5 associated with a known vendor website 3721C, and URL6 associated with a known marketer website 3721D. Of course, SS vectors 3724 may be generated for each webpage 3722 on each of websites 3721. The operator assigns each SS vector 3724 its known site classification 3726. - The
resource classifier 1640 feeds training data 3720 that includes SS vectors 3724 and the associated known site classifications 3726 into an ML model 3728. For example, ML model 3728 may be a logistic regression (LR) model or Random Forest model. Other types of supervised ML models can also be used in other embodiments. ML model 3728 uses training data 3720 during a training stage 3729 to identify the characteristics of SS vectors 3724 associated with each site classification 3726. After model 3728 has completed training stage 3729, it then operates as a site classifier in website classification stage 3730. - Structural semantic vectors 3608_1 are generated for
different resources 3401 with unknown classification as described above. SS vectors 3608_1 are fed into model 3728. Model 3728 generates resource prediction values 3732 for each resource 3401 and/or for individual InObs and/or content items making up a resource 3401. For example, ML model 3728 may predict the website associated with URL6 as having a 0.3 likelihood of being a news website, a 0.1 likelihood of being a vendor website, and a 0.5 likelihood of being a marketer website. -
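The training stage 3729 and classification stage 3730 can be sketched with a hand-rolled gradient-descent logistic regression standing in for the LR model 3728; the two-dimensional SS vectors, labels, and hyperparameters are hypothetical simplifications of the real training data 3720:

```python
import math

def train_logistic(X, y, lr=0.5, epochs=500):
    """Tiny stochastic-gradient-descent binary logistic regression.

    X: list of feature vectors (e.g., SS vectors), y: labels (0 or 1).
    Returns learned weights and bias.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid
            g = p - yi                        # gradient of log-loss w.r.t. z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict_proba(w, b, x):
    """Probability that vector x belongs to class 1."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical training data: SS vectors labeled 1 = news, 0 = vendor
X = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
y = [1, 1, 0, 0]
w, b = train_logistic(X, y)
p_news = predict_proba(w, b, [0.85, 0.15])  # high probability of "news"
```

A library model with a comparable interface (training on labeled vectors, then returning per-class probabilities) would serve the same role for multi-class predictions such as news/vendor/marketer.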
FIG. 38 depicts an example of the resource classifier 1640 using multiple feature vectors 3608 to classify resource(s) 3401. In this example, website 3401 is associated with a resource identifier (e.g., URL6). The resource classifier 1640 generates vector 3608_1 from the structural semantic features F1 of the content/InObs of the resource 3401 (e.g., webpages of a website), and generates vector 3608_2 from the content semantic features F2 of the content/InObs of the resource 3401. The resource classifier 1640 generates vector 3608_3 from the topic features F3 identified in the content/InObs of the resource 3401. The resource classifier 1640 analyzes the events associated with each content/InObs of the resource 3401 and generates vector 3608_4 from the user interaction features F4. The resource classifier 1640 generates vector 3608_5 from the entity type features F5 associated with the content/InObs of the resource 3401. -
ML model 3728 is trained as explained previously with any combination of vectors 3608_1, 3608_2, 3608_3, 3608_4, and/or 3608_5 generated from resources 3401 with known classifications. Vectors 3608 are generated from the resource 3401 with an unknown classification and fed into the trained ML classifier model 3728. Model 3728 generates site predictions 3732 for the resource 3401. In this example, model 3728 may more accurately predict the resource 3401 as being a marketer website due to the additional features F2, F3, F4, and F5 used for classifying the resource 3401. - As mentioned, the
classifications 3732 can be used as another event dimension for determining user or org intent and surge scores. For example, a large surge score from a vendor website may have more significance for identifying a company surge than a similar surge score on a news or marketing website. Resource classifications 3732 can also be used for filtering different types of data. For example, CCM 100 can capture and determine surge scores from events 108 generated for one particular website class. -
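When multiple feature vectors 3608 are combined for classification as in FIG. 38, one simple way to present them to a single model is concatenation into one input vector; this mechanism is assumed here for illustration only, since the disclosure says the model may be trained with "any combination" of the vectors without specifying how they are joined:

```python
def combined_feature_vector(*feature_vectors):
    """Concatenate per-feature vectors (e.g., F1-F5) into one model input."""
    combined = []
    for v in feature_vectors:
        combined.extend(v)
    return combined

# Hypothetical per-feature vectors for one resource (lengths illustrative)
f1 = [0.1, 0.2]          # structural semantic features
f2 = [0.3]               # content semantic features
f5 = [0.23, 0.20, 0.30]  # entity type percentages
x = combined_feature_vector(f1, f2, f5)  # single length-6 input vector
```

The concatenated vector can then be fed to the trained classifier in the same way as a single-feature vector.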
FIG. 39 illustrates an example of a computing system 3900 (also referred to as "computing device 3900," "platform 3900," "device 3900," "appliance 3900," "server 3900," or the like) in accordance with various embodiments. The computing system 3900 may be suitable for use as any of the computer devices discussed herein and for performing any combination of processes discussed above. As examples, the computing device 3900 may operate in the capacity of a server or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. Additionally or alternatively, the system 3900 may represent the CCM 100, user computer(s) 230, 530, 1400, a network device and/or network appliance, application server(s) (e.g., owned/operated by service providers 118), a third party platform or collection of servers that hosts and/or serves information objects 112, and/or any other system or device discussed previously. Additionally or alternatively, various combinations of the components depicted by FIG. 39 may be included depending on the particular system/device that system 3900 represents. For example, when system 3900 represents a user or client device, the system 3900 may include some or all of the components shown by FIG. 39. In another example, when the system 3900 is the CCM 100 or a server computer system, the system 3900 may not include the communication circuitry 3909 or battery 3924, and instead may include multiple NICs 3916 or the like.
As examples, thesystem 3900 and/or theremote system 3955 may comprise desktop computers, workstations, laptop computers, mobile cellular phones (e.g., “smartphones”), tablet computers, portable media players, wearable computing devices, server computer systems, web appliances, network appliances, an aggregation of computing resources (e.g., in a cloud-based environment), or some other computing devices capable of interfacing directly or indirectly withnetwork 3950 or other network, and/or any other machine or device capable of executing instructions (sequential or otherwise) that specify actions to be taken by that machine. - The components of
system 3900 may be implemented as an individual computer system, or as components otherwise incorporated within a chassis of a larger system. The components ofsystem 3900 may be implemented as integrated circuits (ICs) or other discrete electronic devices, with the appropriate logic, software, firmware, or a combination thereof, adapted in thecomputer system 3900. Additionally or alternatively, some of the components ofsystem 3900 may be combined and implemented as a suitable System-on-Chip (SoC), System-in-Package (SiP), multi-chip package (MCP), or the like. - The
system 3900 includes physical hardware devices and software components capable of providing and/or accessing content and/or services to/from theremote system 3955. Thesystem 3900 and/or theremote system 3955 can be implemented as any suitable computing system or other data processing apparatus usable to access and/or provide content/services from/to one another. Theremote system 3955 may have a same or similar configuration and/or the same or similar components assystem 3900. Thesystem 3900 communicates withremote systems 3955, and vice versa, to obtain/serve content/services using, for example, Hypertext Transfer Protocol (HTTP) over Transmission Control Protocol (TCP)/Internet Protocol (IP), or one or more other common Internet protocols such as File Transfer Protocol (FTP); Session Initiation Protocol (SIP) with Session Description Protocol (SDP), Real-time Transport Protocol (RTP), or Real-time Streaming Protocol (RTSP); Secure Shell (SSH), Extensible Messaging and Presence Protocol (XMPP); WebSocket; and/or some other communication protocol, such as those discussed herein. In some examples, theevents 108 may be or include session events defined by any of the aforementioned protocols. - As used herein, the term “content” refers to visual or audible information to be conveyed to a particular audience or end-user, and may include or convey information pertaining to specific subjects or topics. Content or content items may be different content types (e.g., text, image, audio, video, and/or the like), and/or may have different formats (e.g., text files including Microsoft® Word® documents, Portable Document Format (PDF) documents, HTML documents; audio files such as MPEG-4 audio files and WebM audio and/or video files; and/or the like). As used herein, the term “service” refers to a particular functionality or a set of functions to be performed on behalf of a requesting party, such as the
system 3900. As examples, a service may include or involve the retrieval of specified information or the execution of a set of operations. In order to access the content/services, thesystem 3900 includes components such as processors, memory devices, communication interfaces, and the like. However, the terms “content” and “service” may be used interchangeably throughout the present disclosure even though these terms refer to different concepts. - Referring now to
system 3900, thesystem 3900 includesprocessor circuitry 3902, which is configurable or operable to execute program code, and/or sequentially and automatically carry out a sequence of arithmetic or logical operations; record, store, and/or transfer digital data. Theprocessor circuitry 3902 includes circuitry such as, but not limited to one or more processor cores and one or more of cache memory, low drop-out voltage regulators (LDOs), interrupt controllers, serial interfaces such as serial peripheral interface (SPI), inter-integrated circuit (I2C) or universal programmable serial interface circuit, real time clock (RTC), timer-counters including interval and watchdog timers, general purpose input-output (I/O), memory card controllers, interconnect (IX) controllers and/or interfaces, universal serial bus (USB) interfaces, mobile industry processor interface (MIPI) interfaces, Joint Test Access Group (JTAG) test access ports, and the like. Theprocessor circuitry 3902 may include on-chip memory circuitry or cache memory circuitry, which may include any suitable volatile and/or non-volatile memory, such as DRAM, SRAM, EPROM, EEPROM, Flash memory, solid-state memory, and/or any other type of memory device technology, such as those discussed herein. Individual processors (or individual processor cores) of theprocessor circuitry 3902 may be coupled with or may include memory/storage and may be configurable or operable to execute instructions stored in the memory/storage to enable various applications or operating systems to run on thesystem 3900. In these embodiments, the processors (or cores) of theprocessor circuitry 3902 are configurable or operable to operate application software (e.g., logic/modules 3980) to provide specific services to a user of thesystem 3900. In some embodiments, theprocessor circuitry 3902 may include special-purpose processor/controller to operate according to the various embodiments herein. - In various implementations, the processor(s) of
processor circuitry 3902 may include, for example, one or more processor cores (CPUs), graphics processing units (GPUs), Tensor Processing Units (TPUs), reduced instruction set computing (RISC) processors, Acorn RISC Machine (ARM) processors, complex instruction set computing (CISC) processors, digital signal processors (DSP), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), Application Specific Integrated Circuits (ASICs), SoCs and/or programmable SoCs, microprocessors or controllers, or any suitable combination thereof. As examples, the processor circuitry 3902 may include Intel® Core™ based processor(s), MCU-class processor(s), Xeon® processor(s); Advanced Micro Devices (AMD) Zen® Core Architecture processor(s), such as Ryzen® or Epyc® processor(s), Accelerated Processing Units (APUs), MxGPUs, or the like; A, S, W, and T series processor(s) from Apple® Inc., Snapdragon™ or Centriq™ processor(s) from Qualcomm® Technologies, Inc., Texas Instruments, Inc.® Open Multimedia Applications Platform (OMAP)™ processor(s); Power Architecture processor(s) provided by the OpenPOWER® Foundation and/or IBM®, MIPS Warrior M-class, Warrior I-class, and Warrior P-class processor(s) provided by MIPS Technologies, Inc.; ARM Cortex-A, Cortex-R, and Cortex-M family of processor(s) as licensed from ARM Holdings, Ltd.; the ThunderX2® provided by Cavium™, Inc.; GeForce®, Tegra®, Titan X®, Tesla®, Shield®, and/or other like GPUs provided by Nvidia®; or the like. Other examples of the processor circuitry 3902 may be mentioned elsewhere in the present disclosure. - In some implementations, the processor(s) of
processor circuitry 3902 may be, or may include, one or more media processors comprising microprocessor-based SoC(s), FPGA(s), or DSP(s) specifically designed to deal with digital streaming data in real-time, which may include encoder/decoder circuitry to compress/decompress (or encode and decode) Advanced Video Coding (AVC) (also known as H.264 and MPEG-4) digital data, High Efficiency Video Coding (HEVC) (also known as H.265 and MPEG-H part 2) digital data, and/or the like. - In some implementations, the
processor circuitry 3902 may include one or more hardware accelerators. The hardware accelerators may be microprocessors, configurable hardware (e.g., FPGAs, programmable ASICs, programmable SoCs, DSPs, and/or the like), or some other suitable special-purpose processing device tailored to perform one or more specific tasks or workloads, for example, specific tasks or workloads of the subsystems of the CCM 100 and/or some other system/device discussed herein, which may be more efficient than using general-purpose processor cores. In some embodiments, the specific tasks or workloads may be offloaded from one or more processors of the processor circuitry 3902. In these implementations, the circuitry of processor circuitry 3902 may comprise logic blocks or logic fabric and other interconnected resources that may be programmed to perform various functions, such as the procedures, methods, functions, and/or the like of the various embodiments discussed herein. Additionally, the processor circuitry 3902 may include memory cells (e.g., EPROM, EEPROM, flash memory, static memory (e.g., SRAM), anti-fuses, and/or the like) used to store logic blocks, logic fabric, data, and/or the like in LUTs and the like. - In some implementations, the
processor circuitry 3902 may include hardware elements specifically tailored for machine learning functionality, such as for operating the subsystems of theCCM 100 discussed previously with regard toFIG. 2 . In these implementations, theprocessor circuitry 3902 may be, or may include, an AI engine chip that can run many different kinds of AI instruction sets once loaded with the appropriate weightings and training code. Additionally or alternatively, theprocessor circuitry 3902 may be, or may include, AI accelerator(s), which may be one or more of the aforementioned hardware accelerators designed for hardware acceleration of AI applications, such as one or more of the subsystems of theCCM 100 and/or some other system/device discussed herein. As examples, these processor(s) or accelerators may be a cluster of artificial intelligence (AI) GPUs, tensor processing units (TPUs) developed by Google® Inc., Real AI Processors (RAPs™) provided by AlphalCs®, Nervana™ Neural Network Processors (NNPs) provided by Intel® Corp., Intel® Movidius™ Myriad™ X Vision Processing Unit (VPU), NVIDIA® PX™ based GPUs, the NM500 chip provided by General Vision®,Hardware 3 provided by Tesla®, Inc., an Epiphany™ based processor provided by Adapteva®, or the like. In some embodiments, theprocessor circuitry 3902 and/or hardware accelerator circuitry may be implemented as AI accelerating co-processor(s), such as the Hexagon 685 DSP provided by Qualcomm®, the PowerVR 2NX Neural Net Accelerator (NNA) provided by Imagination Technologies Limited®, the Neural Engine core within the Apple® A11 or A12 Bionic SoC, the Neural Processing Unit (NPU) within the HiSilicon Kirin 970 provided by Huawei®, and/or the like. - In some implementations, the processor(s) of
processor circuitry 3902 may be, or may include, one or more custom-designed silicon cores specifically designed to operate corresponding subsystems of the CCM 100 and/or some other system/device discussed herein. These cores may be designed as synthesizable cores comprising hardware description language logic (e.g., register transfer logic, Verilog, Very High Speed Integrated Circuit hardware description language (VHDL), and/or the like); netlist cores comprising gate-level description of electronic components and connections and/or process-specific very-large-scale integration (VLSI) layout; and/or analog or digital logic in transistor-layout format. In these implementations, one or more of the subsystems of the CCM 100 and/or some other system/device discussed herein may be operated, at least in part, on custom-designed silicon core(s). These "hardware-ized" subsystems may be integrated into a larger chipset but may be more efficient than using general purpose processor cores. - The
system memory circuitry 3904 comprises any number of memory devices arranged to provide primary storage from which theprocessor circuitry 3902 continuously readsinstructions 3982 stored therein for execution. In some embodiments, thememory circuitry 3904 is on-die memory or registers associated with theprocessor circuitry 3902. As examples, thememory circuitry 3904 may include volatile memory such as random access memory (RAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), and/or the like. Thememory circuitry 3904 may also include nonvolatile memory (NVM) such as high-speed electrically erasable memory (commonly referred to as “flash memory”), phase change RAM (PRAM), resistive memory such as magnetoresistive random access memory (MRAM), and/or the like. Thememory circuitry 3904 may also comprise persistent storage devices, which may be temporal and/or persistent storage of any type, including, but not limited to, non-volatile memory, optical, magnetic, and/or solid state mass storage, and so forth. - In some implementations, some aspects (or devices) of
memory circuitry 3904 andstorage circuitry 3908 may be integrated together with aprocessing device 3902, for example RAM or FLASH memory disposed within an integrated circuit microprocessor or the like. In other implementations, thememory circuitry 3904 and/orstorage circuitry 3908 may comprise an independent device, such as an external disk drive, storage array, or any other storage devices used in database systems. The memory and processing devices may be operatively coupled together, or in communication with each other, for example by an I/O port, network connection, and/or the like such that the processing device may read a file stored on the memory. - Some memory may be “read only” by design (ROM) by virtue of permission settings, or not. Other examples of memory may include, but may be not limited to, WORM, EPROM, EEPROM, FLASH, and/or the like. which may be implemented in solid state semiconductor devices. Other memories may comprise moving parts, such a conventional rotating disk drive. All such memories may be “machine-readable” in that they may be readable by a processing device.
-
Storage circuitry 3908 is arranged to provide persistent storage of information such as data, applications, operating systems (OS), and so forth. As examples, thestorage circuitry 3908 may be implemented as hard disk drive (HDD), a micro HDD, a solid-state disk drive (SSDD), flash memory cards (e.g., SD cards, microSD cards, xD picture cards, and the like), USB flash drives, on-die memory or registers associated with theprocessor circuitry 3902, resistance change memories, phase change memories, holographic memories, or chemical memories, and the like. - The
storage circuitry 3908 is configurable or operable to store computational logic 3980 (or "modules 3980") in the form of software, firmware, microcode, or hardware-level instructions to implement the techniques described herein. The computational logic 3980 may be employed to store working copies and/or permanent copies of programming instructions, or data to create the programming instructions, for the operation of various components of system 3900 (e.g., drivers, libraries, application programming interfaces (APIs), and/or the like), an OS of system 3900, one or more applications, and/or for carrying out the embodiments discussed herein. The computational logic 3980 may be stored or loaded into memory circuitry 3904 as instructions 3982, or data to create the instructions 3982, which are then accessed for execution by the processor circuitry 3902 to carry out the functions described herein. The processor circuitry 3902 accesses the memory circuitry 3904 and/or the storage circuitry 3908 over the interconnect (IX) 3906. The instructions 3982 direct the processor circuitry 3902 to perform a specific sequence or flow of actions, for example, as described with respect to flowchart(s) and block diagram(s) of operations and functionality depicted previously. The various elements may be implemented by assembler instructions supported by processor circuitry 3902 or high-level languages that may be compiled into instructions 3984, or data to create the instructions 3984, to be executed by the processor circuitry 3902. The permanent copy of the programming instructions may be placed into persistent storage devices of storage circuitry 3908 in the factory or in the field through, for example, a distribution medium (not shown), through a communication interface (e.g., from a distribution server (not shown)), or over-the-air (OTA). - The operating system (OS) of
system 3900 may be a general purpose OS or an OS specifically written for and tailored to the computing system 3900. For example, when the system 3900 is a server system or a desktop or laptop system 3900, the OS may be Unix or a Unix-like OS such as Linux, e.g., provided by Red Hat Enterprise, Windows 10™ provided by Microsoft Corp.®, macOS provided by Apple Inc.®, or the like. In another example where the system 3900 is a mobile device, the OS may be a mobile OS, such as Android® provided by Google Inc.®, iOS® provided by Apple Inc.®, Windows 10 Mobile® provided by Microsoft Corp.®, KaiOS provided by KaiOS Technologies Inc., or the like. - The OS manages computer hardware and software resources, and provides common services for various applications (e.g., one or more logic/modules 3980). The OS may include one or more drivers or APIs that operate to control particular devices that are embedded in the
system 3900, attached to thesystem 3900, or otherwise communicatively coupled with thesystem 3900. The drivers may include individual drivers allowing other components of thesystem 3900 to interact or control various I/O devices that may be present within, or connected to, thesystem 3900. For example, the drivers may include a display driver to control and allow access to a display device, a touchscreen driver to control and allow access to a touchscreen interface of thesystem 3900, sensor drivers to obtain sensor readings ofsensor circuitry 3921 and control and allow access tosensor circuitry 3921, actuator drivers to obtain actuator positions of theactuators 3922 and/or control and allow access to theactuators 3922, a camera driver to control and allow access to an embedded image capture device, audio drivers to control and allow access to one or more audio devices. The OSs may also include one or more libraries, drivers, APIs, firmware, middleware, software glue, and/or the like, which provide program code and/or software components for one or more applications to obtain and use the data from other applications operated by thesystem 3900, such as the various subsystems of theCCM 100 and/or some other system/device discussed previously. - The components of
system 3900 communicate with one another over the interconnect (IX) 3906. The IX 3906 may include any number of IX technologies such as industry standard architecture (ISA), extended ISA (EISA), inter-integrated circuit (I2C), a serial peripheral interface (SPI), point-to-point interfaces, power management bus (PMBus), peripheral component interconnect (PCI), PCI express (PCIe), Intel® Ultra Path Interface (UPI), Intel® Accelerator Link (IAL), Common Application Programming Interface (CAPI), Intel® QuickPath Interconnect (QPI), Intel® Omni-Path Architecture (OPA) IX, RapidIO™ system interconnects, Ethernet, Cache Coherent Interconnect for Accelerators (CCIA), Gen-Z Consortium IXs, Open Coherent Accelerator Processor Interface (OpenCAPI), and/or any number of other IX technologies. The IX 3906 may be a proprietary bus, for example, used in a SoC based system. - The
communication circuitry 3909 is a hardware element, or collection of hardware elements, used to communicate over one or more networks (e.g., network 3950) and/or with other devices. The communication circuitry 3909 includes modem 3910 and transceiver circuitry ("TRx") 3912. The modem 3910 includes one or more processing devices (e.g., baseband processors) to carry out various protocol and radio control functions. Modem 3910 may interface with application circuitry of system 3900 (e.g., a combination of processor circuitry 3902 and CRM 860) for generation and processing of baseband signals and for controlling operations of the TRx 3912. The modem 3910 may handle various radio control functions that enable communication with one or more radio networks via the TRx 3912 according to one or more wireless communication protocols. The modem 3910 may include circuitry such as, but not limited to, one or more single-core or multi-core processors (e.g., one or more baseband processors) or control logic to process baseband signals received from a receive signal path of the TRx 3912, and to generate baseband signals to be provided to the TRx 3912 via a transmit signal path. In various embodiments, the modem 3910 may implement a real-time OS (RTOS) to manage resources of the modem 3910, schedule tasks, and/or the like. - The
communication circuitry 3909 also includesTRx 3912 to enable communication with wireless networks using modulated electromagnetic radiation through a non-solid medium.TRx 3912 includes a receive signal path, which comprises circuitry to convert analog RF signals (e.g., an existing or received modulated waveform) into digital baseband signals to be provided to themodem 3910. TheTRx 3912 also includes a transmit signal path, which comprises circuitry configurable or operable to convert digital baseband signals provided by themodem 3910 to be converted into analog RF signals (e.g., modulated waveform) that will be amplified and transmitted via an antenna array including one or more antenna elements (not shown). The antenna array may be a plurality of microstrip antennas or printed antennas that are fabricated on the surface of one or more printed circuit boards. The antenna array may be formed in as a patch of metal foil (e.g., a patch antenna) in a variety of shapes, and may be coupled with theTRx 3912 using metal transmission lines or the like. 
- The TRx 3912 may include one or more radios that are compatible with, and/or may operate according to, any one or more of the following radio communication technologies and/or standards, including but not limited to: a Global System for Mobile Communications (GSM) radio communication technology, a General Packet Radio Service (GPRS) radio communication technology, an Enhanced Data Rates for GSM Evolution (EDGE) radio communication technology, and/or a Third Generation Partnership Project (3GPP) radio communication technology, for example Universal Mobile Telecommunications System (UMTS), Freedom of Multimedia Access (FOMA), 3GPP Long Term Evolution (LTE), 3GPP Long Term Evolution Advanced (LTE Advanced), Code division multiple access 2000 (CDMA2000), Cellular Digital Packet Data (CDPD), Mobitex, Third Generation (3G), Circuit Switched Data (CSD), High-Speed Circuit-Switched Data (HSCSD), Universal Mobile Telecommunications System (Third Generation) (UMTS (3G)), Wideband Code Division Multiple Access (Universal Mobile Telecommunications System) (W-CDMA (UMTS)), High Speed Packet Access (HSPA), High-Speed Downlink Packet Access (HSDPA), High-Speed Uplink Packet Access (HSUPA), High Speed Packet Access Plus (HSPA+), Universal Mobile Telecommunications System-Time-Division Duplex (UMTS-TDD), Time Division-Code Division Multiple Access (TD-CDMA), Time Division-Synchronous Code Division Multiple Access (TD-SCDMA), 3rd Generation Partnership Project Release 8 (Pre-4th Generation) (3GPP Rel. 8 (Pre-4G)), 3GPP Rel. 9 (3rd Generation Partnership Project Release 9), 3GPP Rel. 10 (3rd Generation Partnership Project Release 10), 3GPP Rel. 11 (3rd Generation Partnership Project Release 11), 3GPP Rel. 12 (3rd Generation Partnership Project Release 12), 3GPP Rel. 13 (3rd Generation Partnership Project Release 13), 3GPP Rel. 14 (3rd Generation Partnership Project Release 14), 3GPP Rel. 15 (3rd Generation Partnership Project Release 15), 3GPP Rel.
16 (3rd Generation Partnership Project Release 16), 3GPP Rel. 17 (3rd Generation Partnership Project Release 17) and subsequent Releases (such as Rel. 18, Rel. 19, and/or the like), 3GPP 5G, 3GPP LTE Extra, LTE-Advanced Pro, LTE Licensed-Assisted Access (LAA), MuLTEfire, UMTS Terrestrial Radio Access (UTRA), Evolved UMTS Terrestrial Radio Access (E-UTRA), Long Term Evolution Advanced (4th Generation) (LTE Advanced (4G)), cdmaOne (2G), Code Division Multiple Access 2000 (Third generation) (CDMA2000 (3G)), Evolution-Data Optimized or Evolution-Data Only (EV-DO), Advanced Mobile Phone System (1st Generation) (AMPS (1G)), Total Access Communication System/Extended Total Access Communication System (TACS/ETACS), Digital AMPS (2nd Generation) (D-AMPS (2G)), Push-to-talk (PTT), Mobile Telephone System (MTS), Improved Mobile Telephone System (IMTS), Advanced Mobile Telephone System (AMTS), OLT (Norwegian for Offentlig Landmobil Telefoni, Public Land Mobile Telephony), MTD (Swedish abbreviation for Mobiltelefonisystem D, or Mobile telephony system D), Public Automated Land Mobile (Autotel/PALM), ARP (Finnish for Autoradiopuhelin, “car radio phone”), NMT (Nordic Mobile Telephony), High capacity version of NTT (Nippon Telegraph and Telephone) (Hicap), Cellular Digital Packet Data (CDPD), Mobitex, DataTAC, Integrated Digital Enhanced Network (iDEN), Personal Digital Cellular (PDC), Circuit Switched Data (CSD), Personal Handy-phone System (PHS), Wideband Integrated Digital Enhanced Network (WiDEN), iBurst, Unlicensed Mobile Access (UMA, also referred to as 3GPP Generic Access Network, or GAN standard), Bluetooth®, Bluetooth Low Energy (BLE), IEEE 802.15.4 based protocols (e.g., IPv6 over Low power Wireless Personal Area Networks (6LoWPAN), WirelessHART, MiWi, Thread, ISA100.11a, and/or the like), WiFi-direct, ANT/ANT+, ZigBee, Z-Wave, 3GPP device-to-device (D2D) or Proximity Services (ProSe), Universal Plug and Play (UPnP), Low-Power Wide-Area-Network 
(LPWAN), LoRaWAN™ (Long Range Wide Area Network), Sigfox, Wireless Gigabit Alliance (WiGig) standard, mmWave standards in general (wireless systems operating at 10-300 GHz and above such as WiGig, IEEE 802.11ad, IEEE 802.11ay, and/or the like), technologies operating above 300 GHz and THz bands, (3GPP/LTE based or IEEE 802.11p and other) Vehicle-to-Vehicle (V2V) and Vehicle-to-X (V2X) and Vehicle-to-Infrastructure (V2I) and Infrastructure-to-Vehicle (I2V) communication technologies, 3GPP cellular V2X, DSRC (Dedicated Short Range Communications) communication systems such as Intelligent-Transport-Systems and others, the European ITS-G5 system (i.e., the European flavor of IEEE 802.11p based DSRC, including ITS-G5A (i.e., Operation of ITS-G5 in European ITS frequency bands dedicated to ITS for safety related applications in the frequency range 5.875 GHz to 5.905 GHz), ITS-G5B (i.e., Operation in European ITS frequency bands dedicated to ITS non-safety applications in the frequency range 5.855 GHz to 5.875 GHz), ITS-G5C (i.e., Operation of ITS applications in the frequency range 5.470 GHz to 5.725 GHz)), and/or the like. In addition to the standards listed above, any number of satellite uplink technologies may be used for the
TRx 3912 including, for example, radios compliant with standards issued by the ITU (International Telecommunication Union), or the ETSI (European Telecommunications Standards Institute), among others, both existing and not yet formulated. - Network interface circuitry/controller (NIC) 3916 may be included to provide wired communication to the
network 3950 or to other devices using a standard network interface protocol. The standard network interface protocol may include Ethernet, Ethernet over GRE Tunnels, Ethernet over Multiprotocol Label Switching (MPLS), Ethernet over USB, or may be based on other types of network protocols, such as Controller Area Network (CAN), Local Interconnect Network (LIN), DeviceNet, ControlNet, Data Highway+, PROFIBUS, or PROFINET, among many others. Network connectivity may be provided to/from the system 3900 via NIC 3916 using a physical connection, which may be electrical (e.g., a “copper interconnect”) or optical. The physical connection also includes suitable input connectors (e.g., ports, receptacles, sockets, and/or the like) and output connectors (e.g., plugs, pins, and/or the like). The NIC 3916 may include one or more dedicated processors and/or FPGAs to communicate using one or more of the aforementioned network interface protocols. In some implementations, the NIC 3916 may include multiple controllers to provide connectivity to other networks using the same or different protocols. For example, the system 3900 may include a first NIC 3916 providing communications to the cloud over Ethernet and a second NIC 3916 providing communications to other devices over another type of network. In some implementations, the NIC 3916 may be a high-speed serial interface (HSSI) NIC to connect the system 3900 to a routing or switching device. -
Network 3950 comprises computers, network connections among various computers (e.g., between the system 3900 and remote system 3955), and software routines to enable communication between the computers over respective network connections. In this regard, the network 3950 comprises one or more network elements that may include one or more processors, communications systems (e.g., including network interface controllers, one or more transmitters/receivers connected to one or more antennas, and/or the like), and computer readable media. Examples of such network elements may include wireless access points (WAPs), a home/business server (with or without radio frequency (RF) communications circuitry), a router, a switch, a hub, a radio beacon, base stations, picocell or small cell base stations, and/or any other like network device. Connection to the network 3950 may be via a wired or a wireless connection using the various communication protocols discussed infra. As used herein, a wired or wireless communication protocol may refer to a set of standardized rules or instructions implemented by a communication device/system to communicate with other devices, including instructions for packetizing/depacketizing data, modulating/demodulating signals, implementation of protocol stacks, and the like. More than one network may be involved in a communication session between the illustrated devices. Connection to the network 3950 may require that the computers execute software routines which enable, for example, the seven layers of the OSI model of computer networking or equivalent in a wireless (or cellular) phone network. - The
network 3950 may represent the Internet, one or more cellular networks, a local area network (LAN) or a wide area network (WAN) including proprietary and/or enterprise networks, a Transfer Control Protocol (TCP)/Internet Protocol (IP)-based network, or combinations thereof. In such embodiments, the network 3950 may be associated with a network operator who owns or controls equipment and other elements necessary to provide network-related services, such as one or more base stations or access points, one or more servers for routing digital data or telephone calls (e.g., a core network or backbone network), and/or the like. Other networks can be used instead of or in addition to the Internet, such as an intranet, an extranet, a virtual private network (VPN), an enterprise network, a non-TCP/IP based network, any LAN or WAN, or the like. - The external interface 3918 (also referred to as “I/O interface circuitry” or the like) is configurable or operable to connect or couple the
system 3900 with external devices or subsystems. The external interface 3918 may include any suitable interface controllers and connectors to couple the system 3900 with the external components/devices. As an example, the external interface 3918 may be an external expansion bus (e.g., Universal Serial Bus (USB), FireWire, Thunderbolt, and/or the like) used to connect system 3900 with external (peripheral) components/devices. The external devices include, inter alia, sensor circuitry 3921, actuators 3922, and positioning circuitry 3945, but may also include other devices or subsystems not shown by FIG. 39 . - The
sensor circuitry 3921 may include devices, modules, or subsystems whose purpose is to detect events or changes in its environment and send the information (sensor data) about the detected events to some other device, module, subsystem, and/or the like. Examples of such sensors 3921 include, inter alia, inertia measurement units (IMU) comprising accelerometers, gyroscopes, and/or magnetometers; microelectromechanical systems (MEMS) or nanoelectromechanical systems (NEMS) comprising 3-axis accelerometers, 3-axis gyroscopes, and/or magnetometers; level sensors; flow sensors; temperature sensors (e.g., thermistors); pressure sensors; barometric pressure sensors; gravimeters; altimeters; image capture devices (e.g., cameras); light detection and ranging (LiDAR) sensors; proximity sensors (e.g., infrared radiation detectors and the like); depth sensors; ambient light sensors; ultrasonic transceivers; microphones; and/or the like. - The
external interface 3918 connects the system 3900 to actuators 3922, which allow system 3900 to change its state, position, and/or orientation, or move or control a mechanism or system. The actuators 3922 comprise electrical and/or mechanical devices for moving or controlling a mechanism or system, and/or converting energy (e.g., electric current or moving air and/or liquid) into some kind of motion. The actuators 3922 may include one or more electronic (or electrochemical) devices, such as piezoelectric biomorphs, solid state actuators, solid state relays (SSRs), shape-memory alloy-based actuators, electroactive polymer-based actuators, relay driver integrated circuits (ICs), and/or the like. The actuators 3922 may include one or more electromechanical devices such as pneumatic actuators, hydraulic actuators, electromechanical switches including electromechanical relays (EMRs), motors (e.g., DC motors, stepper motors, servomechanisms, and/or the like), wheels, thrusters, propellers, claws, clamps, hooks, an audible sound generator, and/or other like electromechanical components. The system 3900 may be configurable or operable to operate one or more actuators 3922 based on one or more captured events and/or instructions or control signals received from a service provider and/or various client systems. In embodiments, the system 3900 may transmit instructions to various actuators 3922 (or controllers that control one or more actuators 3922) to reconfigure an electrical network as discussed herein. - The
positioning circuitry 3945 includes circuitry to receive and decode signals transmitted/broadcasted by a positioning network of a global navigation satellite system (GNSS). Examples of navigation satellite constellations (or GNSS) include the United States' Global Positioning System (GPS), Russia's Global Navigation System (GLONASS), the European Union's Galileo system, China's BeiDou Navigation Satellite System, a regional navigation system or GNSS augmentation system (e.g., Navigation with Indian Constellation (NAVIC), Japan's Quasi-Zenith Satellite System (QZSS), France's Doppler Orbitography and Radio-positioning Integrated by Satellite (DORIS), and/or the like), or the like. The positioning circuitry 3945 comprises various hardware elements (e.g., including hardware devices such as switches, filters, amplifiers, antenna elements, and the like to facilitate OTA communications) to communicate with components of a positioning network, such as navigation satellite constellation nodes. In some embodiments, the positioning circuitry 3945 may include a Micro-Technology for Positioning, Navigation, and Timing (Micro-PNT) IC that uses a master timing clock to perform position tracking/estimation without GNSS assistance. The positioning circuitry 3945 may also be part of, or interact with, the communication circuitry 3909 to communicate with the nodes and components of the positioning network. The positioning circuitry 3945 may also provide position data and/or time data to the application circuitry, which may use the data to synchronize operations with various infrastructure (e.g., radio base stations), for turn-by-turn navigation, or the like. - The input/output (I/O)
devices 3956 may be present within, or connected to, the system 3900. The I/O devices 3956 include input device circuitry and output device circuitry including one or more user interfaces designed to enable user interaction with the system 3900 and/or peripheral component interfaces designed to enable peripheral component interaction with the system 3900. The input device circuitry includes any physical or virtual means for accepting an input including, inter alia, one or more physical or virtual buttons (e.g., a reset button), a physical keyboard, keypad, mouse, touchpad, touchscreen, microphones, scanner, headset, and/or the like. The output device circuitry is used to show or convey information, such as sensor readings, actuator position(s), or other like information. Data and/or graphics may be displayed on one or more user interface components of the output device circuitry. The output device circuitry may include any number and/or combinations of audio or visual display, including, inter alia, one or more simple visual outputs/indicators (e.g., binary status indicators such as light emitting diodes (LEDs)) and multi-character visual outputs, or more complex outputs such as display devices or touchscreens (e.g., Liquid Crystal Displays (LCD), LED displays, quantum dot displays, projectors, and/or the like), with the output of characters, graphics, multimedia objects, and the like being generated or produced from the operation of the system 3900. The output device circuitry may also include speakers or other audio emitting devices, printer(s), and/or the like. In some embodiments, the sensor circuitry 3921 may be used as the input device circuitry (e.g., an image capture device, motion capture device, or the like) and one or more actuators 3922 may be used as the output device circuitry (e.g., an actuator to provide haptic feedback or the like). 
In another example, near-field communication (NFC) circuitry comprising an NFC controller coupled with an antenna element and a processing device may be included to read electronic tags and/or connect with another NFC-enabled device. Peripheral component interfaces may include, but are not limited to, a non-volatile memory port, a universal serial bus (USB) port, an audio jack, a power supply interface, and/or the like. - A
battery 3924 may be coupled to the system 3900 to power the system 3900, which may be used in embodiments where the system 3900 is not in a fixed location, such as when the system 3900 is a mobile or laptop client system. The battery 3924 may be a lithium ion battery, a lead-acid automotive battery, a lithium polymer battery, or a metal-air battery, such as a zinc-air battery, an aluminum-air battery, a lithium-air battery, and/or the like. In embodiments where the system 3900 is mounted in a fixed location, such as when the system is implemented as a server computer system, the system 3900 may have a power supply coupled to an electrical grid. In these embodiments, the system 3900 may include power tee circuitry to provide for electrical power drawn from a network cable to provide both power supply and data connectivity to the system 3900 using a single cable. - Power management integrated circuitry (PMIC) 3926 may be included in the
system 3900 to track the state of charge (SoCh) of the battery 3924, and to control charging of the system 3900. The PMIC 3926 may be used to monitor other parameters of the battery 3924 to provide failure predictions, such as the state of health (SoH) and the state of function (SoF) of the battery 3924. The PMIC 3926 may include voltage regulators, surge protectors, and power alarm detection circuitry. The power alarm detection circuitry may detect one or more of brown out (under-voltage) and surge (over-voltage) conditions. The PMIC 3926 may communicate the information on the battery 3924 to the processor circuitry 3902 over the IX 3906. The PMIC 3926 may also include an analog-to-digital (ADC) converter that allows the processor circuitry 3902 to directly monitor the voltage of the battery 3924 or the current flow from the battery 3924. The battery parameters may be used to determine actions that the system 3900 may perform, such as transmission frequency, mesh network operation, sensing frequency, and the like. - A
power block 3928, or other power supply coupled to an electrical grid, may be coupled with the PMIC 3926 to charge the battery 3924. In some examples, the power block 3928 may be replaced with a wireless power receiver to obtain the power wirelessly, for example, through a loop antenna in the system 3900. In these implementations, a wireless battery charging circuit may be included in the PMIC 3926. The specific charging circuits chosen depend on the size of the battery 3924 and the current required. - The
system 3900 may include any combinations of the components shown by FIG. 39 ; however, some of the components shown may be omitted, additional components may be present, and different arrangements of the components shown may occur in other implementations. In one example where the system 3900 is or is part of a server computer system, the battery 3924, communication circuitry 3909, the sensors 3921, actuators 3922, and/or positioning circuitry 3945, and possibly some or all of the I/O devices 3956 may be omitted. - Furthermore, the embodiments of the present disclosure may take the form of a computer program product or data to create the computer program, with the computer program or data embodied in any tangible or non-transitory medium of expression having the computer-usable program code (or data to create the computer program) embodied in the medium. For example, the
memory circuitry 3904 and/or storage circuitry 3908 may be embodied as non-transitory computer-readable storage media (NTCRSM) that may be suitable for use to store instructions (or data that creates the instructions) that cause an apparatus (such as any of the devices/components/systems described with regard to FIGS. 1-35 ), in response to execution of the instructions by the apparatus, to practice selected aspects of the present disclosure. As shown, the NTCRSM may include a number of programming instructions 3984, 3982 (or data to create the programming instructions). The programming instructions 3984, 3982 may be configurable or operable to cause an apparatus (such as any of the devices/components/systems described with regard to FIGS. 1-35 ), in response to execution of the programming instructions 3984, 3982 by the apparatus, to perform various operations (such as any of the operations described with regard to FIGS. 1-35 ). In various embodiments, the programming instructions 3984, 3982 may correspond to the computational logic 3980, instructions 3982, and/or instructions 3984 shown by FIG. 39 . - In alternate embodiments, programming
instructions 3984, 3982 (or data to create the instructions 3984, 3982) may be disposed on multiple NTCRSM. In alternate embodiments, programming instructions 3984, 3982 (or data to create the instructions 3984, 3982) may be disposed on computer-readable transitory storage media, such as signals. The programming instructions 3984, 3982 (or data to create the instructions 3984, 3982) may be transferred over a network (e.g., using communication circuitry 3909 and/or NIC 3916 of FIG. 39 ) utilizing any one of a number of transfer protocols (e.g., HTTP, and/or the like). - Any combination of one or more computer usable or computer readable media may be utilized as or instead of the NTCRSM. The computer-usable or computer-readable medium may be, for example but not limited to, one or more electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, devices, or propagation media. For instance, the NTCRSM may be embodied by devices described for the
storage circuitry 3908 and/or memory circuitry 3904 described previously. More specific examples (a non-exhaustive list) of a computer-readable medium may include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM, Flash memory, and/or the like), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device and/or optical disks, a transmission media such as those supporting the Internet or an intranet, a magnetic storage device, or any number of other hardware devices. In the context of the present disclosure, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the program (or data to create the program) for use by or in connection with the instruction execution system, apparatus, or device. The computer-usable medium may include a propagated data signal with the computer-usable program code (e.g., including programming instructions 3984, 3982) or data to create the program code embodied therewith, either in baseband or as part of a carrier wave. The computer usable program code or data to create the program may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, and/or the like. - In various embodiments, the program code (or data to create the program code) described herein may be stored in one or more of a compressed format, an encrypted format, a fragmented format, a packaged format, and/or the like. Program code (e.g., programming
instructions 3984, 3982) or data to create the program code as described herein may require one or more of installation, modification, adaptation, updating, combining, supplementing, configuring, decryption, decompression, unpacking, distribution, reassignment, and/or the like in order to make them directly readable and/or executable by a computing device and/or other machine. For example, the program code or data to create the program code may be stored in multiple parts, which are individually compressed, encrypted, and stored on separate computing devices, wherein the parts, when decrypted, decompressed, and combined, form a set of executable instructions that implement the program code or the data to create the program code, such as those described herein. In another example, the program code or data to create the program code may be stored in a state in which they may be read by a computer, but require addition of a library (e.g., a dynamic link library), a software development kit (SDK), an application programming interface (API), and/or the like in order to execute the instructions on a particular computing device or other device. In another example, the program code or data to create the program code may need to be configured (e.g., settings stored, data input, network addresses recorded, and/or the like) before the program code or data to create the program code can be executed/used in whole or in part. In this example, the program code (or data to create the program code) may be unpacked, configured for proper execution, and stored in a first location with the configuration instructions located in a second location distinct from the first location. The configuration instructions can be initiated by an action, trigger, or instruction that is not co-located in storage or execution location with the instructions enabling the disclosed techniques. 
Accordingly, the disclosed program code or data to create the program code are intended to encompass such machine readable instructions and/or program(s) or data to create such machine readable instruction and/or programs regardless of the particular format or state of the machine readable instructions and/or program(s) when stored or otherwise at rest or in transit. - The computer program code for carrying out operations of the present disclosure, including for example, programming instructions 3984, 3982, computational logic 3980, instructions 3982, and/or instructions 3984, may be implemented as software code to be executed by one or more processors using any suitable computer language such as, for example, Python, PyTorch, NumPy, ArcPy, Ruby, Ruby on Rails, Scala, Smalltalk, Java™, C++, C#, “C”, Kotlin, Swift, Rust, Go (or “Golang”), ECMAScript, JavaScript, TypeScript, Jscript, ActionScript, Server-Side JavaScript (SSJS), PHP, Perl, Lua, Torch/Lua with Just-In Time compiler (LuaJIT), Accelerated Mobile Pages Script (AMPscript), VBScript, JavaServer Pages (JSP), Active Server Pages (ASP), Node.js, ASP.NET, JAMscript, Hypertext Markup Language (HTML), extensible HTML (XHTML), Extensible Markup Language (XML), XML User Interface Language (XUL), Scalable Vector Graphics (SVG), RESTful API Modeling Language (RAML), wiki markup or Wikitext, Wireless Markup Language (WML), JavaScript Object Notation (JSON), Apache® MessagePack™, Cascading Stylesheets (CSS), extensible stylesheet language (XSL), Mustache template language, Handlebars template language, Guide Template Language (GTL), Apache® Thrift, Abstract Syntax Notation One (ASN.1), Google® Protocol Buffers (protobuf), Bitcoin Script, EVM® bytecode, Solidity™, Vyper (Python derived), Bamboo, Lisp Like Language (LLL), Simplicity provided by Blockstream™, Rholang, Michelson, Counterfactual, Plasma, Plutus, Sophia, Salesforce® Apex®, Salesforce® Lightning®, and/or any other programming language, markup language, 
script, code, and/or the like. In some implementations, a suitable integrated development environment (IDE) or software development kit (SDK) may be used to develop the program code or software elements discussed herein such as, for example, Android® Studio™ IDE, Apple® iOS® SDK, or development tools including proprietary programming languages and/or development tools. Furthermore, some or all of the software components or functions described herein can utilize a suitable querying language to query and store information in one or more databases or data structures, such as, for example, Structured Query Language (SQL), noSQL, and/or other query languages. The software code can be stored as computer- or processor-executable instructions or commands on a physical non-transitory computer-readable medium. The computer program code for carrying out operations of the present disclosure may also be written in any combination of the programming languages discussed herein. The program code may execute entirely on the
system 3900, partly on the system 3900 as a stand-alone software package, partly on the system 3900 and partly on a remote computer (e.g., remote system 3955), or entirely on the remote computer (e.g., remote system 3955). In the latter scenario, the remote computer may be connected to the system 3900 through any type of network (e.g., network 3950). - While only a
single computing device 3900 is shown, the computing device 3900 may include any collection of devices or circuitry that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the operations discussed above. Computing device 3900 may be part of an integrated control system or system manager, or may be provided as a portable electronic device configurable or operable to interface with a networked system either locally or remotely via wireless transmission. - Some of the operations described previously may be implemented in software and other operations may be implemented in hardware. One or more of the operations, processes, or methods described herein may be performed by an apparatus, device, or system similar to those as described herein and with reference to the illustrated figures.
- Additional examples of the presently described embodiments include the following, non-limiting example implementations. Each of the non-limiting examples may stand on its own, or may be combined in any permutation or combination with any one or more of the other examples provided below or throughout the present disclosure.
- Example A01 includes a method comprising: identifying events from a domain; identifying a number of the events; identifying content associated with the events; identifying a topic; identifying a relevancy of the content to the topic; and generating a consumption score for the domain and topic based on the number of events and the relevancy of the content to the topic.
- Example A02 includes the method of example A01 and/or some other example(s) herein, further comprising identifying values in the consumption score reaching a threshold as a surge.
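Examples A01 and A02 can be illustrated with a minimal sketch. Everything here is an assumption for demonstration: the function names, the multiplicative combination of event count and relevancy, and the surge threshold value are illustrative only, not the claimed implementation.

```python
def compute_consumption_score(num_events: int, relevancy: float) -> float:
    """Combine an event count with a topic-relevancy value (0.0-1.0)
    into a single consumption score for a domain/topic pair (example A01).
    The multiplicative combination is an assumed weighting scheme."""
    return num_events * relevancy

SURGE_THRESHOLD = 100.0  # assumed threshold value for illustration

def is_surge(score: float, threshold: float = SURGE_THRESHOLD) -> bool:
    """Flag consumption scores reaching the threshold as a surge (example A02)."""
    return score >= threshold

score = compute_consumption_score(num_events=250, relevancy=0.6)  # 150.0
print(is_surge(score))  # True
```

In practice the score would be computed per domain and per topic from logged events, with the threshold tuned against historical baselines rather than fixed.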
- Example A03 includes the method of examples A01-A02 and/or some other example(s) herein, further comprising identifying changes in the number of events over different time periods; and adjusting the consumption score based on the changes in the number of events.
- Example A04 includes the method of examples A01-A03 and/or some other example(s) herein, further comprising identifying a number of users generating the events for the different time periods; identifying changes in the number of users over the different time periods; and adjusting the consumption score based on the changes in the number of users.
- Example A05 includes the method of examples A01-A04 and/or some other example(s) herein, further comprising: comparing the relevancy with an overall relevancy for content accessed from multiple different domains; assigning a low initial value to the consumption score when the relevancy is below the overall relevancy; assigning a medium initial value to the consumption score when the relevancy is about the same as the overall relevancy; and assigning a high initial value to the consumption score when the relevancy is above the overall relevancy.
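The low/medium/high initialization in example A05 can be sketched as a simple banded comparison. The tolerance band and the specific low/medium/high values below are assumptions chosen for illustration; the example does not specify them.

```python
def initial_consumption_score(relevancy: float, overall_relevancy: float,
                              tolerance: float = 0.05) -> float:
    """Assign an initial consumption score depending on how a domain's
    relevancy compares with the overall relevancy across all domains
    (example A05). Band width and score values are assumed."""
    if relevancy < overall_relevancy - tolerance:
        return 10.0   # below overall relevancy -> low initial value
    if relevancy > overall_relevancy + tolerance:
        return 90.0   # above overall relevancy -> high initial value
    return 50.0       # about the same -> medium initial value
```

The tolerance parameter operationalizes "about the same as," which the example leaves qualitative.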
- Example A06 includes the method of examples A01-A05 and/or some other example(s) herein, further comprising: identifying types of user engagement with the content; and adjusting the relevancy of the content based on the types of user engagement with the content.
- Example A07 includes the method of examples A06 and/or some other example(s) herein, wherein the types of user engagement include page dwell times and types of page scrolling.
- Example A08 includes the method of examples A01-A07 and/or some other example(s) herein, further comprising: calculating a first number of users generating a first number of the events for a first time period; calculating a second number of users generating a second number of the events for a second time period; and decreasing the consumption score for the second time period when the second number of users is less than the first number of users.
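Example A08's user-count comparison across two time periods might look like the following; the 0.8 damping factor is an assumed value, and the example itself only requires that the score decrease.

```python
def adjust_for_user_change(score: float, users_prev: int, users_curr: int) -> float:
    """Decrease the second period's consumption score when fewer unique
    users generated events than in the first period (example A08).
    The damping factor is an assumption for illustration."""
    if users_curr < users_prev:
        return score * 0.8  # assumed damping factor
    return score
```

A production variant might scale the decrease by the magnitude of the user drop instead of applying a fixed factor.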
- Example A09 includes the method of examples A01-A08 and/or some other example(s) herein, further comprising: identifying the events coming from a network address; identifying a company associated with the network address; and associating the company with the consumption score.
- Example A10 includes the method of example A09 and/or some other example(s) herein, further comprising: identifying a location for the company associated with the network address; and associating the location with the consumption score.
- Example A11 includes the method of examples A01-A10 and/or some other example(s) herein, further comprising: identifying a first group of events for a first time period; identifying a first group of content accessed during the first group of events; identifying a first relevancy of the first group of content to the topic; generating a first value for the consumption score for the first time period based on a number of the first group of events and the first relevancy; identifying a second group of events for a second time period after the first time period; identifying a second group of content accessed during the second group of events; identifying a second relevancy of the second group of content to the topic; and generating a second value for the consumption score for the second time period based on a number of the second group of events and the second relevancy.
- Example A12 includes the method of example A11 and/or some other example(s) herein, further comprising: adjusting the second value based on a change between the number of the first group of events and the number of the second group of events.
- Example A13 includes the method of examples A11-A12 and/or some other example(s) herein, further comprising: identifying a third group of events for a third time period after the second time period; identifying a third group of content accessed during the third group of events; identifying a third relevancy of the third group of content to the topic; generating a third value for the consumption score for the third time period based on a number of the third group of events and the third relevancy; and adjusting the third value based on changes between the number of the first group of events, the number of the second group of events, and the number of the third group of events.
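The per-period scoring and trend adjustment described in examples A11-A13 can be sketched as follows. This is a minimal illustration only: the product form of the score, the ratio-based trend adjustment, and the function names are assumptions, not part of the claimed method.

```python
def consumption_score(num_events: int, relevancy: float) -> float:
    """Hypothetical base score: event volume weighted by topic relevancy."""
    return num_events * relevancy

def adjust_for_trend(score: float, prev_events: int, curr_events: int) -> float:
    """Scale a period's score by the change in event volume from the prior period."""
    if prev_events == 0:
        return score
    return score * (curr_events / prev_events)

# First period: 100 events at relevancy 0.6; second period: 150 events at 0.8.
s1 = consumption_score(100, 0.6)
s2 = consumption_score(150, 0.8)
s2_adjusted = adjust_for_trend(s2, prev_events=100, curr_events=150)
```

Example A13 extends the same idea to a third period, adjusting the third value against both earlier periods.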
- Example A14 includes a method comprising: identifying events associated with an entity; identifying content associated with the events; identifying a relevancy of the content to a topic for a time period; identifying a number of the events for the time period; and calculating a consumption score for the time period based on the relevancy of the content and the number of events.
- Example A15 includes the method of example A14 and/or some other example(s) herein, further comprising: identifying a number of users generating the events for the time period; and calculating the consumption score based on the number of users.
- Example A16 includes the method of examples A14-A15 and/or some other example(s) herein, further comprising: identifying the number of events associated with the entity and the topic over a series of time periods; and adjusting the consumption score based on changes in the number of events over the series of time periods.
- Example A17 includes the method of example A16 and/or some other example(s) herein, further comprising: identifying a number of users generating the events over the series of time periods; and adjusting the consumption score based on changes in the number of users over the series of time periods.
- Example A18 includes the method of examples A16-A17 and/or some other example(s) herein, further comprising: identifying a location associated with the entity; calculating a consumption score for the location based on the relevancy of the content and the number of events associated with the location.
- Example A19 includes the method of examples A16-A18 and/or some other example(s) herein, further comprising: identifying a surge in the consumption score when the consumption score reaches a threshold value.
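The threshold test in example A19 amounts to flagging any period whose consumption score reaches a configured value. A sketch, with the threshold chosen arbitrarily for illustration:

```python
def detect_surges(scores: list, threshold: float) -> list:
    """Return indices of time periods whose consumption score reaches the threshold."""
    return [i for i, score in enumerate(scores) if score >= threshold]

weekly_scores = [42.0, 55.0, 71.0, 110.0, 95.0]
surge_periods = detect_surges(weekly_scores, threshold=100.0)
```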
- Example A20 includes the method of example A19 and/or some other example(s) herein, further comprising: mapping the surge to contacts associated with the entity.
- Example A21 includes the method of example A20 and/or some other example(s) herein, further comprising: sending a notification of the surge to a publisher.
- Example A22 includes the method of example A21 and/or some other example(s) herein, wherein the notification of the surge is configured to trigger the publisher to send information associated with the topic to contacts associated with the entity.
- Example A23 includes the method of example A22 and/or some other example(s) herein, wherein the notification of the surge is configured to trigger the publisher to send the information to the contacts having a job title associated with the topic.
- Example A24 includes a method comprising: identifying or determining events associated with accessing content; identifying or determining types of user engagement with the content; calculating an engagement score for the content based on the types of user engagement; calculating a relevancy of the content to a topic; and adjusting the relevancy of the content based on the engagement score.
- Example A25 includes the method of example A24 and/or some other example(s) herein, further comprising: calculating the engagement score based on user content dwell times.
- Example A26 includes the method of example A25 and/or some other example(s) herein, further comprising: calculating the engagement score based on content scroll depths.
- Example A27 includes the method of examples A24-A26 and/or some other example(s) herein, further comprising: calculating the engagement score based on content scroll speeds.
- Example A28 includes the method of examples A24-A27 and/or some other example(s) herein, further comprising: calculating a consumption score based on the relevancy.
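Examples A24-A28 combine dwell time, scroll depth, and scroll speed into an engagement score that then adjusts relevancy. The weighting below is a hypothetical sketch; the document does not specify a formula.

```python
def engagement_score(dwell_secs: float, scroll_depth: float, scroll_speed: float) -> float:
    """Hypothetical score: long dwell plus deep, slow scrolling suggests real
    reading; fast scrolling suggests skimming. scroll_depth is a 0..1 fraction
    of the page reached; scroll_speed is in pages per second."""
    dwell_term = min(dwell_secs / 60.0, 1.0)    # saturate at one minute
    speed_penalty = 1.0 / (1.0 + scroll_speed)  # faster scrolling lowers the score
    return dwell_term * scroll_depth * speed_penalty

def adjusted_relevancy(relevancy: float, engagement: float) -> float:
    """Per example A24: scale a topic relevancy up or down by engagement."""
    return relevancy * (0.5 + engagement)       # engagement of 0.5 leaves relevancy unchanged
```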
- Example B01 includes a method of operating a content consumption monitor (CCM), the method comprising: obtaining data packets including information about session events from computer devices that access information objects, the information objects including tags that monitor and capture the session events; determining a domain name for an organization associated with at least one network address indicated by at least one of the session events; identifying or determining, for each of multiple time periods, a group of the session events that include the domain name of the organization or one or more network addresses associated with the organization; identifying or determining, for each of the multiple time periods, a set of information objects accessed by members of the organization as indicated by the group of the session events; identifying or determining a plurality of topics based on one or more words in the set of information objects; determining a relevancy score of the set of information objects to each topic of the plurality of topics; determining a number of unique users of the organization generating the session events for each of the time periods, the unique users being users of one or more of the computer devices; generating consumption scores for the organization for each of the multiple time periods based on the number of session events generated from the organization, the number of unique users of the organization, and the average relevancy score for each topic; and determining a surge in the consumption scores for the organization based on changes in the consumption scores for the organization over the multiple time periods.
- Example B01.1 includes the method of example B01 and/or some other examples herein, wherein the tags include program code that causes the computer devices to monitor and capture the session events generated by the computer devices and send the data packets including the captured session events to the CCM.
- Example B01.2 includes the method of examples B01-B01.1 and/or some other examples herein, wherein each of the session events at least identifies an accessed information object, an event type identifier that identifies an action or activity associated with the accessed information object, and a network address from which the accessed information object was accessed.
- Example B01.3 includes the method of examples B01-B01.2 and/or some other examples herein, wherein the information objects are webpages or content embedded in webpages.
- Example B01.4 includes the method of examples B01-B01.3 and/or some other examples herein, wherein determining the domain name comprises: identifying or determining the domain name from a database.
- Example B01.5 includes the method of examples B01-B01.4 and/or some other examples herein, wherein the relevancy score is an average relevancy score, and the average relevancy score is an average of a plurality of relevancy scores of the set of information objects to each topic, each of the plurality of relevancy scores being calculated based in part on a number of words in the set of information objects that are associated with each topic and event types performed on the set of information objects as indicated by the group of the session events.
- Example B02 includes the method of examples B01-B01.5 and/or some other examples herein, wherein determining the surge comprises: identifying values of the consumption scores reaching a threshold as the surge.
- Example B03 includes the method of examples B01-B02 and/or some other examples herein, further comprising: identifying or determining changes in the number of session events over different time periods, and adjusting the consumption score based on the changes in the number of session events; and/or identifying or determining a number of users generating the session events for the different time periods, identifying changes in the number of users over the different time periods, and adjusting the consumption score based on the changes in the number of users.
- Example B04 includes the method of examples B01-B03 and/or some other examples herein, further comprising: comparing the average relevancy score for the organization with an overall relevancy score for information objects accessed from multiple computer devices not associated with the organization; assigning an initial value of “low” to the consumption score for the organization when the average relevancy score is below the overall relevancy score; assigning an initial value of “medium” to the consumption score for the organization when the average relevancy score is about the same value as the overall relevancy score; and assigning an initial value of “high” to the consumption score when the average relevancy score is above the overall relevancy score.
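The low/medium/high initialization in example B04 can be sketched as a simple comparison; the tolerance band defining "about the same value" is an assumed parameter, not specified by the example:

```python
def initial_consumption_level(org_relevancy: float, overall_relevancy: float,
                              tolerance: float = 0.05) -> str:
    """Compare an organization's average relevancy score against the overall
    relevancy score across all organizations (tolerance band is an assumption)."""
    if org_relevancy < overall_relevancy - tolerance:
        return "low"
    if org_relevancy > overall_relevancy + tolerance:
        return "high"
    return "medium"
```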
- Example B05 includes the method of examples B01-B04 and/or some other examples herein, further comprising: identifying or determining types of user engagement with the information objects based on information included in the session events; and adjusting the relevancy of the information objects based on the types of user engagement.
- Example B06 includes the method of example B05 and/or some other examples herein, wherein the types of user engagement include one or more of information object dwell times, scrolling data (e.g., type of scrolling, scroll depth, scroll velocity, scroll bar scrolls), mouse data (e.g., mouse scrolls, mouse clicks, mouse movements, and/or the like), keyboard data (e.g., key presses, and/or the like), touch data (e.g., touch gestures, and/or the like), eye tracking data (e.g., gaze locations, gaze times, gaze regions of interest, eye movement frequency, speed, orientations, and/or the like), variance in content consumption over a period of time, and tab selections that switch between information objects.
- Example B07 includes the method of examples B01-B06 and/or some other examples herein, further comprising: calculating a first number of users generating a first number of the session events for a first time period; calculating a second number of users generating a second number of the session events for a second time period; and decreasing the consumption score for the second time period when the second number of users is less than the first number of users.
- Example B08 includes the method of examples B01-B07 and/or some other examples herein, further comprising: identifying or determining a first group of the session events for a first time period coming from a same network address; calculating a first consumption score based on an average relevancy score for the first group of the session events with each topic; identifying or determining a second group of the session events for a second time period coming from the same network address; calculating a second consumption score based on an average relevancy score for the second group of the session events with each topic; and calculating a surge score for the organization based on a change between the first consumption score and the second consumption score.
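Example B08's surge score compares consumption scores for the same network address across two periods. A relative-change form is one plausible reading; the ratio form below is an assumption, not stated in the example.

```python
def surge_score(first_score: float, second_score: float) -> float:
    """Relative change between two consecutive consumption scores."""
    if first_score == 0.0:
        return 0.0
    return (second_score - first_score) / first_score
```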
- Example B09 includes the method of examples B01-B08 and/or some other examples herein, further comprising: identifying or determining the organization based on one or more network addresses associated with the organization; identifying or determining a location for the organization based on the one or more network addresses; and associating the identified location with the consumption score.
- Example B10 includes the method of examples B01-B09 and/or some other examples herein, wherein the group of the session events includes a first group of the session events and a second group of the session events, and the method further comprises: identifying or determining the first group of the session events for a first time period; identifying or determining a first group of information objects accessed during the first group of the session events; identifying or determining a first relevancy score of the first group of information objects to each topic; generating a first value for the consumption score for the first time period based on a number of the first group of session events and the first relevancy score; identifying or determining the second group of the session events for a second time period after the first time period; identifying or determining a second group of information objects accessed during the second group of the session events; identifying or determining a second relevancy score of the second group of information objects to each topic; identifying or determining a rate that the second relevancy score increases or decreases from the first relevancy score; and generating a second value for the consumption score for the second time period based on a number of the second group of the session events and the rate that the second relevancy score increases or decreases from the first relevancy score.
- Example B11 includes the method of example B10 and/or some other examples herein, further comprising: adjusting the second value based on a change between the number of the first group of the session events and the number of the second group of the session events.
- Example B12 includes the method of example B11 and/or some other examples herein, further comprising: identifying or determining a third group of session events for a third time period after the second time period; identifying or determining a third group of information objects accessed during the third group of session events; identifying or determining a third relevancy score of the third group of information objects to each topic; generating a third value for the consumption score for the third time period based on a number of the third group of session events and the third relevancy score; and adjusting the third value based on changes between the number of the first group of session events, the number of the second group of session events, and the number of the third group of session events.
- Example B13 includes a method of operating a content consumption monitor (CCM), the method comprising: obtaining data packets including session events from computer devices that access information objects, the information objects including webpages, content included in webpages, or applications, the information objects including tags, the tags including executable instructions which cause the computer devices to monitor and capture the session events generated by the computer devices and send the packets including the captured session events to the CCM, each of the session events identifying an accessed information object, an event type identifier that identifies an action or activity associated with the accessed information object, and a network address from which the information object was accessed; accessing a database to identify a domain name for an organization associated with at least one network address indicated by at least one of the session events; identifying or determining a group of the session events for each time period of a plurality of time periods that include the domain name of the organization or one or more network addresses associated with the organization; identifying or determining a set of information objects accessed by members of the organization indicated by the group of the session events for each time period; identifying or determining an average relevancy score of the set of information objects to each topic of a plurality of topics for the organization for each time period, the average relevancy score being based on a plurality of relevancy scores calculated for the set of information objects to each topic, each of the plurality of relevancy scores being based in part on a number of words in the set of information objects that are associated with each topic and event types indicated by the group of the session events as being performed on the set of information objects; identifying or determining a number of unique users of the
organization generating the session events for each time period, the unique users being users of one or more of the computer devices; identifying or determining a consumption score for the organization for each time period based on the number of session events generated by the organization, the number of unique users of the organization, and the average relevancy score of the set of information objects to each topic for the organization for each time period; identifying or determining a surge in consumption scores for the organization based on changes in the consumption scores for the organization over the plurality of time periods; and providing, to a display device, data indicating the consumption scores and the surge in consumption scores for display in a user interface displayed by the display device.
- Example B14 includes the method of example B13 and/or some other examples herein, further comprising: identifying or determining a number of users generating the session events for the time period; and calculating the consumption score further based on the number of users.
- Example B15 includes the method of examples B13-B14 and/or some other examples herein, further comprising: identifying or determining a number of users generating the session events over the plurality of time periods; and adjusting the consumption score based on changes in the number of users over the plurality of time periods.
- Example B16 includes the method of examples B13-B15 and/or some other examples herein, further comprising: identifying or determining a location associated with the organization; and calculating a consumption score for the location based on the plurality of relevancy scores of the information objects and a number of session events associated with the location.
- Example B17 includes the method of examples B13-B16 and/or some other examples herein, further comprising: identifying or determining a surge in the consumption score when the consumption score reaches a threshold value.
- Example B18 includes the method of example B17 and/or some other examples herein, further comprising: mapping the surge to contacts associated with the organization.
- Example B19 includes the method of example B18 and/or some other examples herein, further comprising: sending a notification of the surge to a service provider.
- Example B20 includes the method of example B19 and/or some other examples herein, wherein the notification of the surge is configured to trigger the service provider to send information associated with at least one topic of the plurality of topics to contacts associated with the organization.
- Example B21 includes the method of example B20 and/or some other examples herein, wherein the notification of the surge is configured to trigger the service provider to send the information to the contacts having a job title associated with at least one topic of the plurality of topics.
- Example C01 includes the method of examples A01-A28, B01-B21, and/or some other example(s) herein, wherein the events comprise one or more of database session events, work unit events, client or browser session events, server session events, remote session events (e.g., remote desktop session events, and/or the like), network session events, web session events, HTTP session events, telnet remote login session events, SIP session events, Transmission Control Protocol (TCP) session events, User Datagram Protocol (UDP) session events, cellular network events, and/or other events of other session types such as those discussed herein.
- Example C02 includes the method of examples A01-A28, B01-B21, C01, and/or some other example(s) herein, wherein the network addresses are internet protocol (IP) addresses, telephone numbers in a public switched telephone network, cellular network addresses, internet packet exchange (IPX) addresses, X.25 addresses, X.21 addresses, Transmission Control Protocol (TCP) or User Datagram Protocol (UDP) port numbers, media access control (MAC) addresses, Electronic Product Codes (EPCs), Bluetooth hardware device addresses, Universal Resource Locators (URLs), and/or email addresses.
- Example C03 includes the method of examples A01-A28, B01-B21, C01-C02, and/or some other example(s) herein, wherein any one or more of examples A01-A28 are combinable with any one or more of examples B01-B21 and/or some other example(s) herein.
- Example D01 includes a method for analyzing events associated with a network address, the method comprising: receiving events associated with users accessing content from the network address; generating features from the events, the features identifying how the users access the content at a physical location associated with the network address; and identifying a type of entity associated with the network address based on the features.
- Example D02 includes the method of example D01 and/or some other example(s) herein, further comprising: identifying the type of entity associated with the network address based on when the users access the content.
- Example D03 includes the method of examples D01-D02 and/or some other example(s) herein, further comprising: identifying a time range associated with working hours; identifying a ratio of the events generated during the working hours vs. non-working hours; and identifying the type of entity associated with the network address based on the ratio.
- Example D04 includes the method of example D03 and/or some other example(s) herein, further comprising: identifying the type of entity as a private org location when the ratio of events during business hours vs. non-business hours is above a threshold; and identifying the type of entity as a public org location or a non-org location when the ratio of events during business hours vs. non-business hours is below the threshold.
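Examples D03-D04 classify a network address by the ratio of events inside versus outside business hours. A sketch, assuming 9:00-17:00 working hours and an arbitrary threshold (both values are assumptions, not stated in the examples):

```python
from datetime import datetime

def business_hours_ratio(timestamps: list, start_hour: int = 9, end_hour: int = 17) -> float:
    """Ratio of events during working hours to events outside them."""
    working = sum(1 for t in timestamps if start_hour <= t.hour < end_hour)
    non_working = len(timestamps) - working
    return working / non_working if non_working else float("inf")

def classify_location(ratio: float, threshold: float = 3.0) -> str:
    """High working-hours ratio suggests a private org location; otherwise a
    public org or non-org location (threshold value is an assumption)."""
    return "private org" if ratio > threshold else "public org or non-org"

# Eight events during working hours, two in the evening.
events = [datetime(2023, 1, 2, h) for h in (9, 9, 10, 11, 13, 14, 15, 16, 20, 22)]
location_type = classify_location(business_hours_ratio(events))
```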
- Example D05 includes the method of examples D01-D04 and/or some other example(s) herein, further comprising: identifying the type of entity associated with the network address based on a number of the events generated during a work week vs. a number of the events generated during a weekend.
- Example D06 includes the method of examples D01-D05 and/or some other example(s) herein, further comprising: identifying the type of entity associated with the network address based on one of the features identifying durations of time the users are accessing the content at the network address.
- Example D07 includes the method of examples D01-D06 and/or some other example(s) herein, further comprising: identifying the type of entity associated with the network address based on one of the features identifying types of computing devices used for accessing the content at the network address.
- Example D08 includes the method of example D07 and/or some other example(s) herein, further comprising: identifying the type of entity as a private org location when a ratio of computing devices including laptop and personal computers (PCs) compared with smart phones is above a threshold; and identifying the type of entity as a public org location when the ratio is below the threshold.
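The device-mix heuristic in example D08 (echoed by example E08) can be sketched as follows; the threshold value is an assumed parameter:

```python
def classify_by_devices(num_laptops_pcs: int, num_smartphones: int,
                        threshold: float = 2.0) -> str:
    """Mostly laptops/PCs at an address suggests a private org location;
    mostly smartphones suggests a public org location (threshold assumed)."""
    if num_smartphones == 0:
        return "private org"
    ratio = num_laptops_pcs / num_smartphones
    return "private org" if ratio > threshold else "public org"
```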
- Example D09 includes the method of examples D01-D08 and/or some other example(s) herein, further comprising: identifying a domain for the network address when the type of entity associated with the network address is identified as a private org location; and generating a consumption score for the domain when the type of entity associated with the network address is identified as a private org location.
- Example D09.5 includes the method of examples D01-D09 and/or some other example(s) herein, wherein any one or more of examples D01-D09 are combinable with any one or more of the following examples and/or some other example(s) herein.
- Example D10 includes a method comprising: receiving events identifying how users access content; identifying network addresses for the events, the network addresses associated with locations where the users access the content; using time stamps in the events to determine when users access the content at the locations associated with the network addresses; and identifying types of establishments associated with the network addresses based on when the users access the content at the locations associated with the network addresses.
- Example D11 includes the method of example D10 and/or some other example(s) herein, further comprising: identifying network addresses where a particular percentage of the events occur during business hours; and identifying the types of establishments for the identified network addresses as private org locations.
- Example D12 includes the method of examples D10-D11 and/or some other example(s) herein, further comprising: identifying network addresses where the users access the content over a particular duration of time; and identifying the types of establishments for the identified network addresses as private org locations.
- Example D13 includes the method of examples D10-D12 and/or some other example(s) herein, further comprising: identifying from the events types of computing devices used for accessing the content; identifying network addresses where a particular percentage of the computing devices are smart devices; and identifying the types of establishments for the identified network addresses as public org locations.
- Example D14 includes the method of examples D10-D13 and/or some other example(s) herein, further comprising: identifying the network addresses where a number of the events during a specified time period is below a threshold; and identifying the types of establishments for the identified network addresses as non-org locations.
- Example D15 includes the method of examples D10-D14 and/or some other example(s) herein, further comprising: identifying the network addresses associated with private org locations; identifying companies associated with the identified network addresses; and generating consumption scores from the events with the identified network addresses associated with the companies.
- Example D15.5 includes the method of examples D10-D15 and/or some other example(s) herein, wherein any one or more of examples D10-D15 are combinable with any one or more of examples D01-D09 and/or some other example(s) herein.
- Example D16 includes a method, comprising: identifying events associated with an entity; identifying content associated with the events; identifying network addresses associated with the events; identifying types of locations associated with the network addresses based on how users at the network addresses access the content; filtering the network addresses based on the types of locations associated with the network addresses; and calculating consumption scores for the filtered network addresses.
- Example D17 includes the method of example D16 and/or some other example(s), further comprising: identifying some of the network addresses as private org locations based on when the users access the content at the network addresses.
- Example D18 includes the method of examples D16-D17 and/or some other example(s), further comprising: identifying the number of events associated with the network addresses over a series of time periods; and adjusting the consumption scores based on changes in the number of events over the series of time periods.
- Example D19 includes the method of example D18 and/or some other example(s), further comprising: identifying a number of users generating the events over the series of time periods; and adjusting the consumption scores based on changes in the number of users over the series of time periods.
- Example D20 includes the method of examples D16-D19 and/or some other example(s), further comprising: identifying service providers; identifying some of the content provided by the service providers; generating relevancy values of the content to a topic; weighting the relevancy values based on the content being provided by the service providers; and generating the consumption scores for the filtered network addresses based on the weighted relevancy values.
- Example D21 includes the method of examples D16-D20 and/or some other example(s) herein, wherein any one or more of examples D16-D20 are combinable with any one or more of examples D01-D09, D10-D15.5, and/or some other example(s) herein.
- Example E01 includes a method for analyzing network events associated with a network address, the method comprising: obtaining network events including information about users accessing information objects from the network address; generating a set of machine learning (ML) features from the information about the users accessing the information objects; and determining an organization (org) type associated with the network address based on the set of ML features.
- Example E02 includes the method of example E01 and/or some other example(s) herein, further comprising: identifying the org type associated with the network address based on when and how the users access the information objects.
- Example E03 includes the method of examples E01-E02 and/or some other example(s) herein, wherein generating the set of ML features comprises: identifying a time range associated with operation of each org type of a set of org types; and determining, for each org type, a ratio of a number of the network events generated within the identified time range to a number of the network events generated outside of the identified time range, wherein the ratio is generated to be at least one ML feature of the set of ML features.
- Example E04 includes the method of example E03 and/or some other example(s) herein, further comprising: determining the org type to be a private org location when the ratio is above a threshold; and determining the org type to be a public org location or a non-org location when the ratio is below the threshold.
- Example E05 includes the method of examples E01-E04 and/or some other example(s) herein, wherein generating the set of ML features comprises: identifying one or more days associated with operation of each org type of a set of org types; and determining, for each org type, a ratio of a number of the network events generated on the one or more days to a number of the network events generated on other days different than the one or more days, wherein the ratio is generated to be at least one ML feature of the set of ML features.
- Example E06 includes the method of examples E01-E05 and/or some other example(s) herein, wherein at least one ML feature of the set of ML features indicates a duration of the user accesses of the information objects, and the method further comprises: identifying the org type associated with the network address based on the at least one ML feature indicating the duration of the user accesses.
- Example E07 includes the method of examples E01-E06 and/or some other example(s) herein, wherein at least one ML feature of the set of ML features indicates device types used to access the information objects from the network address, and the method further comprises: identifying the org type based on the at least one ML feature indicating the device types.
- Example E08 includes the method of example E07 and/or some other example(s) herein, further comprising: determining a number of laptop computers used to access the information objects from the network address, a number of desktop computers used to access the information objects from the network address, and a number of mobile devices used to access the information objects from the network address; determining the org type to be a private org location when a ratio of the number of laptop computers and the number of desktop computers to the number of mobile devices is at or above a threshold; and determining the org type to be a public org location when the ratio is below the threshold.
- Example E09 includes the method of examples E01-E08 and/or some other example(s) herein, further comprising: identifying a domain for the network address when the org type associated with the network address is identified as a private org location; and generating a consumption score for the domain when the org type associated with the network address is identified as a private org location.
- Example E10 includes a method for operating a network address classification system (NACS), the method comprising: operating a feature generator to generate a set of machine learning (ML) features based on aspects of user accesses to information objects indicated by obtained network session events, the network session events indicating network addresses associated with locations from which the information objects are accessed by the users; and operating an entity classifier to determine organization (org) types associated with the locations from which the information objects are accessed based on the ML features.
- Example E11 includes the method of example E10 and/or some other example(s) herein, wherein the network session events include timestamps indicating a time at which the users accessed the information objects, and the method comprises: operating the feature generator to generate the set of ML features to include one or more time-based features, the one or more time-based features indicating a time of day when individual users accessed the information objects at the respective locations based on the timestamps and percentages of the events that occur at different time periods at the respective locations; and operating the entity classifier to determine the org types associated with the network addresses based on the one or more time-based features.
- Example E12 includes the method of example E11 and/or some other example(s) herein, further comprising: operating the entity classifier to determine the org types associated with the network addresses to be private org locations when the one or more time-based features indicate that some or all of the information objects were accessed outside of a specified time period.
- Example E13 includes the method of examples E10-E12 and/or some other example(s) herein, further comprising: operating the feature generator to generate the set of ML features to include one or more duration-based features, the one or more duration-based features indicating an average amount of time the individual users access the information objects at the respective locations; and operating the entity classifier to determine the org types associated with the network addresses based on the one or more duration-based features.
- Example E14 includes the method of example E13 and/or some other example(s) herein, further comprising: operating the entity classifier to determine the org types associated with the network addresses to be private org locations when the one or more duration-based features indicate that some or all of the information objects were accessed for at least a threshold amount of time.
- Example E15 includes the method of examples E10-E14 and/or some other example(s) herein, further comprising: operating the feature generator to generate the set of ML features to include one or more event-based features, the one or more event-based features indicating an average amount of events generated by individual users at the respective locations; and operating the entity classifier to determine the org types associated with the network addresses based on the one or more event-based features.
- Example E16 includes the method of examples E10-E15 and/or some other example(s) herein, further comprising: operating the feature generator to generate the set of ML features to include one or more device-based features, the one or more device-based features indicating types of computing devices used for accessing the information objects at the respective locations; and operating the entity classifier to determine the org types associated with the network addresses based on the one or more device-based features.
- Example E17 includes the method of example E16 and/or some other example(s) herein, further comprising: operating the entity classifier to determine the org types associated with the network addresses to be private org locations when the one or more device-based features indicate that a majority of computing devices that accessed the information objects are laptop computers or desktop computers; and operating the entity classifier to determine the org types associated with the network addresses to be public org locations when the one or more device-based features indicate that a majority of computing devices that accessed the information objects are tablet computers or mobile devices.
- Example E18 includes the method of examples E10-E17 and/or some other example(s) herein, further comprising: operating the entity classifier to identify some of the network addresses as being associated with private org locations or not private org locations.
- Example E19 includes the method of example E18 and/or some other example(s) herein, wherein operating the entity classifier to determine the org types comprises: operating the entity classifier to use a logistic regression model to determine the org types.
- Example E20 includes the method of examples E10-E19 and/or some other example(s) herein, further comprising: operating a content consumption monitor to: filter the network addresses based on the types of locations associated with the network addresses; calculate consumption scores for the filtered network addresses; identify the number of events associated with the network addresses over a series of time periods; adjust the consumption scores based on changes in the number of events over the series of time periods; and determine surge scores based on an increase in respective consumption scores within a predefined period of time.
- Example E21 includes the method of examples E10-E20 and/or some other example(s) herein, wherein any one or more of examples E10-E20 are combinable with any one or more of examples D01-D09, D10-D15.5, D16-D21, E01-E09, and/or some other example(s) herein.
- Example E22 includes the method of examples D01-D21, E01-E21, and/or some other example(s) herein, wherein the network addresses are internet protocol (IP) addresses, telephone numbers in a public switched telephone network (PSTN), cellular network addresses, internetwork packet exchange (IPX) addresses, X.25 addresses, X.21 addresses, Transmission Control Protocol (TCP) or User Datagram Protocol (UDP) port numbers, media access control (MAC) addresses, Electronic Product Codes (EPCs), Bluetooth hardware device addresses, Universal Resource Locators (URLs), and/or email addresses.
- Example F01 includes a method comprising: identifying a first set of events generated by an entity from a hostname resource; identifying a second set of events generated by the entity from the hostname resource and from other third party websites; and generating a web resource interest score indicating an interest level of the entity in the hostname resource based on a comparison of the first set of events with the second set of events.
- Example F02 includes the method of example F01 and/or some other example(s) herein, further comprising: generating different web resource interest ratios based on a comparison of the first set of events with the second set of events; and summing up the web resource interest ratios to generate the web resource interest score.
- Example F03 includes the method of examples F01-F02 and/or some other example(s) herein, further comprising: generating an event count ratio based on a number of the events generated by the entity from the hostname resource compared to the number of events generated by the entity from the hostname resource and the other websites; and generating the web resource interest score based on the event count ratio.
- Example F04 includes the method of examples F01-F03 and/or some other example(s) herein, further comprising: generating a unique user ratio based on a number of different users for the entity generating events from the hostname resource compared with a number of different users for the entity generating events from the hostname resource and the other websites; and generating the web resource interest score based on the unique user ratio.
- Example F05 includes the method of examples F01-F04 and/or some other example(s) herein, further comprising: generating an engagement score ratio based on engagement of the entity with content from the hostname resource compared with engagement of the entity with content from the hostname resource and the other websites; and generating the web resource interest score based on the engagement score ratio.
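Examples F02-F05 together sum an event-count ratio, a unique-user ratio, and an engagement-score ratio into the web resource interest score. A Python sketch follows; equal weighting of the three ratios is an assumption.

```python
def web_resource_interest_score(host_events, all_events,
                                host_users, all_users,
                                host_engagement, all_engagement):
    """Examples F02-F05: sum the event-count, unique-user, and
    engagement ratios into one web resource interest score.

    The 'host_*' values cover events from the hostname resource; the
    'all_*' values cover the hostname resource plus other websites.
    """
    def ratio(part, whole):
        return part / whole if whole else 0.0
    return (ratio(host_events, all_events)
            + ratio(host_users, all_users)
            + ratio(host_engagement, all_engagement))
```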
- Example F06 includes the method of examples F01-F05 and/or some other example(s) herein, further comprising: generating a first series of web resource interest scores from the events generated over a series of baseline time periods; generating a baseline distribution from the first series of web resource interest scores; comparing a second series of web resource interest scores generated over a subsequent series of current time periods with the baseline distribution; and identifying an entity surge when any of the second series of web resource interest scores are outside of a threshold range of the baseline distribution.
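Example F06's surge test can be sketched as a simple outlier check against the baseline distribution. Modeling the "threshold range" as mean plus k standard deviations, with k=2, is an assumption; the source does not define the range.

```python
import statistics

def is_surge(baseline_scores, current_scores, k=2.0):
    """Example F06: build a baseline distribution from past web resource
    interest scores and flag an entity surge when any current score
    falls more than k standard deviations above the baseline mean."""
    mean = statistics.mean(baseline_scores)
    std = statistics.pstdev(baseline_scores)
    return any(s > mean + k * std for s in current_scores)
```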
- Example F07 includes the method of examples F01-F06 and/or some other example(s) herein, further comprising: receiving a resource cluster identifying multiple hostname resources; generating web resource interest scores for each of the hostname resources in the resource cluster; and generating a resource cluster interest score based on the web resource interest scores for the resource cluster.
- Example F08 includes the method of example F07 and/or some other example(s) herein, further comprising: receiving a resource cluster weighting vector including weighting values for each of the hostname resources; and applying the resource cluster weighting vector to the web resource interest scores associated with the same hostname resources to generate the resource cluster interest score.
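Example F08's application of a resource cluster weighting vector can be sketched as a weighted combination of per-hostname interest scores. A weighted sum is assumed here (example G16 instead takes the magnitude of the entrywise product).

```python
def resource_cluster_interest(interest_scores, weights):
    """Example F08: apply a per-hostname weighting vector to the web
    resource interest scores and combine them into a cluster score."""
    return sum(w * s for w, s in zip(weights, interest_scores))
```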
- Example F09 includes the method of examples F07-F08 and/or some other example(s) herein, further comprising: receiving a topic cluster including multiple topics; generating consumption scores for each of the topics; generating a topic cluster interest score based on the consumption scores for each of the topics; and combining the topic cluster interest score with the resource cluster interest score to generate a weighted intent score.
- Example F10 includes the method of example F09 and/or some other example(s) herein, further comprising: generating the consumption scores based on events generated by the entity from the hostname resource and events generated by the entity from other third party websites.
- Example F11 includes the method of examples F09-F10 and/or some other example(s) herein, further comprising: receiving a topic cluster weighting vector including weighting values for each of the topics; and applying the topic cluster weighting vector to the consumption scores associated with the same topics to generate the topic cluster interest score.
- Example F12 includes the method of examples F09-F11 and/or some other example(s) herein, wherein the weighted intent score comprises:
-
- wherein STCI is the topic cluster interest score, SWCI is the resource cluster interest score, αTCI is a topic cluster interest threshold, and αWCI is a resource cluster interest threshold.
- Example F13 includes a method comprising: identifying events generated by an entity from one or more hostname resources and from other third party websites; generating a resource cluster interest score based on the events indicating an interest level of the entity in the one or more hostname resources; identifying a topic cluster including multiple topics; generating a topic cluster interest score based on the events indicating an interest level of the entity in the topics; and generating a weighted intent score based on the resource cluster interest score and the topic cluster interest score.
- Example F14 includes the method of example F13 and/or some other example(s) herein, further comprising: generating web resource interest ratios based on the events generated by the entity while accessing the hostname resources compared with the events generated by the entity while accessing the other third party websites; and combining the web resource interest ratios for the hostname resources to generate the resource cluster interest score.
- Example F15 includes the method of example F14 and/or some other example(s) herein, further comprising: generating event count ratios between a number of the events generated by the entity from the hostname resources compared with the number of events generated by the entity from the hostname resources and the other third party websites; and generating the resource cluster interest score based on the event count ratios.
- Example F16 includes the method of example F15 and/or some other example(s) herein, further comprising: generating unique user ratios between a number of different users for the entity generating events from the hostname resources and a number of different users for the entity generating events from the hostname resources and the other third party websites; and generating the resource cluster interest score based on the event count ratios and the unique user ratios.
- Example F17 includes the method of examples F15-F16 and/or some other example(s) herein, further comprising: generating engagement score ratios between engagement scores of the entity with content on the hostname resources and engagement scores of the entity with the hostname resources and the other third party websites; and generating the resource cluster interest score based on the event count ratios, the unique user ratios, and the engagement score ratios.
- Example F18 includes the method of examples F13-F17 and/or some other example(s) herein, further comprising: generating a first series of resource cluster interest scores from the events generated over a series of baseline time periods; generating a baseline distribution from the first series of resource cluster interest scores; comparing a second series of resource cluster interest scores generated over a subsequent series of current time periods with the baseline distribution; and identifying an entity surge when any of the second series of resource cluster interest scores are outside of a threshold range of the baseline distribution.
- Example F19 includes the method of examples F13-F18 and/or some other example(s) herein, further comprising: generating consumption scores for the entity for each of the topics; and generating the topic cluster interest score based on the consumption scores for each of the topics.
- Example F20 includes the method of example F19 and/or some other example(s) herein, further comprising: identifying content associated with the events accessed by the entity; identifying a relevancy of the content to the topics; identifying a number of the events generated by the entity; and generating the consumption scores for the entity based on the number of the events and the relevancy of the content to the topics.
- Example F21 includes the method of examples F19-F20 and/or some other example(s) herein, further comprising: receiving a topic cluster weighting vector including weighting values for each of the topics; and applying the topic cluster weighting vector to the consumption scores associated with the same topics to generate the topic cluster interest score.
- Example F22 includes the method of examples F13-F21 and/or some other example(s) herein, wherein the weighted intent score comprises:
-
- wherein STCI is the topic cluster interest score, SWCI is the resource cluster interest score, αTCI is a topic cluster interest threshold, and αWCI is a resource cluster interest threshold.
- Example F23 includes the method of examples F13-F22 and/or some other example(s) herein, further comprising: receiving raw events that include universal resource locators (URLs) and network addresses; converting the URLs into hostnames; converting the network addresses into entities; identifying the events that include the same hostname and entity; and generating the resource cluster interest score based on the events generated by the same entity from the same hostname resource compared with the events generated by the same entity from the hostname resource and the other third party websites.
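The first steps of example F23 (converting URLs into hostnames and network addresses into entities) can be sketched in Python. The `address_to_entity` lookup table is hypothetical; in practice entity resolution would be a separate prediction step.

```python
from urllib.parse import urlparse

def transform_events(raw_events, address_to_entity):
    """Example F23's first steps: convert each raw (URL, network address)
    event into a (hostname, entity) event."""
    out = []
    for url, addr in raw_events:
        out.append((urlparse(url).hostname,
                    address_to_entity.get(addr, "unknown")))
    return out
```

Events sharing the same (hostname, entity) pair can then be grouped to compare an entity's events on one hostname against its events elsewhere.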
- Example G01 includes a method comprising: obtaining a first set of network events generated by client devices, each network event of the first set of network events including a first network address of an information object and a second network address of a device that accessed the information object; generating a second set of network events by replacement of the first network address with a hostname resource and replacement of the second network address with a predicted entity; generating one or more machine learning (ML) features from the second set of network events; and generating a resource interest score based on the one or more ML features, the resource interest score indicating an interest level of the entity in the hostname resource.
- Example G02 includes the method of example G01 and/or some other example(s) herein, further comprising: generating the one or more ML features based on a comparison of the first set of events with the second set of events; and determining a web resource interest score based on a combination of the one or more ML features.
- Example G03 includes the method of example G02 and/or some other example(s) herein, wherein the one or more ML features include an event count feature based on a number of the network events generated by the entity indicating access to the hostname resource compared to a total number of network events generated by the entity.
- Example G04 includes the method of examples G02-G03 and/or some other example(s) herein, wherein the one or more ML features include a unique user feature based on a number of unique users associated with the entity that generate the first set of network events indicating the hostname resource compared with a total number of different users associated with the entity generating the first set of network events.
- Example G05 includes the method of examples G02-G04 and/or some other example(s) herein, wherein the one or more ML features include an engagement score feature based on engagement metrics of the entity with information objects associated with the hostname resource compared with engagement metrics of the entity with all information objects indicated by the first set of network events.
- Example G06 includes the method of examples G01-G05 and/or some other example(s) herein, further comprising: generating a first series of web resource interest scores from a first set of ML features of the one or more ML features generated over a series of baseline time periods; generating a baseline distribution from the first series of web resource interest scores; generating a second series of web resource interest scores from a second set of ML features of the one or more ML features generated over a subsequent series of current time periods; and identifying an entity surge when any of the second series of web resource interest scores are outside of a threshold range of the baseline distribution.
- Example G07 includes the method of examples G01-G06 and/or some other example(s) herein, further comprising: determining a resource cluster, the resource cluster including a plurality of hostname resources; generating web resource interest scores for each hostname resource of the plurality of hostname resources; and generating a resource cluster interest score based on the web resource interest scores for each hostname resource.
- Example G08 includes the method of example G07 and/or some other example(s) herein, further comprising: determining a resource cluster weighting vector including weighting values for each hostname resource; and applying the resource cluster weighting vector to the web resource interest scores for each hostname resource.
- Example G09 includes the method of examples G07-G08 and/or some other example(s) herein, further comprising: determining a topic cluster, the topic cluster including a plurality of topics; generating consumption scores for each topic of the plurality of topics based on network events generated by the entity from the hostname resource and events generated by the entity from resources different than the hostname resource; generating a topic cluster interest score based on the consumption scores of each topic; and combining the topic cluster interest score with the resource cluster interest score to generate a weighted intent score.
- Example G10 includes the method of example G09 and/or some other example(s) herein, further comprising: determining a topic cluster weighting vector including weighting values for each topic; and applying the topic cluster weighting vector to the consumption scores associated with same topics of the plurality of topics.
- Example G11 includes the method of examples G09-G10 and/or some other example(s) herein, further comprising: determining the weighted intent score according to:
-
- wherein STCI is the topic cluster interest score, SWCI is the resource cluster interest score, αTCI is a topic cluster interest threshold, and αWCI is a resource cluster interest threshold.
- Example G12 includes a method for operating a resource interest detector, the method comprising: operating a consumption event transform to convert a set of raw network events into a set of hostname events, each hostname event of the set of hostname events indicating a hostname resource and a predicted entity from which the hostname resource was accessed; operating a resource interest feature (RIF) generator to generate a set of RIFs from the set of hostname events for a time period, the set of RIFs indicating an interest level of the entity in the hostname resources during the time period; operating an interest score generator (ISG) to generate a resource interest score vector for the time period based on a combination of the set of RIFs, the resource interest score vector including a resource interest score for each hostname resource indicated by the set of hostname events; operating a resource cluster ISG (RCISG) to calculate a resource cluster interest score based on the resource interest scores of the resource interest score vector; operating a topic cluster interest score generator (TCISG) to calculate a topic cluster interest score based on a set of topic interest scores of a topic interest score vector, the set of topic interest scores being topic interest scores generated for each hostname resource; and operating a weighted intent score generator (WISG) to generate a weighted intent score based on a combination of the resource cluster interest score and the topic cluster interest score.
- Example G13 includes the method of example G12 and/or some other example(s) herein, wherein the consumption event transform comprises an entity predictor and a hostname extractor, and the method further comprises: operating the entity predictor to predict the entity associated with the set of raw network events generated by one or more client devices that accessed one or more information objects associated with one or more hostname resources; and operating the hostname extractor to extract the one or more hostname resources from the set of raw network events.
- Example G14 includes the method of examples G12-G13 and/or some other example(s) herein, wherein the hostname resource indicated by each hostname event is based on a uniform resource locator (URL) included in a corresponding raw network event of the set of raw network events, and the predicted entity indicated by each hostname event is based on a network address included in the corresponding raw network event of the set of raw network events.
- Example G15 includes the method of examples G12-G14 and/or some other example(s) herein, further comprising: operating the RCISG to calculate the resource cluster interest score further based on a resource cluster weighting vector, the resource cluster weighting vector including a set of weights to be applied to resource interest scores of the resource interest score vector.
- Example G16 includes the method of example G15 and/or some other example(s) herein, further comprising operating the RCISG to calculate the resource cluster interest score by computing a magnitude of a vector that is a result of an entrywise product of the resource interest score vector and the resource cluster weighting vector.
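The entrywise-product-and-magnitude computation of example G16 can be written directly in Python. This is a faithful sketch of the stated operation (Hadamard product followed by Euclidean norm), with illustrative vectors.

```python
import math

def cluster_interest_score(score_vector, weight_vector):
    """Example G16: magnitude (Euclidean norm) of the entrywise
    (Hadamard) product of an interest score vector and its cluster
    weighting vector."""
    product = [s * w for s, w in zip(score_vector, weight_vector)]
    return math.sqrt(sum(p * p for p in product))
```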
- Example G17 includes the method of examples G12-G16 and/or some other example(s) herein, further comprising: operating the TCISG to calculate the topic cluster interest score further based on a topic cluster weighting vector, the topic cluster weighting vector including a set of weights to be applied to consumption scores included in the topic interest score vector.
- Example G18 includes the method of example G17 and/or some other example(s) herein, further comprising: operating the TCISG to calculate the topic cluster interest score by computing a magnitude of a vector that is a result of an entrywise product of the topic interest score vector and the topic cluster weighting vector.
- Example G19 includes the method of examples G12-G18 and/or some other example(s) herein, further comprising: operating the WISG to generate the weighted intent score further based on a topic cluster interest threshold and a resource cluster interest threshold, wherein the topic cluster interest threshold and the resource cluster interest threshold are derived based on baseline distributions or on a priori data.
- Example G20 includes the method of example G19 and/or some other example(s) herein, further comprising: operating the WISG to detect a surge signal in the weighted intent score when the topic cluster interest score exceeds the topic cluster interest threshold or when the resource cluster interest score exceeds the resource cluster interest threshold.
- Example G21 includes the method of examples F01-F23, G01-G20, and/or some other example(s) herein, wherein any one or more of examples F01-F23 are combinable with any one or more of examples G01-G20 and/or some other example(s) herein.
- Example G22 includes the method of examples F01-F23, G01-G20, and/or some other example(s) herein, wherein the network addresses are internet protocol (IP) addresses, telephone numbers in a public switched telephone network (PSTN), cellular network addresses, internetwork packet exchange (IPX) addresses, X.25 addresses, X.21 addresses, Transmission Control Protocol (TCP) or User Datagram Protocol (UDP) port numbers, media access control (MAC) addresses, Electronic Product Codes (EPCs), Bluetooth hardware device addresses, Universal Resource Locators (URLs), and/or email addresses.
- Example H01 includes a method comprising: determining or identifying one or more features from training websites with known classifications; training a machine learning (ML) model with the features and known classifications; determining or identifying the features from an unclassified website with an unknown classification; and applying the features from the unclassified website to the trained ML model to predict a classification for the unclassified website.
- Example H02 includes the method of example H01 and/or some other example(s) herein, further comprising: generating a first set of vectors representing the features of the training websites; using the first set of vectors and known classifications of the training websites to train the ML model; generating a second set of vectors representing the features of the unclassified website; and applying the second set of vectors to the trained ML model to classify the unclassified website.
- Example H03 includes the method of examples H01-H02 and/or some other example(s) herein, wherein one of the features identifies structural semantics of webpages in the websites.
- Example H04 includes the method of example H03 and/or some other example(s) herein, further comprising: crawling the webpages of the unclassified website to identify links between webpages on the same website and links to webpages on other websites; and determining or identifying the structural semantics of the website based on the identified links.
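A crude version of example H04's link-based structural-semantics feature can be sketched as counts of same-site versus external links on a page. The regex href extraction is a simplification; a real crawler would use an HTML parser and follow links recursively.

```python
import re
from urllib.parse import urlparse

def link_structure_features(page_html, page_url):
    """Counts of same-site vs. external links on one page (example H04)."""
    site = urlparse(page_url).hostname
    internal = external = 0
    for href in re.findall(r'href="([^"]+)"', page_html):
        host = urlparse(href).hostname
        if host is None or host == site:  # relative or same-site link
            internal += 1
        else:
            external += 1
    return {"internal_links": internal, "external_links": external}
```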
- Example H05 includes the method of examples H01-H04 and/or some other example(s) herein, further comprising: generating one of the features that identify content semantics of webpages in the websites.
- Example H06 includes the method of example H05 and/or some other example(s) herein, further comprising: crawling the webpages of the unclassified website to identify types of content and topics in the webpages; and determining or identifying the content semantics of the website based on the identified types of content and topics in the webpages.
- Example H07 includes the method of examples H01-H06 and/or some other example(s) herein, further comprising: generating one of the features that identify content interaction behavior with webpages in the websites.
- Example H08 includes the method of example H07 and/or some other example(s) herein, further comprising: determining or identifying events associated with the webpages of the websites; determining or identifying types of user interactions with the webpages identified in the events; and determining or identifying the content interaction behavior based on the types of user interactions with the webpages.
- Example H09 includes the method of examples H01-H08 and/or some other example(s) herein, further comprising: generating one of the features that identifies types of users accessing webpages in the websites.
- Example H10 includes the method of example H09 and/or some other example(s) herein, further comprising: determining or identifying events associated with the webpages of the websites; determining or identifying types of users associated with the events; and determining or identifying the types of users accessing the webpages based on the types of users identified in the events.
- Example H11 includes a method comprising: determining or identifying a website semantic feature for a website; determining or identifying a website behavioral feature for the website; and predicting a classification for the website based on the website semantic feature and the website behavioral feature.
- Example H12 includes the method of example H11 and/or some other example(s) herein, further comprising: generating a first vector representing the website semantic feature of the website; generating a second vector representing the website behavioral feature of the website; and feeding the first and second vectors into a computer learning model to predict the classification for the website.
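Example H12's two-vector setup can be sketched by concatenating the semantic and behavioral vectors and applying a decision rule. The linear scoring function, weights, and class labels below are hypothetical stand-ins for a trained model.

```python
def website_feature_vector(semantic_vec, behavioral_vec):
    """Example H12: concatenate the semantic and behavioral feature
    vectors before feeding them to the model."""
    return list(semantic_vec) + list(behavioral_vec)

def predict_class(feature_vec, weights, bias=0.0):
    """Linear decision rule standing in for the trained model."""
    score = bias + sum(w * f for w, f in zip(weights, feature_vec))
    return "class_1" if score > 0 else "class_0"
```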
- Example H13 includes the method of examples H11-H12 and/or some other example(s) herein, further comprising: generating the website semantic feature for the website based on links between webpages on the website.
- Example H14 includes the method of example H13 and/or some other example(s) herein, further comprising: generating the website semantic feature for the website based on content and topics in the webpages on the website.
- Example H15 includes the method of examples H11-H14 and/or some other example(s) herein, further comprising: generating the website behavioral feature for the website based on types of user interactions with webpages on the website.
- Example H16 includes the method of example H15 and/or some other example(s) herein, further comprising: generating the website behavioral feature for the website based on types of businesses accessing the webpages on the website.
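Examples H11-H16 describe representing a website with a semantic feature vector and a behavioral feature vector, then feeding both vectors into a computer learning model to predict a classification. A minimal sketch of that step is shown below; the weights, feature values, and dimensions are illustrative assumptions (in the examples above, the model would be trained on websites with known classifications), not details taken from the specification.

```python
import numpy as np

# Hypothetical learned weights for a linear classifier over the
# concatenated feature vector (illustrative placeholders; the
# specification's model would learn these from labeled training data).
weights = np.array([0.5, -0.2, 0.1, 0.8, -0.4, 0.3])
bias = -0.5

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + np.exp(-z))

# Per examples H11-H12: a first vector representing website semantics
# (e.g., topic mix) and a second vector representing website behavior
# (e.g., interaction-type mix), concatenated and fed to the model.
semantic_vec = np.array([0.12, 0.55, 0.33])
behavioral_vec = np.array([0.40, 0.10, 0.50])
x = np.concatenate([semantic_vec, behavioral_vec])

score = sigmoid(float(weights @ x + bias))
predicted_class = int(score >= 0.5)
```

In practice the two vectors could come from any embedding of page content/links and of user-interaction events, and any binary or multi-class model could replace the linear scorer sketched here.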
- Example I01 includes a method of machine learning (ML) comprising: determining or identifying one or more features from training data comprising a set of information objects (InObs) with known classifications, each InOb of the set of InObs comprising one or more nodes, the one or more features including structural semantics for respective InObs of the set of InObs, the structural semantics comprising a data structure representative of relationships between the one or more nodes of the respective InObs; training an ML model to identify classifications of InObs not among the set of InObs based on the features identified from the training data and the known classifications of the set of InObs; determining or identifying features from an unclassified InOb with an unknown classification, the identified features of the unclassified InOb including a set of nodes of the unclassified InOb; and applying the identified features of the unclassified InOb to the trained ML model to predict a classification for the unclassified InOb based on structural semantics of the unclassified InOb, the structural semantics of the unclassified InOb being based on relationships among nodes of the set of nodes.
- Example I02 includes the method of example I01 and/or some other example(s) herein, further comprising: generating a first set of vectors representing the features of the set of InObs; using the first set of vectors and known classifications of the set of InObs to train the ML model; generating a second set of vectors representing the features of the unclassified InOb; and applying the second set of vectors to the trained ML model to classify the unclassified InOb.
- Example I03 includes the method of examples I01-I02 and/or some other example(s) herein, wherein the structural semantics of the respective InObs include relationships between nodes making up individual InObs and relationships between nodes of different InObs.
- Example I04 includes the method of example I03 and/or some other example(s) herein, further comprising: crawling the webpages of the unclassified InOb to identify links between webpages on the same InOb and links with webpages on other InObs; and determining or identifying the structural semantics of the unclassified InOb based on the identified links.
- Example I05 includes the method of examples I01-I04 and/or some other example(s) herein, wherein the one or more features further comprise content semantics of the one or more nodes of the set of InObs.
- Example I06 includes the method of example I05 and/or some other example(s) herein, further comprising: crawling the webpages of the unclassified InOb to identify content types and topics in the webpages; and determining or identifying the content semantics of the unclassified InOb based on the identified content types and topics in the webpages of the unclassified InOb.
- Example I07 includes the method of examples I01-I06 and/or some other example(s) herein, wherein the one or more features further comprise content interaction behavior features with webpages in the one or more nodes of the set of InObs.
- Example I08 includes the method of example I07 and/or some other example(s) herein, further comprising: determining or identifying user interaction events generated by the one or more nodes based on interactions with the one or more nodes of the set of InObs; determining or identifying user interaction types based on the user interaction events; and determining or identifying the content interaction behavior features based on the user interaction types.
- Example I09 includes the method of examples I01-I08 and/or some other example(s) herein, wherein the one or more features further comprise types of users accessing the one or more nodes of the set of InObs, the types of users including device types used for accessing the one or more nodes.
- Example I10 includes the method of example I09 and/or some other example(s) herein, further comprising: determining or identifying network session events generated by the one or more nodes based on accesses of the one or more nodes of the set of InObs; determining or identifying user data from the network session events; and determining or identifying the types of users accessing the webpages based on the determined user data.
- Example I11 includes a method comprising: determining or identifying, using a trained machine learning (ML) model, one or more structural features of an InOb, the trained ML model being trained on a training data set including a set of InObs, each InOb of the set of InObs comprising one or more nodes, the trained ML model including a data object indicating structural features of respective InObs of the set of InObs, the structural features being relationships between the one or more nodes of the respective InObs, and the data object being a representation of the relationships; and predicting a classification for the InOb based on the identified one or more structural features of the InOb.
- Example I12 includes the method of example I11 and/or some other example(s) herein, further comprising: determining or identifying user interaction events generated by the InOb or users that interact with the InOb; determining or identifying user interaction types based on the user interaction events; and determining or identifying one or more content interaction behavior features for the InOb based on the determined user interaction types, the one or more content interaction behavior features being patterns of user interaction with content of the InOb.
- Example I13 includes the method of example I12 and/or some other example(s) herein, further comprising: generating a structural feature vector comprising the one or more structural features of the InOb; generating a content interaction behavior feature vector comprising the one or more content interaction behavior features of the InOb; and feeding the structural feature vector and the content interaction behavior feature vector into the ML model to predict the classification for the InOb.
- Example I14 includes the method of example I13 and/or some other example(s) herein, wherein the user interaction events indicate an event type and an engagement metric, and each content interaction behavior feature in the content interaction behavior feature vector represents a percentage or average value of the engagement metric for an associated event type for a time period.
- Example I15 includes the method of examples I13-I14 and/or some other example(s) herein, wherein the one or more content interaction behavior features include one or more of a time of day, day of week, date, total amount of content consumed by respective users, percentages of different device types used for accessing the InOb, duration of time users spend on individual webpages of the InOb, total engagement the respective users have on the individual webpages, a number of distinct user profiles accessing the individual webpages versus a total number of user interaction events for the individual webpages, a dwell time, a scroll depth, a scroll velocity, and variance in content consumption over time.
- Example I16 includes the method of examples I13-I15 and/or some other example(s) herein, wherein generating the structural feature vector comprises: generating respective structural feature vectors for each individual webpage of the InOb; and averaging the respective structural feature vectors for each individual webpage to obtain the structural feature vector for the InOb.
- Example I17 includes the method of examples I13-I16 and/or some other example(s) herein, wherein generating the content interaction behavior feature vector comprises: generating respective content interaction behavior feature vectors for each individual webpage of the InOb; and averaging the respective content interaction behavior feature vectors for each individual webpage to obtain the content interaction behavior feature vector for the InOb.
- Example I18 includes the method of examples I12-I17 and/or some other example(s) herein, further comprising: generating the one or more content interaction behavior features for the InOb based on types of businesses accessing webpages of the InOb.
- Example I19 includes the method of examples I11-I18 and/or some other example(s) herein, further comprising: determining or identifying the one or more structural features of the InOb based on links between webpages of the InOb and links to other webpages of other InObs from the webpages of the InOb.
- Example I20 includes the method of example I19 and/or some other example(s) herein, further comprising: crawling the webpages of the InOb to identify the links between the webpages of the InOb and the links to the other webpages.
- Example I21 includes the method of examples H01-H23, I01-I20, and/or some other example(s) herein, wherein the network addresses is/are internet protocol (IP) addresses, telephone numbers in a public switched telephone network, cellular network addresses, internet packet exchange (IPX) addresses, X.25 addresses, X.21 addresses, Transmission Control Protocol (TCP) or User Datagram Protocol (UDP) port numbers, media access control (MAC) addresses, Electronic Product Codes (EPCs), Bluetooth hardware device addresses, Universal Resource Locators (URLs), and/or email addresses.
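Examples I13-I17 build InOb-level feature vectors by generating a vector for each individual webpage and averaging those vectors. A minimal sketch of that averaging step follows; the per-page values and vector dimensions are illustrative placeholders, not data from the specification.

```python
import numpy as np

# Hypothetical per-webpage structural feature vectors for one InOb
# (example I16) and per-webpage content interaction behavior feature
# vectors (example I17); all values are illustrative placeholders.
page_structural = np.array([
    [0.2, 0.8, 0.1],
    [0.4, 0.6, 0.3],
    [0.6, 0.4, 0.5],
])
page_behavioral = np.array([
    [0.1, 0.9],
    [0.3, 0.7],
])

# Average the respective per-webpage vectors to obtain the
# InOb-level structural and behavioral feature vectors.
inob_structural = page_structural.mean(axis=0)  # ≈ [0.4, 0.6, 0.3]
inob_behavioral = page_behavioral.mean(axis=0)  # ≈ [0.2, 0.8]

# Per example I13, the two InOb-level vectors would then be fed
# together into the trained ML model to predict a classification.
model_input = np.concatenate([inob_structural, inob_behavioral])
```

Averaging is just one way to pool per-page vectors into a site-level representation; weighted means (e.g., by page traffic) or learned pooling would slot in at the same step.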
- Example Z01 includes one or more computer readable media comprising instructions, wherein execution of the instructions by processor circuitry is to cause the processor circuitry to perform the method of any one of examples A01-A28, B01-B21, C01-C02, D01-D21, E01-E21, F01-F23, G01-G20, H01-H23, I01-I20, and/or some other example(s) herein.
- Example Z02 includes a computer program comprising the instructions of example Z01 and/or some other example(s) herein.
- Example Z03 includes an Application Programming Interface (API) defining functions, methods, variables, data structures, and/or protocols for the computer program of example Z02 and/or some other example(s) herein.
- Example Z04 includes an API or specification defining functions, methods, variables, data structures, protocols, and/or the like, defining or involving use of any of examples A01-A28, B01-B21, C01-C02, D01-D21, E01-E21, F01-F23, G01-G20, H01-H23, I01-I20, and/or portions thereof, or otherwise related to any of examples A01-A28, B01-B21, C01-C02, D01-D21, E01-E21, F01-F23, G01-G20, H01-H23, I01-I20, and/or portions thereof.
- Example Z05 includes an apparatus comprising circuitry loaded with the instructions of example Z01 and/or some other example(s) herein.
- Example Z06 includes an apparatus comprising circuitry operable to run the instructions of example Z01 and/or some other example(s) herein.
- Example Z07 includes an integrated circuit comprising one or more of the processor circuitry of example Z01 and the one or more computer readable media of example Z01 and/or some other example(s) herein.
- Example Z08 includes a computing system comprising the one or more computer readable media and the processor circuitry of example Z01 and/or some other example(s) herein.
- Example Z09 includes the computing system of example Z08 and/or one or more other example(s) herein, wherein the computing system is a System-in-Package (SiP), a Multi-Chip Package (MCP), a System-on-Chip (SoC), a digital signal processor (DSP), a field-programmable gate array (FPGA), an Application Specific Integrated Circuit (ASIC), a programmable logic device (PLD), a complex PLD (CPLD), a Central Processing Unit (CPU), or a Graphics Processing Unit (GPU), and/or the computing system comprises two or more SiPs, MCPs, SoCs, DSPs, FPGAs, ASICs, PLDs, CPLDs, CPUs, and/or GPUs interconnected with one another.
- Example Z10 includes an apparatus comprising means for executing the instructions of example Z01 and/or some other example(s) herein.
- Example Z11 includes a signal generated as a result of executing the instructions of example Z01 and/or some other example(s) herein.
- Example Z12 includes a data unit generated as a result of executing the instructions of example Z01 and/or some other example(s) herein.
- Example Z13 includes the data unit of example Z12 and/or some other example(s) herein, wherein the data unit is a datagram, network packet, data frame, data segment, Protocol Data Unit (PDU), Service Data Unit (SDU), message, or database object.
- Example Z14 includes a signal encoded with the data unit of examples Z12-Z13 and/or some other example(s) herein.
- Example Z15 includes an electromagnetic signal carrying the instructions of example Z01 and/or some other example(s) herein.
- Example Z16 includes an apparatus comprising means for performing the method of any one of examples A01-A28, B01-B21, C01-C02, D01-D21, E01-E21, F01-F23, G01-G20, H01-H23, I01-I20, and/or some other example(s) herein.
- Any of the above-described examples may be combined with any other example (or combination of examples), unless explicitly stated otherwise. Implementation of the preceding techniques may be accomplished through any number of specifications, configurations, or example deployments of hardware and software. It should be understood that the functional units or capabilities described in this specification may have been referred to or labeled as components or modules, in order to more particularly emphasize their implementation independence. Such components may be embodied by any number of software or hardware forms. For example, a component or module may be implemented as a hardware circuit comprising custom very-large-scale integration (VLSI) circuits or gate arrays, off-the-shelf semiconductors such as logic chips, transistors, or other discrete components. A component or module may also be implemented in programmable hardware devices such as field programmable gate arrays, programmable array logic, programmable logic devices, or the like. Components or modules may also be implemented in software for execution by various types of processors. An identified component or module of executable code may, for instance, comprise one or more physical or logical blocks of computer instructions, which may, for instance, be organized as an object, procedure, or function. Nevertheless, the executables of an identified component or module need not be physically located together, but may comprise disparate instructions stored in different locations which, when joined logically together, comprise the component or module and achieve the stated purpose for the component or module.
- Indeed, a component or module of executable code may be a single instruction, or many instructions, and may even be distributed over several different code segments, among different programs, and across several memory devices or processing systems. In particular, some aspects of the described process (such as code rewriting and code analysis) may take place on a different processing system (e.g., in a computer in a data center), than that in which the code is deployed (e.g., in a computer embedded in a sensor or robot). Similarly, operational data may be identified and illustrated herein within components or modules, and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices, and may exist, at least partially, merely as electronic signals on a system or network. The components or modules may be passive or active, including agents operable to perform desired functions.
- The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The present disclosure has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and/or computer program products according to embodiments of the present disclosure. In the drawings, some structural or method features may be shown in specific arrangements and/or orderings. However, it should be appreciated that such specific arrangements and/or orderings may not be required. Rather, in some embodiments, such features may be arranged in a different manner and/or order than shown in the illustrative figures. Additionally, the inclusion of a structural or method feature in a particular figure is not meant to imply that such feature is required in all embodiments and, in some embodiments, may not be included or may be combined with other features.
- For purposes of the present disclosure, the singular forms “a,” “an” and “the” are intended to include plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The phrase “A and/or B” means (A), (B), or (A and B). For the purposes of the present disclosure, the phrase “A, B, and/or C” means (A), (B), (C), (A and B), (A and C), (B and C), or (A, B and C). The description may use the phrases “in an embodiment” or “in some embodiments,” which may each refer to one or more of the same or different embodiments. Furthermore, the terms “comprising,” “including,” “having,” and the like, as used with respect to embodiments of the present disclosure, are synonymous.
- The terms “coupled,” “communicatively coupled,” along with derivatives thereof are used herein. The term “coupled” may mean two or more elements are in direct physical or electrical contact with one another, may mean that two or more elements indirectly contact each other but still cooperate or interact with each other, and/or may mean that one or more other elements are coupled or connected between the elements that are said to be coupled with each other. The term “directly coupled” may mean that two or more elements are in direct contact with one another. The term “communicatively coupled” may mean that two or more elements may be in contact with one another by a means of communication including through a wire or other interconnect connection, through a wireless communication channel or link, and/or the like.
- The term “establish” or “establishment” at least in some examples refers to (partial or full) acts, tasks, operations, and/or the like, related to bringing something into existence, or readying something to be brought into existence, either actively or passively (e.g., exposing a device identity or entity identity). Additionally or alternatively, the term “establish” or “establishment” at least in some examples refers to (partial or full) acts, tasks, operations, and/or the like, related to initiating, starting, or warming communication, or initiating, starting, or warming a relationship between two entities or elements (e.g., establishing a session, and/or the like). Additionally or alternatively, the term “establish” or “establishment” at least in some examples refers to initiating something to a state of working readiness. The term “established” at least in some examples refers to a state of being operational or ready for use (e.g., full establishment). Furthermore, any definition for the term “establish” or “establishment” defined in any specification or standard can be used for purposes of the present disclosure and such definitions are not disavowed by any of the aforementioned definitions.
- The term “obtain” at least in some examples refers to (partial or full) acts, tasks, operations, and/or the like, of intercepting, movement, copying, retrieval, or acquisition (e.g., from a memory, an interface, or a buffer), on the original packet stream or on a copy (e.g., a new instance) of the packet stream. Other aspects of obtaining or receiving may involve instantiating, enabling, or controlling the ability to obtain or receive the stream of packets (or the following parameters and templates or template values).
- The term “receipt” at least in some examples refers to any action (or set of actions) involved with receiving or obtaining an object, data, data unit, and/or the like, and/or the fact of the object, data, data unit, and/or the like being received. The term “receipt” at least in some examples refers to an object, data, data unit, and the like, being pushed to a device, system, element (e.g., often referred to as a push model), pulled by a device, system, element (e.g., often referred to as a pull model), and/or the like.
- The term “element” at least in some examples refers to a unit that is indivisible at a given level of abstraction and has a clearly defined boundary, wherein an element may be any type of entity including, for example, one or more devices, systems, controllers, network elements, modules, and/or the like, or combinations thereof.
- The term “measurement” at least in some examples refers to the observation and/or quantification of attributes of an object, event, or phenomenon. Additionally or alternatively, the term “measurement” at least in some examples refers to a set of operations having the object of determining a measured value or measurement result, and/or the actual instance or execution of operations leading to a measured value.
- The term “signal” at least in some examples refers to an observable change in a quality and/or quantity. Additionally or alternatively, the term “signal” at least in some examples refers to a function that conveys information about of an object, event, or phenomenon. Additionally or alternatively, the term “signal” at least in some examples refers to any time varying voltage, current, or electromagnetic wave that may or may not carry information. The term “digital signal” at least in some examples refers to a signal that is constructed from a discrete set of waveforms of a physical quantity so as to represent a sequence of discrete values.
- The term “identifier” at least in some examples refers to a value, or a set of values, that uniquely identify an identity in a certain scope. Additionally or alternatively, the term “identifier” at least in some examples refers to a sequence of characters that identifies or otherwise indicates the identity of a unique object, element, or entity, or a unique class of objects, elements, or entities. Additionally or alternatively, the term “identifier” at least in some examples refers to a sequence of characters used to identify or refer to an application, program, session, object, element, entity, variable, set of data, and/or the like. The “sequence of characters” mentioned previously at least in some examples refers to one or more names, labels, words, numbers, letters, symbols, and/or any combination thereof. Additionally or alternatively, the term “identifier” at least in some examples refers to a name, address, label, distinguishing index, and/or attribute. Additionally or alternatively, the term “identifier” at least in some examples refers to an instance of identification. The term “persistent identifier” at least in some examples refers to an identifier that is reused by a device or by another device associated with the same person or group of persons for an indefinite period.
- The term “identification” at least in some examples refers to a process of recognizing an identity as distinct from other identities in a particular scope or context, which may involve processing identifiers to reference an identity in an identity database.
- The terms “ego” (as in, e.g., “ego device”) and “subject” (as in, e.g., “data subject”) at least in some examples refers to an entity, element, device, system, and/or the like, that is under consideration or being considered. The terms “neighbor” and “proximate” (as in, e.g., “proximate device”) at least in some examples refers to an entity, element, device, system, and/or the like, other than an ego device or subject device.
- The term “network path” or “path” at least in some examples refers to a data communications feature of a communication system describing the sequence and identity of system components visited by one or more packets, where the components of the path may be either logical or physical. The term “network forwarding path” at least in some examples refers to an ordered list of connection points forming a chain of NFs and/or nodes, along with policies associated to the list.
- The term “circuitry” at least in some examples refers to a circuit or system of multiple circuits configurable or operable to perform a particular function in an electronic device. The circuit or system of circuits may be part of, or include one or more hardware components, such as a logic circuit, a processor (shared, dedicated, or group) and/or memory (shared, dedicated, or group), an ASIC, a FPGA, programmable logic controller (PLC), SoC, SiP, multi-chip package (MCP), DSP, and/or the like, that are configurable or operable to provide the described functionality. In addition, the term “circuitry” may also refer to a combination of one or more hardware elements with the program code used to carry out the functionality of that program code. Some types of circuitry may execute one or more software or firmware programs to provide at least some of the described functionality. Such a combination of hardware elements and program code may be referred to as a particular type of circuitry.
- The term “processor circuitry” at least in some examples refers to, is part of, or includes circuitry capable of sequentially and automatically carrying out a sequence of arithmetic or logical operations, or recording, storing, and/or transferring digital data. The term “processor circuitry” may refer to one or more application processors, one or more baseband processors, a physical CPU, a single-core processor, a dual-core processor, a triple-core processor, a quad-core processor, and/or any other device capable of executing or otherwise operating computer-executable instructions, such as program code, software modules, and/or functional processes. The terms “application circuitry” and/or “baseband circuitry” may be considered synonymous to, and may be referred to as, “processor circuitry.”
- The term “memory” and/or “memory circuitry” at least in some examples refers to one or more hardware devices for storing data, including RAM, MRAM, PRAM, DRAM, and/or SDRAM, core memory, ROM, magnetic disk storage mediums, optical storage mediums, flash memory devices or other machine readable mediums for storing data. The term “computer-readable medium” may include, but is not limited to, memory, portable or fixed storage devices, optical storage devices, and various other mediums capable of storing, containing or carrying instructions or data. “Computer-readable storage medium” (or alternatively, “machine-readable storage medium”) may include all of the foregoing types of memory, as well as new technologies that may arise in the future, as long as they may be capable of storing digital information in the nature of a computer program or other data, at least temporarily, in such a manner that the stored information may be “read” by an appropriate processing device. The term “computer-readable” may not be limited to the historical usage of “computer” to imply a complete mainframe, mini-computer, desktop, wireless device, or even a laptop computer. Rather, “computer-readable” may comprise any storage medium that may be readable by a processor, processing device, or any computing system. Such media may be any available media that may be locally and/or remotely accessible by a computer or processor, and may include volatile and non-volatile media, and removable and non-removable media.
- The term “interface circuitry” at least in some examples refers to, is part of, or includes circuitry that enables the exchange of information between two or more components or devices. The term “interface circuitry” may refer to one or more hardware interfaces, for example, buses, I/O interfaces, peripheral component interfaces, network interface cards, and/or the like.
- The term “device” refers to a physical entity embedded inside, or attached to, another physical entity in its vicinity, with capabilities to convey digital information from or to that physical entity. The term “entity” refers to a distinct component of an architecture or device, or information transferred as a payload. The term “controller” refers to an element or entity that has the capability to affect a physical entity, such as by changing its state or causing the physical entity to move.
- The term “computer system” at least in some examples refers to any type of interconnected electronic devices, computer devices, or components thereof. Additionally, the term “computer system” and/or “system” may refer to various components of a computer that are communicatively coupled with one another. Furthermore, the term “computer system” and/or “system” may refer to multiple computer devices and/or multiple computing systems that are communicatively coupled with one another and configurable or operable to share computing and/or networking resources.
- The term “architecture” at least in some examples refers to a computer architecture or a network architecture. A “network architecture” is a physical and logical design or arrangement of software and/or hardware elements in a network including communication protocols, interfaces, and media transmission. A “computer architecture” is a physical and logical design or arrangement of software and/or hardware elements in a computing system or platform including technology standards for interactions therebetween.
- The term “appliance,” “computer appliance,” or the like, at least in some examples refers to a computer device or computer system with program code (e.g., software or firmware) that is specifically designed to provide a specific computing resource. A “virtual appliance” is a virtual machine image to be implemented by a hypervisor-equipped device that virtualizes or emulates a computer appliance or otherwise is dedicated to providing a specific computing resource.
- The term “cloud computing” or “cloud” at least in some examples refers to a paradigm for enabling network access to a scalable and elastic pool of shareable computing resources with self-service provisioning and administration on-demand and without active management by users. Cloud computing provides cloud computing services (or cloud services), which are one or more capabilities offered via cloud computing that are invoked using a defined interface (e.g., an API or the like). The term “cloud service provider” or “CSP” at least in some examples refers to an organization which operates typically large-scale “cloud” resources comprised of centralized, regional, and Edge data centers (e.g., as used in the context of the public cloud). In some examples, a CSP may also be referred to as a “Cloud Service Operator” or “CSO”.
- The term “computing resource” or simply “resource” refers to any physical or virtual component, or usage of such components, of limited availability within a computer system or network. Examples of computing resources include usage/access to, for a period of time, servers, processor(s), storage equipment, memory devices, memory areas, networks, electrical power, input/output (peripheral) devices, mechanical devices, network connections (e.g., channels/links, ports, network sockets, and/or the like), operating systems, virtual machines (VMs), software/applications, computer files, and/or the like. A “hardware resource” may refer to compute, storage, and/or network resources provided by physical hardware element(s). A “virtualized resource” may refer to compute, storage, and/or network resources provided by virtualization infrastructure to an application, device, system, and/or the like. The term “network resource” or “communication resource” may refer to resources that are accessible by computer devices/systems via a communications network. The term “system resources” may refer to any kind of shared entities to provide services, and may include computing and/or network resources. System resources may be considered as a set of coherent functions, network data objects or services, accessible through a server where such system resources reside on a single host or multiple hosts and are clearly identifiable.
- The term “service consumer” at least in some examples refers to an entity that consumes one or more services. The term “service producer” at least in some examples refers to an entity that offers, serves, or otherwise provides one or more services. The term “service provider” at least in some examples refers to an organization or entity that provides one or more services to at least one service consumer. For purposes of the present disclosure, the terms “service provider” and “service producer” may be used interchangeably even though these terms may refer to different concepts.
- The term “Virtualized Infrastructure Manager” or “VIM” at least in some examples refers to a functional block that is responsible for controlling and managing the NFVI compute, storage and network resources, usually within one operator's infrastructure domain.
- The term “virtualization container”, “execution container”, or “container” at least in some examples refers to a partition of a compute node that provides an isolated virtualized computation environment. The term “OS container” at least in some examples refers to a virtualization container utilizing a shared Operating System (OS) kernel of its host, where the host providing the shared OS kernel can be a physical compute node or another virtualization container. Additionally or alternatively, the term “container” at least in some examples refers to a standard unit of software (or a package) including code and its relevant dependencies, and/or an abstraction at the application layer that packages code and dependencies together. Additionally or alternatively, the term “container” or “container image” at least in some examples refers to a lightweight, standalone, executable software package that includes everything needed to run an application such as, for example, code, runtime environment, system tools, system libraries, and settings.
- The term “virtual machine” or “VM” at least in some examples refers to a virtualized computation environment that behaves in a same or similar manner as a physical computer and/or a server. The term “hypervisor” at least in some examples refers to a software element that partitions the underlying physical resources of a compute node, creates VMs, manages resources for VMs, and isolates individual VMs from each other.
- The term “data processing” or “processing” at least in some examples refers to any operation or set of operations which is performed on data or on sets of data, whether or not by automated means, such as collection, recording, writing, organization, structuring, storing, adaptation, alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure and/or destruction. The term “data pipeline” or “pipeline” at least in some examples refers to a set of data processing elements (or data processors) connected in series and/or in parallel, where the output of one data processing element is the input of one or more other data processing elements in the pipeline; the elements of a pipeline may be executed in parallel or in time-sliced fashion and/or some amount of buffer storage can be inserted between elements. The term “packet processor” at least in some examples refers to software and/or hardware element(s) that transform a stream of input packets into output packets (or transforms a stream of input data into output data); examples of the transformations include adding, removing, and modifying fields in a packet header, trailer, and/or payload.
- The term “filter” at least in some examples refers to a computer program, subroutine, or other software element capable of processing a stream, data flow, or other collection of data, and producing another stream. In some implementations, multiple filters can be strung together or otherwise connected to form a pipeline.
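The pipeline and filter concepts defined above can be sketched in a few lines of Python. This is only an illustrative sketch, not an implementation from the disclosure; the filter names and the example stream are hypothetical:

```python
from typing import Callable, Iterable, Iterator

# A "filter": a data processor that consumes one stream and produces another.
Filter = Callable[[Iterable], Iterator]

def strip_blank(lines: Iterable[str]) -> Iterator[str]:
    """Drop empty lines from the stream."""
    return (ln for ln in lines if ln.strip())

def normalize(lines: Iterable[str]) -> Iterator[str]:
    """Lowercase and trim each line."""
    return (ln.strip().lower() for ln in lines)

def pipeline(source: Iterable, *filters: Filter) -> Iterator:
    """Connect filters in series: each element's output feeds the next."""
    stream = source
    for f in filters:
        stream = f(stream)
    return stream

out = list(pipeline(["  Hello ", "", "WORLD"], strip_blank, normalize))
# out == ["hello", "world"]
```

Because each filter is a generator, the elements of this pipeline execute lazily and interleaved, matching the definition's note that pipeline elements may run in parallel or time-sliced fashion with buffering between them.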
- The terms “instantiate,” “instantiation,” and the like at least in some examples refers to the creation of an instance. An “instance” also at least in some examples refers to a concrete occurrence of an object, which may occur, for example, during execution of program code.
- The term “information object” refers to a data structure that includes one or more data elements, each of which includes one or more data values. Examples of information objects include electronic documents, database objects, data files, resources, webpages, web forms, applications (e.g., web apps), services, web services, media, or content, and/or the like. Information objects may be stored and/or processed according to a data format. Data formats define the content/data and/or the arrangement of data elements for storing and/or communicating the information objects. Each of the data formats may also define the language, syntax, vocabulary, and/or protocols that govern information storage and/or exchange. Examples of the data formats that may be used for any of the information objects discussed herein may include Accelerated Mobile Pages Script (AMPscript), Abstract Syntax Notation One (ASN.1), Backus-Naur Form (BNF), extended BNF, Bencode, BSON, ColdFusion Markup Language (CFML), comma-separated values (CSV), Control Information Exchange Data Model (C2IEDM), Cascading Style Sheets (CSS), DARPA Agent Markup Language (DAML), Document Type Definition (DTD), Electronic Data Interchange (EDI), Extensible Data Notation (EDN), Extensible Markup Language (XML), Efficient XML Interchange (EXI), Extensible Stylesheet Language (XSL), Free Text (FT), Fixed Word Format (FWF), Cisco® Etch, Franca Interface Definition Language (IDL), Geography Markup Language (GML), Geospatial eXtensible Access Control Markup Language (GeoXACML), Geospatial Data Abstraction Library (GDAL), Guide Template Language (GTL), Handlebars template language, Hypertext Markup Language (HTML), Interactive Financial Exchange (IFX), Keyhole Markup Language (KML) and/or KML Zipped (KMZ), JAMscript, JavaScript Object Notation (JSON), JSON Schema Language, Apache® MessagePack™, Mustache template language, Ontology Interchange Language (OIL), Open Service Interface Definition, Open Financial Exchange (OFX), 
Precision Graphics Markup Language (PGML), Google® Protocol Buffers (protobuf), Quicken® Financial Exchange (QFX), Regular Language for XML Next Generation (RelaxNG) schema language, regular expressions, Resource Description Framework (RDF) schema language, RESTful Service Description Language (RSDL), Scalable Vector Graphics (SVG), Schematron, Shapefile (SHP), VBScript, text file (TXT), Web Application Description Language (WADL), Web Map Service (WMS), Web Ontology Language (OWL), Web Services Description Language (WSDL), wiki markup or Wikitext, Wireless Markup Language (WML), extensible HTML (XHTML), XPath, XQuery, XML DTD language, XML Schema Definition (XSD), XML Schema Language, XSL Transformations (XSLT), YAML (“Yet Another Markup Language” or “YAML Ain't Markup Language”), Apache® Thrift, and/or any other language discussed elsewhere herein. Additionally or alternatively, the data format for the information objects discussed herein may be a Tactical Data Link (TDL) format including, for example, J-series message format for Link 16; JREAP messages; Multifunction Advanced Data Link (MADL), Integrated Broadcast Service/Common Message Format (IBS/CMF), Over-the-Horizon Targeting Gold (OTH-T Gold), Variable Message Format (VMF), United States Message Text Format (USMTF), and any future advanced TDL formats. 
Additionally or alternatively, the data format for the information objects may be document and/or plain text, spreadsheet, graphics, and/or presentation formats including, for example, American National Standards Institute (ANSI) text, a Computer-Aided Design (CAD) application file format (e.g., “.c3d”, “.dwg”, “.dft”, “.iam”, “.iaw”, “.tct”, and/or other like file extensions), Google® Drive® formats (including associated formats for Google Docs®, Google Forms®, Google Sheets®, Google Slides®, and/or the like), Microsoft® Office® formats (e.g., “.doc”, “.ppt”, “.xls”, “.vsd”, and/or other like file extension), OpenDocument Format (including associated document, graphics, presentation, and spreadsheet formats), Open Office XML (OOXML) format (including associated document, graphics, presentation, and spreadsheet formats), Apple® Pages®, Portable Document Format (PDF), Question Object File Format (QUOX), Rich Text File (RTF), TeX and/or LaTeX (“.tex” file extension), text file (TXT), TurboTax® file (“.tax” file extension), You Need a Budget (YNAB) file, and/or any other like document or plain text file format. Additionally or alternatively, the data format for the information objects may be archive file formats that store metadata and concatenate files, and may or may not compress the files for storage. As used herein, the term “archive file” refers to a file having a file format or data format that combines or concatenates one or more files into a single file or information object. Archive files often store directory structures, error detection and correction information, arbitrary comments, and sometimes use built-in encryption. 
The term “archive format” refers to the data format or file format of an archive file, and may include, for example, archive-only formats that store metadata and concatenate files, for example, including directory or path information; compression-only formats that only compress a collection of files; software package formats that are used to create software packages (including self-installing files), disk image formats that are used to create disk images for mass storage, system recovery, and/or other like purposes; and multi-function archive formats that can store metadata, concatenate, compress, encrypt, create error detection and recovery information, and package the archive into self-extracting and self-expanding files. For the purposes of the present disclosure, the term “archive file” may refer to an archive file having any of the aforementioned archive format types. Examples of archive file formats may include Android® Package (APK); Microsoft® Application Package (APPX); Genie Timeline Backup Index File (GBP); Graphics Interchange Format (GIF); gzip (.gz) provided by the GNU Project™; Java® Archive (JAR); Mike O'Brien Pack (MPQ) archives; Open Packaging Conventions (OPC) packages including OOXML files, OpenXPS files, and/or the like; Rar Archive (RAR); Red Hat® package/installer (RPM); Google® SketchUp backup File (SKB); TAR archive (“.tar”); XPInstall or XPI installer modules; ZIP (.zip or .zipx); and/or the like.
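Several of the text-based formats enumerated above are directly supported by standard libraries. As a hypothetical illustration (the information object and its data elements are invented for this sketch), the same information object serialized to two of the listed formats, JSON and XML:

```python
import json
import xml.etree.ElementTree as ET

# A hypothetical information object: a data structure of data elements.
inob = {"title": "Example Page", "topic": "cloud computing", "score": 0.9}

# JSON representation of the information object.
as_json = json.dumps(inob, sort_keys=True)

# XML representation of the same data elements.
root = ET.Element("inob")
for name, value in inob.items():
    ET.SubElement(root, name).text = str(value)
as_xml = ET.tostring(root, encoding="unicode")

print(as_json)
print(as_xml)
```

Both representations carry the same data elements; only the data format — the arrangement, syntax, and vocabulary — differs, which is exactly the distinction the definition draws.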
- The term “data element” at least in some examples refers to an atomic state of a particular object with at least one specific property at a certain point in time, and may include one or more of a data element name or identifier, a data element definition, one or more representation terms, enumerated values or codes (e.g., metadata), and/or a list of synonyms to data elements in other metadata registries. Additionally or alternatively, a “data element” may refer to a data type that contains a single item of data. Data elements may store data, which may be referred to as the data element's content (or “content items”). Content items may include text content, attributes, properties, and/or other elements referred to as “child elements.” Additionally or alternatively, data elements may include zero or more properties and/or zero or more attributes, each of which may be defined as database objects (e.g., fields, records, and/or the like), object instances, and/or other data elements. An “attribute” may refer to a markup construct including a name-value pair that exists within a start tag or empty element tag. Attributes contain data related to their element and/or control the element's behavior.
- The term “database object” at least in some examples refers to any representation of information that is in the form of an object, attribute-value pair (AVP), key-value pair (KVP), tuple, and the like, and may include variables, data structures, functions, methods, classes, database records, database fields, database entities, associations between data and/or database entities (also referred to as a “relation”), blocks in block chain implementations, and links between blocks in block chain implementations. Furthermore, a database object may include a number of records, and each record may include a set of fields. A database object can be unstructured or have a structure defined by a DBMS (a standard database object) and/or defined by a user (a custom database object). In some implementations, a record may take different forms based on the database model being used and/or the specific database object to which it belongs. For example, a record may be: 1) a row in a table of a relational database; 2) a JavaScript Object Notation (JSON) object; 3) an Extensible Markup Language (XML) document; 4) a KVP; and the like.
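The four record forms enumerated at the end of the definition above can be shown side by side for a single record. The field names and values here are hypothetical, chosen only to make the forms concrete:

```python
import json

# 1) A row in a table of a relational database: an ordered tuple of fields.
row = ("u123", "Alice", "alice@example.com")

# 2) A JSON object: the same record as named fields.
record = {"id": "u123", "name": "Alice", "email": "alice@example.com"}
as_json = json.dumps(record)

# 3) An XML document carrying the same fields.
as_xml = ('<user id="u123"><name>Alice</name>'
          '<email>alice@example.com</email></user>')

# 4) A key-value pair (KVP): the record keyed by its identifier.
kvp = ("u123", record)
```

The record's content is identical in each form; which form a record takes depends, as the definition notes, on the database model in use and the database object the record belongs to.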
- The term “data point” at least in some examples refers to a single piece of information. The term “data set” or “dataset” at least in some examples refers to a collection of data; a “data set” or “dataset” may be formed or arranged in any type of data structure. In some examples, one or more characteristics can define or influence the structure and/or properties of a dataset such as the number and types of attributes and/or variables, and various statistical measures (e.g., standard deviation, kurtosis, and/or the like).
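The statistical measures named above (standard deviation, kurtosis) can be computed for one attribute of a dataset. This is a generic illustration with invented values; the kurtosis shown is the standard population excess kurtosis, not a formula taken from the disclosure:

```python
import statistics

# One numeric attribute of a hypothetical dataset.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population standard deviation, via the standard library.
std = statistics.pstdev(values)

def excess_kurtosis(xs):
    """Population excess kurtosis: E[(x - mean)^4] / sigma^4 - 3."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / var ** 2 - 3

print(std, excess_kurtosis(values))
```

Measures like these characterize the distribution of a dataset's attributes, which is what the definition means by statistical measures defining or influencing a dataset's properties.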
- The term “personal data,” “personally identifiable information,” “PII,” or the like refers to information that relates to an identified or identifiable individual. Additionally or alternatively, “personal data,” “personally identifiable information,” “PII,” or the like refers to information that can be used on its own or in combination with other information to identify, contact, or locate a person, or to identify an individual in context. The term “sensitive data” may refer to data related to racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, genetic data, biometric data, data concerning health, and/or data concerning a natural person's sex life or sexual orientation. The term “confidential data” refers to any form of information that a person or entity is obligated, by law or contract, to protect from unauthorized access, use, disclosure, modification, or destruction. Additionally or alternatively, “confidential data” may refer to any data owned or licensed by a person or entity that is not intentionally shared with the general public or that is classified by the person or entity with a designation that precludes sharing with the general public.
- The term “pseudonymization” or the like refers to any means of processing personal data or sensitive data in such a manner that the personal/sensitive data can no longer be attributed to a specific data subject (e.g., person or entity) without the use of additional information. The additional information may be kept separately from the personal/sensitive data and may be subject to technical and organizational measures to ensure that the personal/sensitive data are not attributed to an identified or identifiable natural person.
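Pseudonymization as defined above can be sketched with a keyed hash, where the key plays the role of the "additional information" kept separately from the pseudonymized records. The key, field names, and values below are all hypothetical, and this is one possible means of pseudonymization, not the disclosure's mechanism:

```python
import hmac
import hashlib

# The "additional information": a secret kept separately from the data,
# subject to its own technical and organizational safeguards.
SECRET_KEY = b"stored-separately-from-the-data"

def pseudonymize(value: str) -> str:
    """Replace a personal-data value with a keyed-hash pseudonym.

    Without SECRET_KEY, the pseudonym cannot be attributed back to
    the data subject, which is the property the definition requires."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()

record = {"user": "alice@example.com", "pages_viewed": 12}
safe_record = {"user": pseudonymize(record["user"]),
               "pages_viewed": record["pages_viewed"]}
```

The same input always maps to the same pseudonym, so records for one data subject can still be linked and analyzed in pseudonymized form.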
- The term “application” may refer to a complete and deployable package or environment to achieve a certain function in an operational environment. The term “AI/ML application” or the like may be an application that contains some AI/ML models and application-level descriptions. The term “machine learning” or “ML” refers to the use of computer systems implementing algorithms and/or statistical models to perform specific task(s) without using explicit instructions, but instead relying on patterns and inferences. ML algorithms build or estimate mathematical model(s) (referred to as “ML models” or the like) based on sample data (referred to as “training data,” “model training information,” or the like) in order to make predictions or decisions without being explicitly programmed to perform such tasks. Generally, an ML algorithm is a computer program that learns from experience with respect to some task and some performance measure, and an ML model may be any object or data structure created after an ML algorithm is trained with one or more training datasets. After training, an ML model may be used to make predictions on new datasets. Although the term “ML algorithm” refers to different concepts than the term “ML model,” these terms as discussed herein may be used interchangeably for the purposes of the present disclosure.
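The algorithm/model distinction drawn above can be made concrete with a deliberately minimal nearest-centroid classifier. This is a simplification invented for illustration; the ML techniques of the disclosure may be far more elaborate:

```python
from statistics import mean

def train(samples):
    """The ML *algorithm*: learns from labeled training data and
    returns a model mapping each label to the centroid of its samples."""
    grouped = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in grouped.items()}

def predict(model, features):
    """Apply the trained ML *model* (a plain data structure) to new data."""
    return min(model, key=lambda lbl: sum((a - b) ** 2
               for a, b in zip(features, model[lbl])))

training_data = [([1.0, 1.0], "low"), ([1.2, 0.8], "low"),
                 ([9.0, 9.5], "high"), ([8.8, 9.1], "high")]
model = train(training_data)       # object created by training the algorithm
print(predict(model, [9.2, 9.0]))  # -> "high"
```

Here `train` is the computer program that learns from experience (the ML algorithm), while `model` is merely the data structure it produces — the ML model — which alone suffices to make predictions on new datasets.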
- The term “network address” at least in some examples refers to an identifier for a node or host in a computer network, and may be a unique identifier across a network and/or may be unique to a locally administered portion of the network. The term “application identifier”, “application ID”, or “app ID” at least in some examples refers to an identifier that can be mapped to a specific application or application instance; in the context of 3GPP 5G/NR systems, an “application identifier” at least in some examples refers to an identifier that can be mapped to a specific application traffic detection rule. Additionally or alternatively, the term “application identifier”, “application ID”, or “app ID” at least in some examples refers to a collection of entry points and/or data structures that an application program can access when translated into an application executable. The term “endpoint address” at least in some examples refers to an address used to determine the host/authority part of a target URI, where the target URI is used to access an NF service (e.g., to invoke service operations) of an NF service producer or for notifications to an NF service consumer.
- Examples of identifiers that could be used as a network address, app ID, endpoint address, and/or any other identifier discussed herein include a Closed Access Group Identifier (CAG-ID), Bluetooth hardware device address (BD_ADDR), a cellular network address (e.g., Access Point Name (APN), AMF identifier (ID), AF-Service-Identifier, Edge Application Server (EAS) ID, Data Network Access Identifier (DNAI), Data Network Name (DNN), EPS Bearer Identity (EBI), Equipment Identity Register (EIR) and/or 5G-EIR, Extended Unique Identifier (EUI), Group ID for Network Selection (GIN), Generic Public Subscription Identifier (GPSI), Globally Unique AMF Identifier (GUAMI), Globally Unique Temporary Identifier (GUTI) and/or 5G-GUTI, GPRS tunneling protocol (GTP) tunnel endpoint identifier (TEID) (GTP), Radio Network Temporary Identifier (RNTI) (and variants thereof), International Mobile Equipment Identity (IMEI), IMEI Type Allocation Code (IMEA/TAC), International Mobile Subscriber Identity (IMSI), IMSI software version (IMSISV), permanent equipment identifier (PEI), Local Area Data Network (LADN) DNN, Mobile Subscriber Identification Number (MSIN), Mobile Subscriber/Station ISDN Number (MSISDN), network identifier (NID), Network Slice Instance (NSI) ID, Network Slice Selection Assistance Information (NSSAI), Single NSSAI (S-NSSAI), Permanent Equipment Identifier (PEI), Public Land Mobile Network (PLMN) ID, QoS Flow ID (QFI) and/or 5G QoS Identifier (5QI), RAN ID, Routing Indicator, SMS Function (SMSF) ID, Stand-alone Non-Public Network (SNPN) ID, Subscription Concealed Identifier (SUCI), Subscription Permanent Identifier (SUPI), Temporary Mobile Subscriber Identity (TMSI) and variants thereof, UE Access Category and Identity, and/or other cellular network related identifiers), a connection endpoint identifier (CEPID), an email address, Enterprise Application Server (EAS) ID, an endpoint address, an Electronic Product Code (EPC) as defined by the EPCglobal Tag Data Standard, a 
Fully Qualified Domain Name (FQDN), flow ID, flow label (e.g., IPv6 flow label, Flexilink flow label, and the like), an ICMP identifier, Intelligence-Defined Network (IDN) identifier(s), an internet protocol (IP) address in an IP network (e.g., IP version 4 (IPv4), IP version 6 (IPv6), and/or the like), an Information-Centric Networking (ICN) name (data packet identifier), an internet packet exchange (IPX) address, Local Area Network (LAN) ID, a media access control (MAC) address, Multiprotocol Label Switching (MPLS) labels, personal area network (PAN) ID, a port number (e.g., Transmission Control Protocol (TCP) port number, User Datagram Protocol (UDP) port number), Preferred Path Route (PPR) Identifier, PPR segment ID (SID), QUIC connection ID, RFID tag, service set identifier (SSID) and variants thereof, socket address, telephone numbers in a public switched telephone network (PSTN), universally unique identifier (UUID) (e.g., as specified in ISO/IEC 11578:1996), a Uniform Resource Locator (URL) and/or Uniform Resource Identifier (URI), Virtual LAN (VLAN) ID, an X.21 address, an X.25 address, Zigbee® ID, Zigbee® Device Network ID, and/or any other suitable network address and components thereof.
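A few of the identifier types above can be handled directly with standard-library tooling; the specific addresses and URL below are hypothetical examples, not values from the disclosure:

```python
import ipaddress
from urllib.parse import urlparse

# IPv4 and IPv6 network addresses.
v4 = ipaddress.ip_address("192.0.2.10")
v6 = ipaddress.ip_address("2001:db8::1")
print(v4.version, v6.version)  # 4 6

# A URL decomposed into its host/authority, port, and path components,
# as is done when resolving an endpoint address from a target URI.
url = urlparse("https://example.com:8443/content/item?id=42")
print(url.hostname, url.port, url.path)
```

Parsing an identifier into its components this way is typical of how a network address or endpoint address is used to locate the host/authority part of a target URI.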
- The term “organization” or “org” refers to an entity comprising one or more people and/or users and having a particular purpose, such as, for example, a company, an enterprise, an institution, an association, a regulatory body, a government agency, a standards body, and/or the like. Additionally or alternatively, an “org” may refer to an identifier that represents an entity/organization and associated data within an instance and/or data structure.
- The term “intent data” may refer to data that is collected about users' observed behavior based on web content consumption, which provides insights into their interests and indicates potential intent to take an action.
- The term “engagement” refers to a measurable or observable user interaction with a content item or information object. The term “engagement rate” refers to the level of user interaction that is generated from a content item or information object. For purposes of the present disclosure, the term “engagement” may refer to the amount of interactions with content or information objects generated by an organization or entity, which may be based on the aggregate engagement of users associated with that organization or entity.
- The term “session” refers to a temporary and interactive information interchange between two or more communicating devices, two or more application instances, between a computer and user, or between any two or more entities or elements. Additionally or alternatively, the term “session” may refer to a connectivity service or other service that provides or enables the exchange of data between two entities or elements. Additionally or alternatively, the term “session” may refer to a unit of measurement of a user's actions taken within a period of time and/or with regard to completion of a task. The term “network session” may refer to a session between two or more communicating devices over a network, and a “web session” may refer to a session between two or more communicating devices over the web or the Internet. A “session identifier,” “session ID,” or “session token” refers to a piece of data that is used in network communications to identify a session and/or a series of message exchanges.
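A session identifier as defined above can be sketched as a cryptographically strong random token keyed to server-side session state. This is a generic illustration with invented names, not the disclosure's session mechanism:

```python
import secrets

# Server-side session store: token -> session state (in-memory sketch).
sessions = {}

def open_session(user_id: str) -> str:
    """Create a session and return its identifier (session token)."""
    token = secrets.token_urlsafe(32)  # unguessable session ID
    sessions[token] = {"user": user_id, "events": []}
    return token

def lookup(token: str):
    """Identify the session for a series of message exchanges, or None."""
    return sessions.get(token)

tok = open_session("u123")
print(lookup(tok)["user"])
```

Each message in the interchange carries the token, letting the server attribute the series of exchanges, and any user actions taken within the period, to one session.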
- Although the various example embodiments and example implementations have been described with reference to specific exemplary aspects, it will be evident that various modifications and changes may be made to these aspects without departing from the broader scope of the present disclosure. Many of the arrangements and processes described herein can be used in combination or in parallel implementations to provide greater bandwidth/throughput and to support edge services selections that can be made available to the edge systems being serviced. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense. The accompanying drawings that form a part hereof show, by way of illustration, and not of limitation, specific aspects in which the subject matter may be practiced. The aspects illustrated are described in sufficient detail to enable those skilled in the art to practice the teachings disclosed herein. Other aspects may be utilized and derived therefrom, such that structural and logical substitutions and changes may be made without departing from the scope of this disclosure. This Detailed Description, therefore, is not to be taken in a limiting sense, and the scope of various aspects is defined only by the appended claims, along with the full range of equivalents to which such claims are entitled.
- Such aspects of the inventive subject matter may be referred to herein, individually and/or collectively, merely for convenience and without intending to voluntarily limit the scope of this application to any single aspect or inventive concept if more than one is in fact disclosed. Thus, although specific aspects have been illustrated and described herein, it should be appreciated that any arrangement calculated to achieve the same purpose may be substituted for the specific aspects shown. This disclosure is intended to cover any and all adaptations or variations of various aspects.
- Combinations of the above aspects and other aspects not specifically described herein will be apparent to those of skill in the art upon reviewing the above description.
- For the sake of convenience, operations may be described as various interconnected or coupled functional blocks or diagrams. However, there may be cases where these functional blocks or diagrams may be equivalently aggregated into a single logic device, program or operation with unclear boundaries. Having described and illustrated the principles of a preferred embodiment, it should be apparent that the embodiments may be modified in arrangement and detail without departing from such principles. Claim is made to all modifications and variations coming within the spirit and scope of the following claims.
Claims (20)
1. One or more non-transitory computer readable media (NTCRM) comprising instructions for machine learning (ML), wherein execution of the instructions by one or more processors of a computing device is to cause the computing device to:
identify one or more features from training data comprising a set of information objects (InObs) with known classifications, each InOb of the set of InObs comprising one or more nodes, the one or more features including structural semantics for respective InObs of the set of InObs, the structural semantics comprising a data structure representative of relationships between the one or more nodes of the respective InObs;
train an ML model to identify classifications of InObs not among the set of InObs based on the features identified from the training data and the known classifications of the set of InObs;
identify features from an unclassified InOb with an unknown classification, the identified features of the unclassified InOb including a set of nodes of the unclassified InOb; and
apply the identified features of the unclassified InOb to the trained ML model to predict a classification for the unclassified InOb based on structural semantics of the unclassified InOb, the structural semantics of the unclassified InOb being based on relationships among nodes of the set of nodes.
2. The one or more NTCRM of claim 1 , wherein execution of the instructions is to cause the computing device to:
generate a first set of vectors representing the features of the set of InObs;
use the first set of vectors and known classifications of the set of InObs to train the ML model;
generate a second set of vectors representing the features of the unclassified InOb; and
apply the second set of vectors to the trained ML model to classify the unclassified InOb.
3. The one or more NTCRM of claim 1 , wherein the structural semantics of the respective InObs includes relationships between nodes making up individual InObs and relationships between nodes of different InObs.
4. The one or more NTCRM of claim 3 , wherein execution of the instructions is to cause the computing device to:
analyze the nodes of the unclassified InOb to identify links between nodes on the same InOb and links with nodes on other InObs; and
determine the structural semantics of the unclassified InOb based on the identified links.
5. The one or more NTCRM of claim 1 , wherein the one or more features comprise content semantics of the one or more nodes of the set of InObs.
6. The one or more NTCRM of claim 5 , wherein execution of the instructions is to cause the computing device to:
analyze the nodes of the unclassified InOb to identify content types and topics in the nodes; and
identify the content semantics of the unclassified InOb based on the identified content types and topics in the nodes of the unclassified InOb.
7. The one or more NTCRM of claim 1 , wherein the one or more features further comprise content interaction behavior features with InObs in the one or more nodes of the set of InObs.
8. The one or more NTCRM of claim 7 , wherein execution of the instructions is to cause the computing device to:
identify user interaction events generated by the one or more nodes based on interactions with the one or more nodes of the set of InObs;
determine user interaction types based on the user interaction events; and
identify the content interaction behavior features based on the user interaction types of the set of InObs.
9. The one or more NTCRM of claim 1 , wherein the one or more features further comprise types of users accessing the one or more nodes of the set of InObs, the types of users including device types used for accessing the one or more nodes.
10. The one or more NTCRM of claim 9 , wherein execution of the instructions is to cause the computing device to:
identify network session events generated by the one or more nodes based on accesses of the one or more nodes of the InObs;
determine user data from the network session events; and
identify the types of users accessing the InObs based on the determined user data.
11. An apparatus, comprising:
memory circuitry to store instructions; and
processor circuitry communicatively coupled to the memory circuitry, wherein the processor circuitry is to execute the instructions to:
identify, using a trained machine learning (ML) model, one or more structural features of an information object (InOb), the trained ML model being trained on a training data set including a set of InObs, each InOb of the set of InObs comprising one or more nodes, and the trained ML model includes a data object indicating structural features of respective InObs of the set of InObs, the structural features are relationships between the one or more nodes of the respective InObs, and the data object is a representation of the relationships; and
predict a classification for the InOb based on the identified one or more structural features of the InOb.
12. The apparatus of claim 11 , wherein the processor circuitry is to execute the instructions to:
identify user interaction events generated by the InOb or users that interact with the InOb;
determine user interaction types based on the user interaction events;
identify one or more content interaction behavior features for the InOb based on the determined user interaction types, the one or more content interaction behavior features being patterns of user interaction with content of the InOb.
13. The apparatus of claim 12 , wherein the processor circuitry is to execute the instructions to:
generate a structural feature vector comprising the one or more structural features of the InOb;
generate a content interaction behavior feature vector comprising the one or more content interaction behavior features of the InOb; and
feed the structural feature vector and the content interaction behavior feature vector into the ML model to predict the classification for the InOb.
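A non-limiting illustrative sketch of the claim 13 pipeline follows; the concatenation of the two vectors and the toy linear scorer standing in for the trained ML model are assumptions for illustration, and all values are made up:

```python
import numpy as np

# Sketch of claim 13: a structural feature vector and a content interaction
# behavior feature vector are generated and fed together into the model.
structural_vec = np.array([0.2, 0.5, 0.1])     # structural features of the InOb
behavior_vec = np.array([0.7, 0.3, 0.9, 0.4])  # interaction behavior features

combined = np.concatenate([structural_vec, behavior_vec])

# Stand-in "trained model": a fixed weight vector assumed learned elsewhere.
weights = np.array([0.5, -0.2, 0.1, 0.3, 0.4, -0.1, 0.2])
score = float(combined @ weights)
predicted_class = int(score > 0.5)             # binary classification for the InOb
```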
14. The apparatus of claim 13, wherein the user interaction events indicate an event type and an engagement metric, and each content interaction behavior feature in the content interaction behavior feature vector represents a percentage or average value of the engagement metric for an associated event type for a time period.
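The percentage-per-event-type features of claim 14 can be pictured with a non-limiting sketch; the event tuples and metric values below are illustrative assumptions:

```python
from collections import defaultdict

# Hypothetical user interaction events: (event type, engagement metric),
# e.g. seconds of engagement within one time period.
events = [("scroll", 12.0), ("click", 3.0), ("scroll", 8.0), ("dwell", 30.0)]

totals = defaultdict(float)
for event_type, metric in events:
    totals[event_type] += metric

grand_total = sum(totals.values())  # 53.0 for this toy data

# One feature per event type: its percentage of the total engagement
# metric for the time period covered by the events.
features = {t: 100.0 * totals[t] / grand_total for t in sorted(totals)}
```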
15. The apparatus of claim 13, wherein the one or more content interaction behavior features include one or more of a time of day, day of week, date, total amount of content consumed by respective users, percentages of different device types used for accessing the InOb, duration of time users spend on individual InObs of the InOb, total engagement the respective users have on the individual InObs, a number of distinct user profiles accessing the individual InObs versus a total number of user interaction events for the individual InObs, a dwell time, a scroll depth, a scroll velocity, and variance in content consumption over time.
16. The apparatus of claim 13, wherein, to generate the structural feature vector, the processor circuitry is to execute the instructions to:
generate respective structural feature vectors for each individual InOb of the InOb; and
average the respective structural feature vectors for each individual InOb to obtain the structural feature vector for the InOb.
17. The apparatus of claim 13, wherein, to generate the content interaction behavior feature vector, the processor circuitry is to execute the instructions to:
generate respective content interaction behavior feature vectors for each individual InOb of the InOb; and
average the respective content interaction behavior feature vectors for each individual InOb to obtain the content interaction behavior feature vector for the InOb.
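The element-wise averaging recited in claims 16 and 17 reduces per-page vectors to one vector for the containing InOb. A non-limiting sketch, with toy values (reading each individual InOb as, e.g., a web page within a website):

```python
import numpy as np

# Hypothetical feature vectors, one per individual InOb of the InOb.
per_page_vectors = np.array([
    [0.1, 0.4, 0.2],
    [0.3, 0.2, 0.6],
    [0.2, 0.6, 0.1],
])

# Element-wise average yields the single feature vector for the InOb,
# applicable to both the structural and the behavior feature vectors.
inob_vector = per_page_vectors.mean(axis=0)
```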
18. The apparatus of claim 12, wherein the processor circuitry is to execute the instructions to:
generate the one or more content interaction behavior features for the InOb based on types of businesses accessing InObs of the InOb.
19. The apparatus of claim 11, wherein the processor circuitry is to execute the instructions to:
determine the one or more structural features of the InOb based on links between InObs of the InOb and links to other InObs of other InObs from the InObs of the InOb.
20. The apparatus of claim 19, wherein the processor circuitry is to execute the instructions to:
analyze the InObs of the InOb to identify the links between the InObs of the InOb and the links to the other InObs.
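The link analysis of claims 19-20 can be sketched as counting links between pages (InObs) of a site and outbound links to other sites; the URL scheme and the two counters below are illustrative assumptions, not the claimed method:

```python
# Hypothetical link map: each page of "siteA" lists the pages it links to.
site_pages = {
    "siteA/page1": ["siteA/page2", "siteB/home"],
    "siteA/page2": ["siteA/page1", "siteA/page1"],
}

internal = external = 0
for page, links in site_pages.items():
    site = page.split("/", 1)[0]
    for link in links:
        if link.split("/", 1)[0] == site:
            internal += 1   # link between InObs of the same InOb
        else:
            external += 1   # link to an InOb of another InOb

# Toy structural features derived from the identified links.
structural_features = [internal, external, internal / (internal + external)]
```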
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18/168,440 US20230232052A1 (en) | 2014-09-26 | 2023-02-13 | Machine learning techniques for detecting surges in content consumption |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/498,056 US9940634B1 (en) | 2014-09-26 | 2014-09-26 | Content consumption monitor |
US14/981,529 US20160132906A1 (en) | 2014-09-26 | 2015-12-28 | Surge detector for content consumption |
US17/189,073 US11589083B2 (en) | 2014-09-26 | 2021-03-01 | Machine learning techniques for detecting surges in content consumption |
US18/168,440 US20230232052A1 (en) | 2014-09-26 | 2023-02-13 | Machine learning techniques for detecting surges in content consumption |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/189,073 Continuation US11589083B2 (en) | 2014-09-26 | 2021-03-01 | Machine learning techniques for detecting surges in content consumption |
Publications (1)
Publication Number | Publication Date |
---|---|
US20230232052A1 true US20230232052A1 (en) | 2023-07-20 |
Family
ID=85222640
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/189,073 Active US11589083B2 (en) | 2014-09-26 | 2021-03-01 | Machine learning techniques for detecting surges in content consumption |
US18/168,440 Abandoned US20230232052A1 (en) | 2014-09-26 | 2023-02-13 | Machine learning techniques for detecting surges in content consumption |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/189,073 Active US11589083B2 (en) | 2014-09-26 | 2021-03-01 | Machine learning techniques for detecting surges in content consumption |
Country Status (1)
Country | Link |
---|---|
US (2) | US11589083B2 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11816465B2 (en) | 2013-03-15 | 2023-11-14 | Ei Electronics Llc | Devices, systems and methods for tracking and upgrading firmware in intelligent electronic devices |
US11734704B2 (en) * | 2018-02-17 | 2023-08-22 | Ei Electronics Llc | Devices, systems and methods for the collection of meter data in a common, globally accessible, group of servers, to provide simpler configuration, collection, viewing, and analysis of the meter data |
US11863589B2 (en) | 2019-06-07 | 2024-01-02 | Ei Electronics Llc | Enterprise security in meters |
US20220335220A1 (en) * | 2021-04-19 | 2022-10-20 | Pavan Korada | Algorithmic topic clustering of data for real-time prediction and look-alike modeling |
US20230133057A1 (en) * | 2021-10-29 | 2023-05-04 | Keysight Technologies, Inc. | System and method for configuring network elements in a design network topology |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090092299A1 (en) * | 2007-10-03 | 2009-04-09 | Siemens Medical Solutions Usa, Inc. | System and Method for Joint Classification Using Feature Space Cluster Labels |
US20110258049A1 (en) * | 2005-09-14 | 2011-10-20 | Jorey Ramer | Integrated Advertising System |
US20130132468A1 (en) * | 2011-11-22 | 2013-05-23 | Olurotimi Azeez | Discovering, organizing, accessing and sharing information in a cloud environment |
US20130166485A1 (en) * | 2011-12-23 | 2013-06-27 | Florian Hoffmann | Automated observational decision tree classifier |
US20140188830A1 (en) * | 2012-12-27 | 2014-07-03 | Sas Institute Inc. | Social Community Identification for Automatic Document Classification |
Family Cites Families (81)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7185065B1 (en) | 2000-10-11 | 2007-02-27 | Buzzmetrics Ltd | System and method for scoring electronic messages |
US20020173971A1 (en) | 2001-03-28 | 2002-11-21 | Stirpe Paul Alan | System, method and application of ontology driven inferencing-based personalization systems |
US20030154398A1 (en) | 2002-02-08 | 2003-08-14 | Eaton Eric Thomas | System for providing continuity between session clients and method therefor |
US7346606B2 (en) | 2003-06-30 | 2008-03-18 | Google, Inc. | Rendering advertisements with documents having one or more topics using user topic interest |
US8321278B2 (en) | 2003-09-30 | 2012-11-27 | Google Inc. | Targeted advertisements based on user profiles and page profile |
WO2006036781A2 (en) * | 2004-09-22 | 2006-04-06 | Perfect Market Technologies, Inc. | Search engine using user intent |
US20080126178A1 (en) | 2005-09-10 | 2008-05-29 | Moore James F | Surge-Based Online Advertising |
US20070124202A1 (en) | 2005-11-30 | 2007-05-31 | Chintano, Inc. | Systems and methods for collecting data and measuring user behavior when viewing online content |
US7949646B1 (en) | 2005-12-23 | 2011-05-24 | At&T Intellectual Property Ii, L.P. | Method and apparatus for building sales tools by mining data from websites |
US7835911B2 (en) | 2005-12-30 | 2010-11-16 | Nuance Communications, Inc. | Method and system for automatically building natural language understanding models |
US8019777B2 (en) | 2006-03-16 | 2011-09-13 | Nexify, Inc. | Digital content personalization method and system |
WO2008072093A2 (en) | 2006-12-13 | 2008-06-19 | Quickplay Media Inc. | Mobile media platform |
US8745647B1 (en) | 2006-12-26 | 2014-06-03 | Visible Measures Corp. | Method and system for internet video and rich media behavioral measurement |
US20080313019A1 (en) | 2007-06-14 | 2008-12-18 | Jeffers Martin C | System and method for extracting contact information from website traffic statistics |
US8725712B2 (en) | 2007-07-16 | 2014-05-13 | Nokia Corporation | Context based media content presentation |
US7860878B2 (en) | 2008-02-25 | 2010-12-28 | Yahoo! Inc. | Prioritizing media assets for publication |
US20130151687A1 (en) | 2008-05-28 | 2013-06-13 | Adobe Systems Incorporated | Systems and Methods for Monitoring Content Consumption |
US8515937B1 (en) | 2008-06-30 | 2013-08-20 | Alexa Internet | Automated identification and assessment of keywords capable of driving traffic to particular sites |
WO2010048430A2 (en) | 2008-10-22 | 2010-04-29 | Fwix, Inc. | System and method for identifying trends in web feeds collected from various content servers |
US20140108376A1 (en) * | 2008-11-26 | 2014-04-17 | Google Inc. | Enhanced detection of like resources |
US8069160B2 (en) | 2008-12-24 | 2011-11-29 | Yahoo! Inc. | System and method for dynamically monetizing keyword values |
US8392543B1 (en) | 2009-01-30 | 2013-03-05 | Sprint Communications Company L.P. | Synchronization of content change across multiple devices |
US8412847B2 (en) | 2009-11-02 | 2013-04-02 | Demandbase, Inc. | Mapping network addresses to organizations |
US8990105B1 (en) * | 2010-01-07 | 2015-03-24 | Magnetic Media Online, Inc. | Systems, methods, and media for targeting advertisements based on user search information |
US8392252B2 (en) | 2010-03-03 | 2013-03-05 | Scientific Targeting Llc | Scientific targeting for advertisement and content selection, distribution, and creation |
US8909629B2 (en) | 2010-03-22 | 2014-12-09 | Google Inc. | Personalized location tags |
US8949834B2 (en) | 2010-04-07 | 2015-02-03 | Yahoo! Inc. | Modeling and scheduling asynchronous incremental workflows |
US8380716B2 (en) | 2010-05-13 | 2013-02-19 | Jan Mirus | Mind map with data feed linkage and social network interaction |
US8671423B1 (en) | 2010-06-07 | 2014-03-11 | Purplecomm Inc. | Method for monitoring and controlling viewing preferences of a user |
US20110320715A1 (en) | 2010-06-23 | 2011-12-29 | Microsoft Corporation | Identifying trending content items using content item histograms |
US8799260B2 (en) | 2010-12-17 | 2014-08-05 | Yahoo! Inc. | Method and system for generating web pages for topics unassociated with a dominant URL |
US10445782B2 (en) | 2010-12-22 | 2019-10-15 | Facebook, Inc. | Expanded tracking and advertising targeting of social networking users |
US8583786B2 (en) | 2011-01-21 | 2013-11-12 | Verizon Patent And Licensing Inc. | Systems and methods for rating a content based on trends |
US8700543B2 (en) | 2011-02-12 | 2014-04-15 | Red Contexto Ltd. | Web page analysis system for computerized derivation of webpage audience characteristics |
US8543454B2 (en) | 2011-02-18 | 2013-09-24 | Bluefin Labs, Inc. | Generating audience response metrics and ratings from social interest in time-based media |
US9836455B2 (en) | 2011-02-23 | 2017-12-05 | New York University | Apparatus, method and computer-accessible medium for explaining classifications of documents |
US8566152B1 (en) | 2011-06-22 | 2013-10-22 | Google Inc. | Delivering content to users based on advertisement interaction type |
WO2013010104A1 (en) | 2011-07-13 | 2013-01-17 | Bluefin Labs, Inc. | Topic and time based media affinity estimation |
US20130066677A1 (en) | 2011-09-12 | 2013-03-14 | Scott William Killoh | System and method for media and commerce management |
US8700766B2 (en) | 2011-09-13 | 2014-04-15 | Google Inc. | System and method for indirectly classifying a computer based on usage |
US10217117B2 (en) | 2011-09-15 | 2019-02-26 | Stephan HEATH | System and method for social networking interactions using online consumer browsing behavior, buying patterns, advertisements and affiliate advertising, for promotions, online coupons, mobile services, products, goods and services, entertainment and auctions, with geospatial mapping technology |
US9152970B1 (en) | 2011-09-27 | 2015-10-06 | Amazon Technologies, Inc. | Remote co-browsing session management |
US9588580B2 (en) | 2011-09-30 | 2017-03-07 | Dejoto Technologies Llc | System and method for single domain and multi-domain decision aid for product on the web |
US9177142B2 (en) | 2011-10-14 | 2015-11-03 | Trustwave Holdings, Inc. | Identification of electronic documents that are likely to contain embedded malware |
US20130124193A1 (en) | 2011-11-15 | 2013-05-16 | Business Objects Software Limited | System and Method Implementing a Text Analysis Service |
US8976955B2 (en) | 2011-11-28 | 2015-03-10 | Nice-Systems Ltd. | System and method for tracking web interactions with real time analytics |
US9128896B2 (en) | 2011-12-20 | 2015-09-08 | Bitly, Inc. | Systems and methods for identifying phrases in digital content that are trending |
US9202227B2 (en) | 2012-02-07 | 2015-12-01 | 6 Sense Insights, Inc. | Sales prediction systems and methods |
US8873833B2 (en) | 2012-02-17 | 2014-10-28 | Sony Corporation | System and method for effectively performing a scene representation procedure |
US9514461B2 (en) | 2012-02-29 | 2016-12-06 | Adobe Systems Incorporated | Systems and methods for analysis of content items |
US20130297338A1 (en) | 2012-05-07 | 2013-11-07 | Ingroove, Inc. | Method for Evaluating the Health of a Website |
US8856924B2 (en) | 2012-08-07 | 2014-10-07 | Cloudflare, Inc. | Mitigating a denial-of-service attack in a cloud-based proxy service |
WO2014054052A2 (en) | 2012-10-01 | 2014-04-10 | Parag Kulkarni | Context based co-operative learning system and method for representing thematic relationships |
US9141722B2 (en) | 2012-10-02 | 2015-09-22 | Google Inc. | Access to network content |
US10133812B2 (en) | 2012-12-05 | 2018-11-20 | Grapevine6 Inc. | System and method for finding and prioritizing content based on user specific interest profiles |
US20140236669A1 (en) | 2013-02-18 | 2014-08-21 | PlaceIQ, Inc. | Apparatus and Method for Identifying and Employing Visitation Rates |
GB2509766A (en) | 2013-01-14 | 2014-07-16 | Wonga Technology Ltd | Website analysis |
US9449002B2 (en) | 2013-01-16 | 2016-09-20 | Althea Systems and Software Pvt. Ltd | System and method to retrieve relevant multimedia content for a trending topic |
US9706008B2 (en) | 2013-03-15 | 2017-07-11 | Excalibur Ip, Llc | Method and system for efficient matching of user profiles with audience segments |
US10491694B2 (en) | 2013-03-15 | 2019-11-26 | Oath Inc. | Method and system for measuring user engagement using click/skip in content stream using a probability model |
US20140278916A1 (en) | 2013-03-15 | 2014-09-18 | Adchemy, Inc. | Building Product-Based Advertising Campaigns |
US20140280890A1 (en) | 2013-03-15 | 2014-09-18 | Yahoo! Inc. | Method and system for measuring user engagement using scroll dwell time |
US20150074131A1 (en) | 2013-09-09 | 2015-03-12 | Mobitv, Inc. | Leveraging social trends to identify relevant content |
US10430806B2 (en) | 2013-10-15 | 2019-10-01 | Adobe Inc. | Input/output interface for contextual analysis engine |
US9471671B1 (en) | 2013-12-18 | 2016-10-18 | Google Inc. | Identifying and/or recommending relevant media content |
US20150309965A1 (en) | 2014-04-28 | 2015-10-29 | Elwha Llc | Methods, systems, and devices for outcome prediction of text submission to network based on corpora analysis |
US9779144B1 (en) | 2014-08-02 | 2017-10-03 | Google Inc. | Identifying a level of relevancy of a keyword cluster related to an event category for a given time period relative to the event |
US20190050874A1 (en) | 2014-09-26 | 2019-02-14 | Bombora, Inc. | Associating ip addresses with locations where users access content |
US9940634B1 (en) | 2014-09-26 | 2018-04-10 | Bombora, Inc. | Content consumption monitor |
US10402465B1 (en) * | 2014-09-26 | 2019-09-03 | Amazon Technologies, Inc. | Content authority ranking using browsing behavior |
US20180365710A1 (en) | 2014-09-26 | 2018-12-20 | Bombora, Inc. | Website interest detector |
US20170364931A1 (en) | 2014-09-26 | 2017-12-21 | Bombora, Inc. | Distributed model optimizer for content consumption |
US20160132906A1 (en) | 2014-09-26 | 2016-05-12 | Bombora, Inc. | Surge detector for content consumption |
US9514368B2 (en) | 2014-11-14 | 2016-12-06 | Telecommunications Systems, Inc. | Contextual information of visual media |
US9667733B2 (en) | 2015-03-04 | 2017-05-30 | Adobe Systems Incorporated | Determining relevant content for keyword extraction |
US20160371725A1 (en) | 2015-06-18 | 2016-12-22 | Duy Nguyen | Campaign optimization system |
US9521157B1 (en) | 2015-06-24 | 2016-12-13 | Bank Of America Corporation | Identifying and assessing malicious resources |
EP3398146B1 (en) | 2015-12-28 | 2021-12-01 | Bombora, Inc. | Surge detector for content consumption |
US10839415B2 (en) | 2016-10-10 | 2020-11-17 | International Business Machines Corporation | Automated offer generation responsive to behavior attribute |
US10642889B2 (en) | 2017-02-20 | 2020-05-05 | Gong I.O Ltd. | Unsupervised automated topic detection, segmentation and labeling of conversations |
US20190294642A1 (en) | 2017-08-24 | 2019-09-26 | Bombora, Inc. | Website fingerprinting |
- 2021-03-01 US US17/189,073 patent US11589083B2 (en), status: Active
- 2023-02-13 US US18/168,440 patent US20230232052A1 (en), status: Abandoned
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210357531A1 (en) * | 2019-02-05 | 2021-11-18 | Anagog Ltd. | Privacy preserving location tracking |
US11966496B2 (en) * | 2019-02-05 | 2024-04-23 | Anagog Ltd. | Privacy preserving location tracking |
US20220311528A1 (en) * | 2021-03-23 | 2022-09-29 | Sling TV L.L.C. | Systems and methods for unifying local channels with over-the-top services |
Also Published As
Publication number | Publication date |
---|---|
US20220279220A1 (en) | 2022-09-01 |
US11589083B2 (en) | 2023-02-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11631015B2 (en) | Machine learning techniques for internet protocol address to domain name resolution systems | |
US20220188700A1 (en) | Distributed machine learning hyperparameter optimization | |
US11556942B2 (en) | Content consumption monitor | |
US20230232052A1 (en) | Machine learning techniques for detecting surges in content consumption | |
US20220188699A1 (en) | Machine learning techniques for web resource fingerprinting | |
US20210365445A1 (en) | Technologies for collecting, managing, and providing contact tracing information for infectious disease response and mitigation | |
Capdevila et al. | GeoSRS: A hybrid social recommender system for geolocated data | |
US20230012803A1 (en) | Systems and Methods for Analyzing a List of Items Using Machine Learning Models | |
US20120296991A1 (en) | Adaptive system architecture for identifying popular topics from messages | |
US20140129331A1 (en) | System and method for predicting momentum of activities of a targeted audience for automatically optimizing placement of promotional items or content in a network environment | |
US10411985B1 (en) | Network traffic monitoring for virtual machines | |
WO2014130843A1 (en) | System and method for revealing correlations between data streams | |
AU2017348460A1 (en) | Systems and methods for monitoring and analyzing computer and network activity | |
US20190019222A1 (en) | User/group servicing based on deep network analysis | |
US20220188698A1 (en) | Machine learning techniques for web resource interest detection | |
US20220230078A1 (en) | Machine learning techniques for associating network addresses with information object access locations | |
US20220358240A1 (en) | Adaptive data privacy platform | |
US20190180325A1 (en) | Systems and methods for ingesting and processing data in a data processing environment | |
Dong et al. | PPM: A privacy prediction model for online social networks | |
EP3971811A1 (en) | Privacy supporting messaging systems and methods | |
US10003620B2 (en) | Collaborative analytics with edge devices | |
JP6683681B2 (en) | Determining the contribution of various user interactions to conversions | |
CA2864127A1 (en) | Systems and methods for recommending advertisement placement based on in network and cross network online activity analysis | |
JP2023524362A (en) | pattern-based classification | |
WO2022216753A1 (en) | Distributed machine learning hyperparameter optimization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| AS | Assignment | Owner name: RUNWAY GROWTH CREDIT FUND INC., ILLINOIS; Free format text: SECURITY INTEREST;ASSIGNOR:BOMBORA, INC.;REEL/FRAME:065955/0817; Effective date: 20210331 |
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |