WO2021181900A1 - Target user feature extraction method, target user feature extraction system, and target user feature extraction server - Google Patents


Info

Publication number
WO2021181900A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
user
target
poster
feature extraction
Prior art date
Application number
PCT/JP2021/001917
Other languages
French (fr)
Japanese (ja)
Inventor
江里子 佐藤
林 秀樹
Original Assignee
株式会社日立ハイテク
Priority date
Filing date
Publication date
Application filed by 株式会社日立ハイテク
Priority to CN202180008023.5A (published as CN114902196A)
Priority to DE112021000337.2T (published as DE112021000337T5)
Publication of WO2021181900A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241: Advertisements
    • G06Q30/0251: Targeted advertisements
    • G06Q30/0255: Targeted advertisements based on user history

Definitions

  • the present invention relates to a technique for extracting a specific user feature from the history information of a user who browses a website.
  • Patent Documents 1 and 2 are known as a web analysis technique for analyzing the preference of a user who accesses a website.
  • Patent Document 1 discloses a technique for estimating a recommendation candidate item for each user by referring to the user information storage unit and the user history information storage unit. Patent Document 2 discloses a technique in which the user's preference distribution is analyzed from the selection history of the items selected by the user, a recommendation index close to the center of the preference distribution and away from the preference distribution shape is calculated, and recommended items are displayed based on the calculated recommendation index.
  • A poster who provides information such as content to a website may intend to create a new business or to cultivate existing users, depending on the users who access the provided content.
  • The characteristics of the users targeted by the poster are hereinafter referred to as user characteristics.
  • For these user characteristics, the items to be emphasized depend on the preference and target (acquisition target) of the poster.
  • In Patent Document 1 of the above-mentioned conventional example, after a user's preference is determined, a plan for presenting information to the user is calculated from the user's behavior history.
  • However, the preference determination unit disclosed in Patent Document 1 only uses the user's attribute information and history information to determine the degree of similarity, and the intention (or preference) regarding the target on the side providing the item (content) is not taken into account.
  • In Patent Document 2 of the conventional example, a user's preference is analyzed, a recommendation index away from the preference distribution shape is calculated, and an unexpected item is provided.
  • However, Patent Document 2 also does not consider the target intended by the provider of the item.
  • an object of the present invention is to extract user characteristics that a poster who provides information to a website wants to acquire from the history of users who have accessed the website.
  • The present invention is a target user feature extraction method in which a computer having a processor and a memory extracts user features targeted by a poster from history information of access to the contents of a web server. The method includes:
  • a preference acquisition step in which the computer accepts the poster to be extracted and acquires the information of the users targeted by the poster as the target type;
  • a target calculation step in which the computer calculates, from the target type of the poster, the item of the data to be extracted and a range of values of the item; and an access feature extraction step of calculating an access feature amount based on the range of values of the item from the target data and the poster data.
  • the present invention makes it possible to extract user characteristics according to the preference of the poster who provides the content from the history information of the user who has accessed the website.
  • the present invention makes it possible to extract new user characteristics that are different from the intention of the poster, and it is possible to create a new business.
  • FIG. 1 is a block diagram showing an embodiment of the present invention and showing an example of the configuration of a target user feature extraction system.
  • The target user feature extraction system includes the web server 200 that manages the website including the content 210 and the advertisement 220, the user terminals 100-1 to 100-3 that access the website, the posting terminals 300-1 to 300-3 that supply information to the web server 200, and the target user feature extraction server 1, which extracts from the access history (log 230) of the web server 200 the users (target types) that the posters who provide information from the posting terminals 300-1 to 300-3 want to acquire.
  • For the user terminals 100-1 to 100-3, the reference sign "100" is used, omitting the "-" and subsequent parts, when they are not individually specified. Similar reference signs are used for the other components.
  • Posting terminals 300-1 to 300-3 are operated by posters A, B, and C in different industries, and each of the posters A to C also serves as an advertiser to provide content 210 and advertisement 220.
  • The poster who operates the posting terminal 300 serves as both the provider of the content 210 and the advertiser, but the present invention is not limited to this, and the poster of the content 210 and the advertiser may be different.
  • the user terminals 100-1 to 100-3 are operated by users in different industries a, b, and c, and browse the contents 210 and the advertisement 220 of the web server 200.
  • the web server 200 is composed of a computer, and transmits the access history (history information) of the user terminal 100, the information of the poster who uses the posting terminal 300, and the attribute data of the content 210 to the target user feature extraction server 1.
  • the web server 200 may be connected to a database server, an application server, or the like to build a website.
  • The target user feature extraction server 1 extracts, from the history (session data) of users who have accessed the web server 200, the user characteristics of the users (users of the user terminal 100) that the posters A to C, who provide information to the website provided by the web server 200, want to acquire. Further, the target user feature extraction server 1 analyzes the content (page) 210 provided by the posting terminal 300 and extracts it as a page feature.
  • The target user feature extraction server 1 collects the access history of the user terminal 100 at a predetermined cycle (for example, one month), extracts the access feature amount including the user feature and the page feature for the poster to be extracted, and notifies the posting terminal 300.
  • the posting terminal 300 notifies the target user feature extraction server 1 in advance of the user information that the poster wants to acquire as the target type.
  • the poster may notify the web server 200 of the target type from the posting terminal 300, and the target user feature extraction server 1 may acquire the target type from the web server 200.
  • FIG. 2 is a block diagram showing an example of the configuration of the target user feature extraction server 1.
  • the target user feature extraction server 1 is a computer including a processor 11, a memory 12, a storage device 13, an input device 14, an output device 15, and a communication device 16.
  • the communication device 16 is connected to the network 400 and communicates with the web server 200 and the posting terminal 300.
  • the output device 15 is composed of a display or the like.
  • the input device 14 is composed of a keyboard, a mouse, or a touch panel.
  • In the memory 12, a processing target selection unit 21, a session feature calculation unit 22, a target calculation unit 23, an access feature extraction unit 27, a target determination item processing unit 28, a data processing unit 30, and a learning unit 31 are loaded as programs and executed by the processor 11.
  • the processor 11 operates as a functional unit that provides a predetermined function by executing processing according to the program of each functional unit.
  • the processor 11 functions as the session feature calculation unit 22 by executing the process according to the session feature extraction program. The same applies to other programs.
  • the processor 11 also operates as a functional unit that provides each function of a plurality of processes executed by each program.
  • a computer and a computer system are devices and systems including these functional parts.
  • Session data 41, user attribute data 42, page attribute data 43, poster attribute data 44, poster target data 45, and range conversion information 46 are stored as data used by each of the above programs.
  • the session data 41 shows the access history of the user terminal 100 that has accessed the content 210 (or the advertisement 220) among the logs 230 collected by the web server 200.
  • the user attribute data 42 indicates the attributes of the user who uses the user terminal 100.
  • the page attribute data 43 indicates the attributes of the content 210.
  • the poster attribute data 44 indicates the attribute of the poster.
  • In the poster target data 45, the user group (target type) that the posters A to C want to acquire is set as qualitative information.
  • The poster target data 45 can also set a range (or threshold value) of items and values.
  • In the range conversion information 46, an item of analysis target data that specifies a user group for each target type and a range (or threshold value) of the value of the item are set. The details of each data will be described later.
  • The processing target selection unit 21 receives, from the input device 14 or the like, the period of the session data 41 used for analysis and the poster to be analyzed, for the access history (session data 41) of the user terminal 100 acquired from the web server 200. The analysis may also be performed on the targets of all posters by designating the period of the session data 41 without designating a poster.
  • The target calculation unit 23 accepts the poster to be analyzed, acquires the range of user characteristics that the poster wants to acquire from the poster target data 45 as target information, and determines, based on the target information, the items for analyzing the session data and the range of their values.
  • the items and ranges for analyzing session data are set according to the target information (acquisition target) and preference for each poster A to C.
  • a web server for example, as an item for determining the target of a poster, a web server.
  • the range can be specified by the range of these numerical values, the threshold value, and the like.
  • the item and range for analyzing the session data 41 may be determined by the range calculation unit 26 with reference to the range conversion information 46, or the target determination model 25 may calculate the item and range.
  • The range conversion unit 24 causes the range calculation unit 26 to refer to the range conversion information 46 to determine the item and the range. When the range conversion information 46 corresponding to the target information does not exist, the range conversion unit 24 inputs the session data 41, the user attribute data 42, the page attribute data 43, the poster attribute data 44, and the target information for the specified period into the target determination model 25 to generate the items and ranges to be analyzed.
  • The session feature calculation unit 22 acquires the session data 41 of the period accepted by the processing target selection unit 21, together with the user attribute data 42, the page attribute data 43, and the poster attribute data 44 of the users included in the session data 41, and generates the data of the items determined by the target calculation unit 23 as the extraction target data indicating the characteristics of the session. If the determined item exists in the session data 41, the session data 41 for the specified period is set as the extraction target data 50.
  • the session feature calculation unit 22 uses the target determination item processing unit 28 and the data processing unit 30 to generate the extraction target data 50 corresponding to the determined item, as will be described later.
  • the target determination item processing unit 28 uses the similarity calculation unit 29 according to the target item.
  • The history of visits by the user terminal 100 to the web server 200 can be calculated as data showing the characteristics of the session, for each page of the content 210 accessed by the user terminal 100, for each tag of the page attribute data 43, or for each poster who provides the content 210.
  • the access feature extraction unit 27 receives the extraction target data from the session feature calculation unit 22 and the items and ranges from the target calculation unit 23, and extracts the user features and the features (page features) of the accessed content 210.
  • The access feature extraction unit 27 calculates the user feature amount as the user feature based on the extraction target data from the session feature calculation unit 22, the item from the target calculation unit 23, and the range of the value of the item. Further, the access feature extraction unit 27 receives the attribute data of the poster (poster attribute data 44) and the attribute data of the content 210 provided by the poster to the web server 200 (page attribute data 43), and extracts the feature amount of the content 210 accessed by the user as a page feature.
  • the user features and page features extracted by the access feature extraction unit 27 are notified to the posting terminal 300.
  • the access feature extraction unit 27 can display the extracted user features and page features on the output device 15.
  • The user features extracted by the access feature extraction unit 27 can include, as feature amounts, for example, the ratio of the industries of the users who accessed the content 210 of the poster to be extracted and the characteristics of the session (the number of repeats).
  • the page feature extracted by the access feature extraction unit 27 can include, for example, the ratio of tags of the accessed content 210, the average staying time of each page, and the like as the feature amount.
  • the learning unit 31 inputs session data 41, user attribute data 42, poster attribute data 44, page attribute data 43, and poster target data 45, performs machine learning, and generates a target determination model 25.
  • the target determination model 25 is generated in advance before the user feature 51 and the page feature 52 are extracted.
  • FIG. 4 is a diagram showing an example of session data 41.
  • the session data 41 is historical information collected by the target user feature extraction server 1 from the web server 200 at a predetermined cycle or the like.
  • the session data 41 is a table that includes the ID 411, the access time 412, the visit page 413, the number of repeats 414, and the departure time 415 in one record.
  • ID411 stores the identifier of the user terminal 100.
  • ID411 is a value given by the web server 200, and may be a unique value in the target user feature extraction system.
  • the access time 412 stores the date and time when the user terminal 100 started accessing the page.
  • the visit page 413 stores the URL of the content 210 accessed by the user terminal 100.
  • the repeat count 414 stores the cumulative number of times the page has been accessed.
  • The departure time 415 stores the time when the user terminal 100 finished browsing the page.
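  • As an illustration only, one record of the session data 41 could be modeled as below. The field names, the sample URL, and the helper `stay_seconds` are assumptions introduced for this sketch and do not appear in the original description.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SessionRecord:
    """One record of the session data 41 (field names are illustrative)."""
    user_id: str              # ID 411
    access_time: datetime     # access time 412
    visit_page: str           # visit page 413 (URL)
    repeat_count: int         # number of repeats 414
    departure_time: datetime  # departure time 415

    def stay_seconds(self) -> float:
        """Time spent on the page, derived from fields 412 and 415."""
        return (self.departure_time - self.access_time).total_seconds()


record = SessionRecord(
    user_id="u001",
    access_time=datetime(2021, 1, 20, 10, 0, 0),
    visit_page="https://example.com/a1",
    repeat_count=2,
    departure_time=datetime(2021, 1, 20, 10, 8, 20),
)
print(record.stay_seconds())  # 500.0
```

  • Deriving the stay time from the two timestamps is one plausible way to obtain the average stay time used later in the extraction target data 50.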
  • FIG. 5 is a diagram showing an example of user attribute data 42.
  • the user attribute data 42 is a table set by the target user feature extraction server 1.
  • the user attribute data 42 is a table that includes ID 421, IP 422, industry 423, and sales 424 in one record.
  • ID 421 stores the identifier of the user terminal 100.
  • ID 421 is the same value as ID 411 of the session data 41.
  • IP422 stores the IP address of the user terminal 100.
  • The industry 423 stores the industry of the user's company (or group) that uses the user terminal 100. Since the company to which the user belongs can be identified from the IP address of the user terminal 100, the industry 423 may be determined from information on that company. Sales 424 stores the sales of the company to which the user belongs.
  • the type of business and sales of the user who uses the user terminal 100 may be set by the administrator of the target user feature extraction server 1 or the like, or may be set from a preset database or the like.
  • FIG. 6 is a diagram showing an example of the extraction target data 50.
  • the extraction target data 50 is intermediate data calculated by the session feature calculation unit 22.
  • the number of views and the average staying time are output from the target calculation unit 23 as the items of the extraction target data for specifying the user group.
  • The session feature calculation unit 22 aggregates, for each poster, the pages viewed by each user from the session data 41 within the period accepted by the processing target selection unit 21 and from the user attribute data 42.
  • The extraction target data 50 is a table that includes ID 501, poster 502, number of views 503, average stay time 504, and industry 505 in one record.
  • The ID 501 stores the identifier of the user terminal 100.
  • ID 501 is the same value as ID 411 of the session data 41.
  • The poster 502 stores the identifier of the poster of the content 210 viewed by the user of the ID 501.
  • the identifier of the poster of the content 210 is information preset for each page constituting the content 210, and is acquired from the page attribute data 43 transmitted from the web server 200.
  • the number of views 503 stores the total number of pages provided by the poster 502 viewed by the user of the ID 501.
  • the average stay time 504 stores the average time that the user of the ID 501 stays (views) on the page provided by the poster 502.
  • the industry 505 stores the industry 423 of the user attribute data 42.
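  • The aggregation into the extraction target data 50 can be sketched as follows. This is a minimal sketch under assumed inputs: the tuple layout of `views`, the dictionary keys, and the function name `build_extraction_target` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical inputs: (user ID, poster of the viewed page, stay time in seconds)
views = [
    ("u001", "A", 600.0),
    ("u001", "A", 400.0),
    ("u001", "B", 120.0),
    ("u002", "A", 90.0),
]
industry_by_user = {"u001": "metal", "u002": "materials"}  # industry 423


def build_extraction_target(views, industry_by_user):
    """Aggregate views per (user, poster) into rows like extraction target data 50."""
    stays = defaultdict(list)
    for user_id, poster, stay in views:
        stays[(user_id, poster)].append(stay)
    return [
        {
            "id": user_id,                               # ID 501
            "poster": poster,                            # poster 502
            "views": len(times),                         # number of views 503
            "avg_stay": sum(times) / len(times),         # average stay time 504
            "industry": industry_by_user.get(user_id),   # industry 505
        }
        for (user_id, poster), times in sorted(stays.items())
    ]


rows = build_extraction_target(views, industry_by_user)
```

  • Each resulting row corresponds to one user and one poster, which matches the record layout described above.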
  • FIG. 7 is a diagram showing an example of the range conversion information 46.
  • the range conversion information 46 is a table for converting the qualitative information of the poster target data 45 into a range of items and values to be extracted.
  • The range conversion information 46 is information in which the items of the extraction target data 50 calculated from the session data 41, the user attribute data 42, and the like, and the data range 462, are preset for each target type 461 that classifies the user group that the poster wants to acquire.
  • the target type 461 is a value of the target information of the poster target data 45.
  • As examples of the target type 461, "new", "existing", "person who subscribes over time", "repeater", "excellent customer", and "person who is interested in cutting" are shown.
  • the "new" target type 461 indicates that the poster provides information on the content 210 and the advertisement 220 to the website of the web server 200 for the purpose of acquiring new users.
  • a range 462 is set in advance in which a user who has viewed the content 210 of the corresponding poster 50 times or less is regarded as a "new" user.
  • The "existing" target type 461 indicates that the poster provides information to the web server 200 for the purpose of cultivating existing users.
  • A range 462 is set in advance in which a user who has viewed the content 210 of the corresponding poster more than 50 times is regarded as an "existing" user.
  • the target type 461 of the "person who subscribes over time” indicates that the content 210 is provided to the web server 200 for the purpose of acquiring users who browse the content 210 of the poster over time.
  • a range 462 for determining a user whose average staying time 504 of the content 210 of the corresponding poster is 500 seconds or more per page as the corresponding user is set in advance.
  • the target type 461 of the "repeater” indicates that information is provided to the web server 200 for the purpose of acquiring users who repeatedly browse the content 210 of the poster.
  • A range 462 is set in advance for determining, as the corresponding user, a user whose number of repeats 414 for the content 210 of the corresponding poster is 2 or more and whose visit interval is 1 week or less.
  • The target type 461 of the "excellent customer" is preset with a range 462 for determining a user who accesses the content 210 of the poster and whose sales 424 of the company to which the user belongs is 1 billion yen or more as the corresponding user.
  • the target type 461 of the "person who is interested in cutting” is preset with a range 462 for determining the user who has accessed the page including the "cutting" tag in the content 210 of the poster as the corresponding user.
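  • The range conversion information 46 can be pictured as a lookup from a target type to a condition on the extraction target data. The sketch below uses the thresholds given in the examples above; the row keys (`views`, `avg_stay`, `repeats`, `visit_interval_days`, `sales_yen`, `tags`) and the function `matching_types` are assumptions made for illustration.

```python
# Each target type 461 maps to a predicate over one row of extraction target data.
# Thresholds come from the examples in the text; the row keys are hypothetical.
RANGE_CONVERSION = {
    "new": lambda r: r["views"] <= 50,
    "existing": lambda r: r["views"] > 50,
    "person who subscribes over time": lambda r: r["avg_stay"] >= 500,
    "repeater": lambda r: r["repeats"] >= 2 and r["visit_interval_days"] <= 7,
    "excellent customer": lambda r: r["sales_yen"] >= 1_000_000_000,
    "person who is interested in cutting": lambda r: "cutting" in r["tags"],
}


def matching_types(row):
    """Return every target type whose range 462 the row satisfies."""
    return [name for name, pred in RANGE_CONVERSION.items() if pred(row)]


row = {
    "views": 12,
    "avg_stay": 640.0,
    "repeats": 3,
    "visit_interval_days": 5,
    "sales_yen": 200_000_000,
    "tags": {"metal", "cutting"},
}
print(matching_types(row))
```

  • A single user can satisfy several target types at once, so the sketch returns a list rather than one label.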
  • As described later, the range conversion unit 24 inputs the session data 41, the user attribute data 42, the page attribute data 43, and the poster attribute data 44 into the target determination model 25 to generate items and ranges.
  • the page attribute data 43 is a table that includes a URL, a tag indicating the type of the content 210, and an identifier of the poster who provides the content 210 for each page of the content 210.
  • the page attribute data 43 may include static information such as words used in the content 210, or may include features of sentences and articles calculated by word2vec or the like.
  • the poster target data 45 is set with the poster identifier and the target information selected in advance by the poster.
  • the target information of the poster target data 45 corresponds to the value of the target type 461 of the range conversion information 46 described above, but a value not included in the target type 461 of the range conversion information 46 can be set.
  • the poster target data 45 can be set with information including an item and a range of values in addition to qualitative information.
  • the poster attribute data 44 stores the identifier of the poster, the type of business of the poster, and the department to which the poster belongs.
  • FIG. 3 is a diagram showing an outline of processing performed by the target user feature extraction server 1. This process is started based on the command of the user of the target user feature extraction server 1.
  • The processing target selection unit 21 accepts the extraction target period and the poster. As described above, when no poster is input, all the posters of the web server 200 are extracted.
  • The target calculation unit 23 receives posters from the processing target selection unit 21, acquires the target type for each poster from the poster target data 45, and determines the item and value range corresponding to the target information from the range conversion information 46 or the target determination model 25.
  • The target calculation unit 23 determines the item and range of the extraction target data 50 for each poster using the range conversion unit 24, outputs the item to the session feature calculation unit 22, and outputs the range to the access feature extraction unit 27.
  • The range conversion unit 24 inputs the session data 41, the user attribute data 42, the page attribute data 43, and the poster attribute data 44 into the target determination model 25 to determine the item and range to be extracted.
  • When the target type 461 corresponding to the target information does not exist in the range conversion information 46, the range conversion unit 24 generates an item and a range to be extracted by the target determination model 25, so that the access feature extraction unit 27 can extract user features that match the target information.
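  • The fallback from the range conversion information 46 to the target determination model 25 can be sketched as below. The table contents, the type alias `ItemRange`, and `stub_model` are hypothetical placeholders; the description does not specify the interface of the model.

```python
from typing import Callable, Dict, Tuple

ItemRange = Tuple[str, Tuple[float, float]]  # (item name, (low, high))

# Hypothetical stand-in for the range conversion information 46.
RANGE_TABLE: Dict[str, ItemRange] = {
    "new": ("views", (0, 50)),
    "existing": ("views", (51, float("inf"))),
}


def resolve_item_and_range(target_info: str, model: Callable[[str], ItemRange]) -> ItemRange:
    """Use the table when the target type is registered; otherwise ask the model."""
    if target_info in RANGE_TABLE:
        return RANGE_TABLE[target_info]
    return model(target_info)


def stub_model(target_info: str) -> ItemRange:
    # Placeholder for the target determination model 25; a real model would
    # also take session, user, page, and poster attribute data as input.
    return ("avg_stay", (300.0, float("inf")))


print(resolve_item_and_range("new", stub_model))
print(resolve_item_and_range("night-time readers", stub_model))
```

  • Passing the model in as a callable keeps the table lookup independent of how the model is trained.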
  • the target determination model 25 is a model generated in advance by machine learning.
  • The learning unit 31 of the target user feature extraction server 1 generates the target determination model 25 by machine learning from the session data 41 and the user attribute data 42 of the user terminal 100 together with the poster attribute data 44 and the page attribute data 43.
  • the session feature calculation unit 22 acquires the session data 41 within the period received from the processing target selection unit 21, and acquires the user attribute data 42 corresponding to the ID 411 of the session data 41.
  • the session feature calculation unit 22 receives items from the target calculation unit 23 and generates extraction target data 50 including the items specified from the session data 41 and the user attribute data 42 within the specified period.
  • the item of the extraction target data 50 is determined according to the content of the range 462 corresponding to the target type 461 of the range conversion information 46 or the output of the target determination model 25.
  • the generated extraction target data 50 is output to the access feature extraction unit 27.
  • the session feature calculation unit 22 may generate the extraction target data 50 for each poster to be extracted, or may generate the extraction target data 50 including all the items of the poster to be extracted.
  • the access feature extraction unit 27 receives a range of values to be extracted from the target calculation unit 23, and receives the extraction target data 50 from the session feature calculation unit 22.
  • The access feature extraction unit 27 applies a well-known analysis technique to extract, for each poster, user features corresponding to the specified range 462 from the extraction target data 50, and outputs them as the user feature 51, which is the feature amount of the session.
  • For example, the access feature extraction unit 27 uses the target type of the poster as the explanatory variable and the range of the number of views as the objective variable, and estimates the user features included in the target information.
  • The access feature extraction unit 27 acquires the poster attribute data 44 and the page attribute data 43, extracts the pages accessed by the users included in the extraction target data 50, and outputs them as the page feature 52 indicating the feature amount of the session.
  • the access feature extraction unit 27 can also estimate the extraction of the page feature 52 by machine learning in the same manner as described above.
  • the access feature extraction unit 27 is not limited to the machine learning model, and may apply statistical values such as an average value and a median value.
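  • As a minimal sketch of the statistical alternative, the average and median of a feature can be computed with the Python standard library. The sample stay times are invented for illustration.

```python
from statistics import mean, median

# Hypothetical stay times (seconds) of users matching one poster's range 462.
stay_times = [120.0, 480.0, 510.0, 530.0, 900.0]

summary = {"mean": mean(stay_times), "median": median(stay_times)}
print(summary)  # {'mean': 508.0, 'median': 510.0}
```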
  • FIG. 23 is a diagram showing an example of the extraction result screen 600 for the user feature 51 and the page feature 52 extracted by the access feature extraction unit 27. Further, FIG. 24 is a diagram showing an example of session data 41 analyzed by the access feature extraction unit 27.
  • In this example, users 1 to 3 using the user terminal 100 access pages A1 and A2 of poster A and page B1 of poster B, and the page features of page D1 of poster D are similar to those of pages A1, A2, and B1.
  • FIG. 23 shows an example in which the user feature 51 of the users corresponding to the target type 461 of the poster A and the extraction result of the page feature 52 are displayed as extraction targets.
  • In this example, users 1 to 3 shown in FIG. 24 correspond to the target type 461 of the poster A.
  • As the user feature 51, it is shown that, among the industries of users 1 to 3, the metal industry accounts for 67% and material manufacturers account for 33%, and the number of repeats 414 is extracted as a feature of the accesses of users 1 to 3.
  • the page feature 52 accessed by the users 1 to 3 includes metal and processing as a tag of the page attribute data 43, and it is displayed that the average stay time 504 is long as a feature of the session data 41.
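  • The ratios shown on the extraction result screen can be reproduced with a simple count. In this sketch the assignment of industries to the three users and the list of page tags are assumptions chosen to be consistent with the 67% and 33% split in the example; the function name `ratios` is hypothetical.

```python
from collections import Counter


def ratios(values):
    """Return each distinct value's share of the total, as a rounded percentage."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: round(100 * n / total) for value, n in counts.items()}


# Hypothetical assignment consistent with the 67% / 33% split in the example.
industries = ["metal", "metal", "material manufacturer"]
page_tags = ["metal", "processing", "metal", "processing"]

print(ratios(industries))  # {'metal': 67, 'material manufacturer': 33}
print(ratios(page_tags))   # {'metal': 50, 'processing': 50}
```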
  • The target user feature extraction server 1 can extract user features that match the target information (the poster's preference) from the extraction target data 50 obtained from the ID 411 for each session, the visit page 413, the time information (412, 415), the industry of the user attribute data 42, the tag of the page attribute data 43, and the poster.
  • the target information of the poster is, for example, a qualitative value of "targeting a new customer", and the item and range obtained by quantitatively converting this target information are "the number of views to the article of the poster is 30".
  • One of the session features (user features) to be extracted by the access feature extraction unit 27 is the data of the number of visits (views) of the user to the content 210 of the poster for each industry.
  • the other is a feature indicating the distance between the attributes of the user's industry, and for this, the result of calculating the similarity from the number of visits to the tag of the user's industry and the page attribute data 43 can be used.
  • the data processing unit 30 calculates the total number of page visits for each user for each poster's content 210 and for each attribute (industry 423) associated with the user's ID 411. Further, regarding the distance, the data processing unit 30 calculates the distance related to the feature amount by using, for example, a method of calculating the similarity such as a multidimensional scaling method, and constitutes the extraction target data 50 with these data.
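  • The text names multidimensional scaling as one possible similarity method. The sketch below swaps in a simpler technique, cosine similarity between per-industry visit counts, only to illustrate how a distance between industries could be derived. The visit counts and the conversion `1 - similarity` are assumptions, not the method of the original description.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors given as dicts."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical visit counts per page tag, aggregated by user industry.
visits_by_industry = {
    "metal": {"cutting": 8, "processing": 4},
    "material manufacturer": {"cutting": 4, "processing": 2},
    "food": {"packaging": 5},
}

sim = cosine_similarity(
    visits_by_industry["metal"], visits_by_industry["material manufacturer"]
)
distance = 1.0 - sim  # identical visit profiles give a distance near 0
```

  • Industries that visit the same tags in the same proportions end up close together, while industries with no tags in common have a similarity of 0.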
  • The access feature extraction unit 27 can present the user's industry and the number of visits as the user feature 51, that is, as the characteristics of the session that match the poster's preference of "targeting new customers". Further, the access feature extraction unit 27 can extract the features of the pages visited by the user's industry when the session data is narrowed down by the user's industry included in the session features and by the link destination of the content 210 of the poster, and can output them as the page feature 52.
  • the distance of the feature amount (similarity) between the industries of the users who accessed the visit page 413 is calculated by using the industry 423 of the user attribute data 42.
  • for the content 210 of each poster, the access feature extraction unit 27 can present the groups of users formed according to this distance as the user feature 51.
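The narrowing performed by the access feature extraction unit 27 can be illustrated with a minimal sketch. The record layout, the function name `extract_user_features`, and the reading of the range as "at most 30 views" are assumptions made for illustration; the source does not specify them.

```python
from collections import defaultdict

# Hypothetical extraction target records: one row per user, holding the
# user's industry and that user's view count for one poster's content.
EXTRACTION_TARGET = [
    {"user_id": "u1", "industry": "a", "views": 12},
    {"user_id": "u2", "industry": "a", "views": 45},
    {"user_id": "u3", "industry": "b", "views": 8},
    {"user_id": "u4", "industry": "c", "views": 30},
]


def extract_user_features(records, item, low, high):
    """Keep records whose `item` lies in [low, high] and total it per industry."""
    totals = defaultdict(int)
    for rec in records:
        if low <= rec[item] <= high:
            totals[rec["industry"]] += rec[item]
    return dict(totals)


# Assumed range for "targeting a new customer": between 0 and 30 views.
features = extract_user_features(EXTRACTION_TARGET, "views", 0, 30)
```

The resulting mapping from industry to number of views corresponds loosely to the user feature 51; a real implementation would take the item and range from the target calculation unit 23 instead of hard-coded arguments.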
  • FIG. 8 is a flowchart showing an example of processing performed by the session feature calculation unit 22 shown in FIG.
  • when the session feature calculation unit 22 receives a period from the processing target selection unit 21 and an item from the target calculation unit 23, it performs the following processing.
  • the session feature calculation unit 22 acquires the data within the received period from the session data 41 (S1). Next, the session feature calculation unit 22 acquires the user attribute data 42 of the user (user terminal 100) included in the session data 41 within the designated period (S2).
  • the session feature calculation unit 22 combines the session data 41 acquired in step S1 with the user attribute data 42 in which the user IDs 411 and 421 match to generate the combined data (S3).
  • the session feature calculation unit 22 determines whether or not the item received from the target calculation unit 23 is included in the combined data generated in step S3 (S4). When the combined data includes all the items to be extracted, the session feature calculation unit 22 outputs the combined data as the extraction target data 50 as it is. On the other hand, when the combined data does not include all the items to be extracted, the session feature calculation unit 22 proceeds to step S5 and has the data processing unit 30 generate the data of the received items from the combined data.
  • the data processing unit 30 generates data of the items to be extracted determined by the target calculation unit 23 from the combined data for each user.
  • the data processing unit 30 calculates the difference between the departure time 415 and the access time 412 for each record in which the ID 411 of the session data 41 and the visit page 413 match, and calculates the average of these differences over the same visit page 413 as the average staying time. Further, the data processing unit 30 may specify the contributor (identifier) of each visit page 413 with reference to the page attribute data 43, and calculate the average staying time for each contributor.
  • in step S6, the session feature calculation unit 22 outputs the data generated for each of the above items to the access feature extraction unit 27 as the extraction target data 50.
  • the session feature calculation unit 22 calculates the data of the items used for determining the target information from the session data 41 and the user attribute data 42 within the designated period, and outputs the data as the extraction target data 50.
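Steps S1 to S6 above can be sketched as follows. This is an illustration under assumptions, not the implementation of the session feature calculation unit 22: the field names, the use of seconds for times, and the helper `build_extraction_target` are hypothetical, and only one derived item (the average staying time per visit page) is generated in step S5.

```python
from collections import defaultdict

# Hypothetical session rows (times in seconds) and user attributes.
SESSIONS = [
    {"id": "u1", "page": "/p1", "access": 100, "departure": 160},
    {"id": "u2", "page": "/p1", "access": 200, "departure": 220},
    {"id": "u1", "page": "/p2", "access": 300, "departure": 330},
    {"id": "u3", "page": "/p1", "access": 900, "departure": 950},
]
USER_ATTRIBUTES = {"u1": {"industry": "a"}, "u2": {"industry": "b"}}


def build_extraction_target(sessions, users, start, end, items):
    # S1: keep the sessions whose access time falls inside the period.
    in_period = [s for s in sessions if start <= s["access"] <= end]
    # S2-S3: join each session with the user attributes of the matching ID.
    combined = [{**s, **users.get(s["id"], {})} for s in in_period]
    # S4: if every requested item is already present, output the data as is.
    if all(item in row for row in combined for item in items):
        return combined
    # S5: derive the missing item (here only the average stay per page).
    stays = defaultdict(list)
    for row in combined:
        stays[row["page"]].append(row["departure"] - row["access"])
    avg = {page: sum(v) / len(v) for page, v in stays.items()}
    # S6: output the rows with the derived item attached.
    return [{**row, "avg_stay": avg[row["page"]]} for row in combined]


target = build_extraction_target(SESSIONS, USER_ATTRIBUTES, 0, 500, ["avg_stay"])
```

Requesting an item that the combined data already contains, such as `"industry"`, takes the S4 branch and returns the joined rows unchanged.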
  • FIG. 9 is a flowchart showing an example of processing performed by the target calculation unit 23 shown in FIG.
  • the target calculation unit 23 receives a poster from the processing target selection unit 21 and starts the following processing.
  • the target calculation unit 23 acquires target information from the poster target data 45 for the accepted poster (S11). The target calculation unit 23 determines whether or not the acquired target information is information including an item and a range (or a threshold value) of a value (S12). If the item and range are included, the process proceeds to step S14, and if not, the process proceeds to step S13.
  • in step S13, the target information is qualitative information.
  • the target calculation unit 23 therefore uses the range conversion unit 24 to convert the qualitative information into an item and a range. Then, in step S14, the converted item and the range of values are output to the session feature calculation unit 22 and the access feature extraction unit 27.
  • FIG. 10 is a flowchart showing an example of processing performed by the range conversion unit 24 of the target calculation unit 23.
  • the target calculation unit 23 determines whether or not the range conversion information 46 corresponding to the target information exists (S21). If the range conversion information 46 exists, the process proceeds to step S22, and if the range conversion information 46 does not exist, the process proceeds to step S23.
  • in step S22, the range conversion unit 24 refers to the range conversion information 46, acquires the range 462 from the target type 461 corresponding to the target information, and determines the item and the range of values set in the range 462.
  • in step S23, the range conversion unit 24 inputs the session data 41, the user attribute data 42, the page attribute data 43, and the poster attribute data 44 into the target determination model 25 to determine the item and range to be extracted.
  • through the above processing, the items to be extracted and the range of values are determined by either the range conversion information 46 or the target determination model 25.
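The branching in FIG. 9 and FIG. 10 (S12, S21 to S23) can be sketched as a lookup with a model fallback. The table contents, the returned ranges, and the function `fallback_model`, which merely stands in for the target determination model 25, are illustrative assumptions.

```python
# Hypothetical range conversion information: target type -> (item, range).
RANGE_CONVERSION = {"new": ("views", (0, 30))}


def fallback_model(target_type):
    """Stand-in for the target determination model 25 (an assumption)."""
    return ("avg_stay", (0.0, 60.0))


def calculate_target(target_info, conversion=RANGE_CONVERSION, model=fallback_model):
    # S12: quantitative target information already carries an item and range.
    if isinstance(target_info, dict) and target_info.keys() >= {"item", "range"}:
        return target_info["item"], tuple(target_info["range"])
    # S21-S22: qualitative information with a matching conversion entry.
    if isinstance(target_info, str) and target_info in conversion:
        return conversion[target_info]
    # S23: no conversion entry exists, so the determination model decides.
    return model(target_info)
```

For example, `calculate_target("new")` is resolved from the table, while `calculate_target("existing")` falls through to the model stand-in because no conversion entry exists for it.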
  • FIG. 11 is a diagram showing an example of the processing that the range conversion unit 24 of the target calculation unit 23 performs by using the target determination item processing unit 28.
  • for each content 210 (page) of the poster, the range conversion unit 24 has the target determination item processing unit 28 process the user data 510, which consists of the session data 41 and the user attribute data 42.
  • in this target determination item processing, the statistical processing described later is performed (S231).
  • the session data 41 is data within the period received from the processing target selection unit 21.
  • the range conversion unit 24 combines the poster data 520, which includes the page attribute data 43, the poster attribute data 44, and the poster target data 45, with the result of the target determination item processing (S232).
  • the page attribute data 43 uses the data corresponding to the visit page 413 included in the session data 41 within the period received from the processing target selection unit 21.
  • the data obtained by combining the target determination item processing result of the user data 510 and the poster data 520 is given to the target determination model 25 to determine the item to be extracted and the range of values.
  • FIG. 12 is a flowchart showing an example of processing performed by the target determination item processing unit 28. This process is executed in step S231 of FIG. 11 above.
  • the target determination item processing unit 28 acquires the user data 510 shown in FIG. 11 (S31). The target determination item processing unit 28 determines whether or not to use the user attribute data 42 (S32). Whether or not to use the user attribute data 42 can be set in advance for each poster identifier in the poster target data 45, for example.
  • the target determination item processing unit 28 refers to the poster target data 45 and proceeds to step S33 when using the user attribute data 42, and proceeds to step S36 when not using it.
  • in step S33, the target determination item processing unit 28 acquires the industry 423 of the user attribute data 42, the visit page 413 of the session data 41, and the tags of the page attribute data 43, and calculates the feature amount of the industry 423. Then, the target determination item processing unit 28 calculates the distance between the users' industries 423 in the space of the calculated feature amounts by using multidimensional scaling (MDS: Multi-Dimensional Scaling) or the like, and uses this distance as the degree of similarity.
  • for each contributor, this process aggregates the number of views by each industry 423 of the user attribute data 42 for each tag of the page attribute data 43, and generates the number of views data 530.
  • the number of views data 530 in FIG. 19 is information obtained by totaling, for each tag of the content 210, the number of views by each industry 423 of the user attribute data 42.
  • the aggregated value of the number of views of each of the users of the industry a to the industry d is stored.
  • the number of views data 530 in FIG. 19 can express the amount of interest of the user's industry 423 for each tag of the poster A.
  • the target determination item processing unit 28 calculates the feature amount 1 and the feature amount 2 from the number of views data 530 of FIG. 19 by using multidimensional scaling and, as shown in FIG. 20, arranges the industries 423 in the space of the feature amount 1 and the feature amount 2.
  • FIG. 20 is a map expressing the distances between the industries 423, represented by the feature amounts 1 and 2, as degrees of similarity. The illustrated example shows the similarity calculated from the number of views data 530 for the content 210 of the poster A.
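As a rough illustration of how a distance between industries could be derived from the number of views data 530, the sketch below normalizes each industry's views into shares per tag and takes the Euclidean distance between those share vectors. This computes only a dissimilarity matrix; the embedding into the feature amounts 1 and 2 by multidimensional scaling is deliberately left out. The data values, the normalization, and the function names are assumptions.

```python
import math

# Hypothetical views data in the shape of FIG. 19: views per tag by industry.
VIEWS = {
    "a": {"tag1": 40, "tag2": 10},
    "b": {"tag1": 80, "tag2": 20},
    "c": {"tag1": 5, "tag2": 45},
}


def profile(counts, tags):
    """Convert raw view counts into the share of views per tag."""
    total = sum(counts.get(t, 0) for t in tags)
    return [counts.get(t, 0) / total if total else 0.0 for t in tags]


def industry_distances(views):
    """Return the pairwise Euclidean distance between industry profiles."""
    tags = sorted({t for counts in views.values() for t in counts})
    profiles = {ind: profile(c, tags) for ind, c in views.items()}
    return {
        (p, q): math.dist(profiles[p], profiles[q])
        for p in profiles
        for q in profiles
        if p < q
    }


distances = industry_distances(VIEWS)
```

Industries a and b view the tags in the same proportions, so their distance is zero even though b has twice as many views; industry c is far from both. A multidimensional scaling routine could then take such a matrix as input to place the industries on a two-dimensional map.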
  • in step S34, the target determination item processing unit 28 determines whether or not to use the characteristics of the session. Whether or not to use the session features can be set in advance for each poster identifier in the poster target data 45, for example.
  • the target determination item processing unit 28 refers to the poster target data 45 and proceeds to step S35 when using the characteristics of the session, and ends the process when not using it.
  • in step S35, the target determination item processing unit 28 performs statistical processing of the user data 510 and the poster data 520 for each page of the poster.
  • FIG. 21 is a diagram showing an example of statistical data 540 generated by statistical processing.
  • the statistical data 540 shows the result of the target determination item processing unit 28 totaling, by the user's industry 423, the number of views of the content 210 of each contributor.
  • the aggregated value of the number of views of each of the users of the industry a to the industry d is stored.
  • the statistical data 540 of FIG. 21 can express the amount of interest of the user's industry 423 for each contributor.
  • FIG. 22 is a diagram showing an example of a similarity map in which the result of statistical processing is added to the map of FIG. 21.
  • the size of the circle for each industry is proportional to the number of views of users in each industry for poster A.
  • for each contributor, the target determination item processing unit 28 outputs information that aggregates the distances between the industries 423 together with the results of the statistical processing of the user attribute data 42 and the page attribute data 43.
  • in step S36, when the user attribute data 42 is not used, the data processing unit 30 used by the session feature calculation unit 22 processes data such as the staying time for each visit page 413 and outputs the result to the target determination model 25.
  • the target calculation unit 23 uses the target determination model 25
  • FIG. 13 is a diagram showing an example of selection data 550 that defines data for learning the target determination model 25 when the user attribute data 42 and the poster attribute data 44 are not used.
  • the selection data 550 is a table that includes the ID 5501, the target customer 5502, the average stay time 5503, and the number of views 5504 in one record.
  • the contributor's identifier is stored in the ID 5501.
  • the target customer 5502 stores the target type selected by each contributor.
  • the target type may be selected for each contributor from preset qualitative information.
  • the average stay time 5503 stores the condition on the average time that users stay on (view) the pages provided by the poster of the ID 5501.
  • the number of views 5504 stores the condition on the total number of views by users of the pages provided by the poster of the ID 5501.
  • the selection data 550 may be generated by the administrator of the target user feature extraction server 1 based on the target type received from the poster, or may be input from the posting terminal 300.
  • in this example, there are two target types for constructing the target determination model 25, "new", which prioritizes new customers, and "existing", which prioritizes existing customers, and the average stay time 5503 and the number of views 5504 are used.
  • FIG. 14 is a graph showing an example of selection data 550.
  • in FIG. 14, the area of the user features targeted by the posters A and C, who selected "existing", is shown by a solid line, and the area of the user features targeted by the posters B and D, who selected "new", is shown by a broken line.
  • the learning unit 31 generates learning data under the conditions set in the selection data 550 and gives it to the target determination model 25 for learning.
  • the learning data given to the target determination model 25 may be generated from the actual session data 41 and user attribute data 42, or dummy data may be used.
  • that is, the session characteristics for each target type do not have to be derived from actual data. Several session characteristics may be represented by dummy data, and data that retains the results of having multiple contributors select a target type for them can be used. Further, the area corresponding to each target type may be obtained by converting the characteristics of the target type into the items of the selection data 550 in advance, and which items are selected for each target type may be output as shown in the graph of FIG.
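Generating dummy learning data under the conditions of the selection data 550 might look like the following sketch. The table values, the uniform sampling, and the function `make_learning_data` are assumptions; the source states only that the learning unit 31 generates learning data under the conditions set in the selection data and that dummy data may be used.

```python
import random

# Hypothetical selection data in the shape of FIG. 13: each poster's target
# type and the conditions (value ranges) on average stay time and views.
SELECTION_DATA = [
    {"id": "A", "target": "existing", "avg_stay": (100, 200), "views": (50, 100)},
    {"id": "B", "target": "new", "avg_stay": (0, 100), "views": (0, 50)},
]


def make_learning_data(selection, samples_per_poster, seed=0):
    """Draw dummy session features inside each poster's selected ranges."""
    rng = random.Random(seed)  # fixed seed so the dummy data is reproducible
    rows = []
    for sel in selection:
        for _ in range(samples_per_poster):
            rows.append({
                "avg_stay": rng.uniform(*sel["avg_stay"]),
                "views": rng.randint(*sel["views"]),
                "target": sel["target"],  # label to be learned
            })
    return rows


learning_data = make_learning_data(SELECTION_DATA, samples_per_poster=5)
```

Each row pairs session features with the target type as a label, which is the shape a supervised learner would expect as input.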
  • FIG. 15A shows an example of a category table 560 that reflects the poster's target type (preference).
  • FIG. 15B shows an example of a condition table for setting an item and a range of values for each category.
  • the category table 560 of FIG. 15A includes the ID 5601, the target customer 5602, and the category number 5603 in one record.
  • the contributor's identifier is stored in the ID 5601.
  • the target customer 5602 stores the target type selected by each contributor.
  • the target type may be selected for each contributor from preset qualitative information.
  • in the category number 5603, the number of the area indicating the characteristics of the session selected by each contributor is set.
  • the category number 5603 stores a number selected by the poster from the preset numbers.
  • the condition table 570 of FIG. 15B includes the target customer 5701, the average stay time 5702, and the number of views 5703 in one record.
  • the target customer 5701 stores a number corresponding to the category number 5603 of the category table 560.
  • the average stay time 5702 stores a condition on the average time that users stay on (view) the pages provided by the poster.
  • the number of views 5703 stores a condition on the total number of views by users of the pages provided by the poster.
  • the selection data 550 may be generated by the administrator of the target user feature extraction server 1 based on the target type received from the poster, or may be input from the posting terminal 300.
  • the area corresponding to the category number 5603 is delimited by the average stay time 5702 and the number of views 5703 of the condition table 570, and is as shown in FIG. 16. In FIG. 16, data having an average stay time of less than 100 hours is classified into category "2" regardless of the number of views 5703. Further, the data in which the number of views 5703 is less than 50 and the average staying time is less than 100 hours is classified into category "3", and the other areas are classified into category "1".
  • the data for learning may be generated by the category table 560 that stores the preference of the poster and the condition table 570 that determines the range of the data.
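A condition table of this kind can be evaluated with a simple first-match rule list, as sketched below. The thresholds, the half-open intervals, and the rule order are illustrative assumptions and are not the values of FIG. 15B or FIG. 16; the names `CONDITIONS` and `categorize` are hypothetical.

```python
INF = float("inf")

# Hypothetical condition table in the shape of FIG. 15B. Each rule gives a
# half-open range [low, high) for the average stay time and the views.
CONDITIONS = [
    {"category": 3, "avg_stay": (0, 100), "views": (0, 50)},
    {"category": 2, "avg_stay": (0, 100), "views": (50, INF)},
    {"category": 1, "avg_stay": (100, INF), "views": (0, INF)},
]


def categorize(avg_stay, views, conditions=CONDITIONS):
    """Return the category of the first rule whose ranges contain the values."""
    for cond in conditions:
        low_s, high_s = cond["avg_stay"]
        low_v, high_v = cond["views"]
        if low_s <= avg_stay < high_s and low_v <= views < high_v:
            return cond["category"]
    return None  # no rule matched
```

Because the first matching rule wins, overlapping areas are resolved by the order of the rows, which is one possible way to make a table with overlapping conditions unambiguous.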
  • next, an example is shown in which the user attribute data 42 and the poster attribute data 44 are used and the distance between the poster's industry and the user's industry is used, in the same manner as in FIGS. 21 and 22 above.
  • FIG. 17 is a diagram showing an example of selection data 580 that defines data for learning of the target determination model 25.
  • the selection data 580 is a table that includes the ID 5801, the target customer 5802, the industry 5803, the selected industry 5804, the distance 5805, and the number of views 5806 in one record.
  • the identifier of the poster is stored in the ID 5801.
  • the target customer 5802 stores the target type selected by each contributor. The target type may be selected for each contributor from preset qualitative information.
  • the industry 5803 stores the industry of the poster set in the poster attribute data 44.
  • the selected industry 5804 stores the industry of the user selected by the poster.
  • the distance 5805 stores the distance (degree of similarity) between the poster's industry and the user's industry.
  • the number of views 5806 stores the total number of views by users of the pages provided by the poster of the ID 5801.
  • the selection data 580 may be generated based on the target type received from the poster by the administrator of the target user feature extraction server 1, or may be input from the posting terminal 300.
  • the size of the circle for each industry is proportional to the number of views of the users of each industry a to d for the content 210 of the poster A.
  • the similarity between industries is calculated from the session data 41, the user attribute data 42, and the poster attribute data 44, and, with reference to this similarity, the distance to the selected industry and information about that industry are extracted from the poster's attributes and the target type.
  • the similarity of the user's attributes is applied as the similarity of the poster's attributes.
  • this relies on the assumption that, if the industries are similar, the behavior toward the tags of interest is also similar, regardless of whether the party is a user or a poster.
  • the similarity may also be calculated from the session data 41 (access history) for the user's search words and the like, and is not limited to using the data on tags.
  • the target determination model 25 is made to learn the selected items, with the attributes and the target types or preferences as explanatory variables, and the distance between attributes and the number of visits (number of views) as objective variables.
  • a machine learning method such as Random Forest can be used.
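  • as described above, the target user feature extraction server 1 of this embodiment determines the items and the range of values of the extraction target data 50 from the session data 41, the user attribute data 42, the page attribute data 43, and the poster attribute data 44 based on the target type desired by the poster, and generates the extraction target data 50.
  • by inputting the range of values and the extraction target data 50 into the access feature extraction unit 27, the user features that the poster who provides the content 210 to the web server 200 wants to acquire can be extracted from the history of the users (user terminals 100) who accessed the web server 200.
  • the target user feature extraction server 1 can also extract new users whom the poster did not anticipate, which can help create new business.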
  • because the target user feature extraction server 1 can extract, from the page attribute data 43, the features of the session data 41 for the extracted user features, it can narrow down which kind of content (tag) of the poster's content 210 attracts the user's interest, and can thereby support marketing.
  • in the above embodiment, the user's industry is used as the user attribute data 42 and the poster's industry is used as the poster attribute data 44, but the present invention is not limited to this.
  • for example, the hobbies and tastes of the user and the hobbies and tastes of the poster can be used as attribute data, and the target user characteristics can be extracted from such attribute data.
  • when determining the items and the range of values of the extraction target data 50, even if the range conversion information 46 corresponding to the target type reflecting the poster's preference does not exist, using the target determination model 25 makes it possible to extract the user characteristics of the target type that the poster wants to acquire from the session data 41 and the like.
  • the target user feature extraction server 1 of the above embodiment can have the following configuration.
  • a poster data acquisition step of acquiring, as the poster data (520), the poster attribute data (44) storing the attributes of the poster who provided the content (210);
  • a preference acquisition step in which the computer (1) accepts the contributor to be extracted and acquires, as the target type (461), the information of the user targeted by the contributor;
  • a target calculation step in which the computer (1) calculates, from the target type (461) of the contributor, the item of the data to be extracted and the range of the values of the item;
  • a session feature calculation step in which the computer (1) calculates, from the user data (510) and the poster data (520), the extraction target data corresponding to the item; and
  • an access feature extraction step (access feature extraction unit 27) in which the computer (1) calculates an access feature amount from the extraction target data and the poster data (520) based on the range of the values of the item. A target user feature extraction method comprising the above steps.
  • with the above configuration, the target user feature extraction server 1 of this embodiment determines the items and the range of values of the extraction target data 50 from the session data 41, the user attribute data 42, the page attribute data 43, and the poster attribute data 44 based on the target type desired by the poster, and generates the extraction target data 50. Then, by inputting the range of values and the extraction target data 50 into the access feature extraction unit 27, the user features that the poster who provides the content 210 to the web server 200 wants to acquire can be extracted from the history of the users (user terminals 100) who accessed the web server 200. In addition, the target user feature extraction server 1 can extract new users whom the poster did not anticipate, which can help create new business.
  • a target user feature extraction method which comprises a range conversion step (range conversion unit 24) for converting the item of
  • a target user feature extraction method characterized in that, in the range conversion step (23), the target type (461), the user data (510), and the poster data (520) are input to a preset determination model (target determination model 25), and the item of the data to be extracted and the range of values of the item are output.
  • with this configuration, the target type (461), the user data (510), and the poster data (520) are input to the preset target determination model 25, which makes it possible to calculate, from qualitative information, the items of the data to be extracted and the range of values of those items.
  • a target user feature extraction method characterized by further including a learning step (learning unit 31) in which the computer (1) gives the user data (510), the poster data (520), and the target type (461) to the determination model (25) for learning.
  • with this configuration, the learning unit 31 can generate the target determination model 25 by machine learning from the session data 41, the user attribute data 42, the page attribute data 43, and the poster attribute data 44 acquired from the web server 200, together with the target type.
  • a target user feature extraction method in which the learning step (31) includes a similarity calculation step (similarity calculation unit 29) of calculating the similarity between user attributes by using the user attribute data (42).
  • the access feature extraction unit 27 can calculate the distance of the feature amount (similarity) between the industries of the users who accessed the visit page 413 by using the industry 423 of the user attribute data 42.
  • for the content 210 of each poster, the groups of users formed according to this distance can be presented as the user feature 51.
  • the similarity calculation unit 29 calculates the similarity between industries from the session data 41, the user attribute data 42, and the poster attribute data 44, and the access feature extraction unit 27 can, with reference to this similarity, extract the distance to the selected industry and information about that industry from the poster's attributes and target type.
  • the present invention is not limited to the above-described embodiment, and includes various modifications.
  • the above-described embodiment is described in detail in order to explain the present invention in an easy-to-understand manner, and is not necessarily limited to the one including all the configurations described.
  • any of addition, deletion, or replacement of other configurations can be applied alone or in combination.
  • each of the above configurations, functions, processing units, processing means, etc. may be realized by hardware by designing a part or all of them by, for example, an integrated circuit. Further, each of the above configurations, functions, and the like may be realized by software by the processor interpreting and executing a program that realizes each function. Information such as programs, tables, and files that realize each function can be placed in a memory, a hard disk, a recording device such as an SSD (Solid State Drive), or a recording medium such as an IC card, an SD card, or a DVD.
  • control lines and information lines indicate what is considered necessary for explanation, and not all control lines and information lines are necessarily shown on the product. In practice, it can be considered that almost all configurations are interconnected.

Abstract

In the present invention, a computer having a processor and a memory: acquires, as user data, session data storing historical information of a user terminal which has accessed content on a web server, and user attribute data storing user attribute information; acquires, as contributor data, page attribute data storing attributes of the content, and contributor attribute data storing attributes of a contributor that provided the content; acquires, as a target type, a contributor to be extracted and a user feature that the contributor sets as a capture target; calculates an item and a value range for data to be extracted from the target type; calculates the data to be extracted corresponding to the item from the user data and the contributor data; and calculates an access feature amount on the basis of the item value range from the contributor data and the data to be extracted.

Description

Target user feature extraction method, target user feature extraction system, and target user feature extraction server
Incorporation by reference
 This application claims priority from Japanese Patent Application No. 2020-039825, filed on March 9, 2020, the contents of which are incorporated into this application by reference.
 The present invention relates to a technique for extracting specific user features from the history information of users who browse a website.
 For websites on the Internet, techniques are known for determining the advertisements to be displayed with content based on the behavior history (browsing, viewing, or search history) of the users who access the content.
 In addition, for websites that provide products and services, techniques are known for estimating user preferences based on the behavior history of users who access products and reviews, and for determining the products and services to recommend.
 Patent Documents 1 and 2, for example, are known as web analysis techniques for analyzing the preferences of users who access a website.
 Patent Document 1 discloses a technique for estimating recommendation candidate items for each user by referring to a user information storage unit and a user history information storage unit. Patent Document 2 discloses a technique that analyzes a user's preference distribution from the selection history of the items selected by the user, calculates a recommendation index that is close to the center of the preference distribution and away from the shape of the preference distribution, and displays recommended items based on the calculated recommendation index.
Japanese Unexamined Patent Publication No. 2015-148975; Japanese Unexamined Patent Publication No. 2011-96025
 A poster who provides information such as content to a website may intend to create new business or to cultivate existing users, depending on the users who access the provided content. When extracting the features of the users targeted by the poster (hereinafter referred to as user features), the items that are emphasized depend on the poster's preferences and targets (acquisition goals).
 For example, some posters emphasize approaching existing customers in order to expand their current business, while others emphasize finding potential customers in order to create new business. To extract the user features targeted by the poster, it is necessary to accurately capture the poster's intention in addition to performing web analysis of the users.
 In Patent Document 1 of the above conventional example, after the user's preference is determined, a plan for presenting information to the user is calculated from the user's behavior history. However, the preference determination unit disclosed in Patent Document 1 uses only the user's attribute information and history information to determine the similarity, and does not consider the intention (or preference) regarding the target on the side that provides the items (content).
 In Patent Document 2 of the conventional example, the user's preference is analyzed, and a recommendation index away from the shape of the preference distribution is calculated in order to provide unexpected items. However, Patent Document 2 does not consider the target intended by the provider of the items.
 The present invention has been made in view of the above problems, and an object of the present invention is to extract the user features that a poster who provides information to a website wants to acquire from the history of the users who accessed the website.
The present invention is a target user feature extraction method in which a computer having a processor and a memory extracts, from history information of access to content on a web server, the user features that a poster targets for acquisition. The method includes: a user data acquisition step in which the computer acquires, as user data, session data storing history information of user terminals that accessed the content of the web server and user attribute data storing attribute information of the users of the user terminals; a poster data acquisition step in which the computer acquires, as poster data, page attribute data storing attributes of the content and poster attribute data storing attributes of the poster who provided the content; a preference acquisition step in which the computer accepts a poster to be extracted and acquires, as a target type, information on the users that the poster targets for acquisition; a target calculation step in which the computer calculates, from the poster's target type, the items of the data to be extracted and the ranges of values of those items; a session feature calculation step in which the computer calculates extraction target data corresponding to the items from the user data and the poster data; and an access feature extraction step in which the computer calculates access feature amounts from the extraction target data and the poster data based on the ranges of values of the items.
Accordingly, the present invention makes it possible to extract, from the history information of users who have accessed a website, user features that match the preferences of a poster who provides content. In addition to extracting the users the poster expects, this makes it possible to extract new user features that differ from the poster's intention, which can lead to the creation of new business.
Details of at least one implementation of the subject matter disclosed herein are set forth in the accompanying drawings and in the description below. Other features, aspects, and effects of the disclosed subject matter will become apparent from the following disclosure, drawings, and claims.
Each of the following drawings shows an embodiment of the present invention.
A block diagram showing an example of the configuration of the target user feature extraction system.
A block diagram showing an example of the configuration of the target user feature extraction server.
A diagram showing an outline of the processing performed by the target user feature extraction server.
A diagram showing an example of the session data.
A diagram showing an example of the user attribute data.
A diagram showing an example of the extraction target data.
A diagram showing an example of the range conversion information.
A flowchart showing an example of the processing performed by the session feature calculation unit of the target user feature extraction server.
A flowchart showing an example of the processing performed by the target calculation unit.
A flowchart showing an example of the processing performed by the range conversion unit.
A diagram showing an example of the processing performed by the range conversion unit.
A flowchart showing an example of the processing performed by the target determination item processing unit.
A diagram showing an example of the selection data for learning.
A graph showing an example of the selection data.
A diagram showing an example of the category table.
A diagram showing an example of the condition table.
A graph showing an example of the selection data.
A diagram showing an example of the selection data.
A diagram showing an example of the industry similarity map.
A diagram showing an example of the view count data.
A diagram showing an example of the industry similarity map.
A diagram showing an example of the statistical data.
A diagram showing an example of the industry similarity map.
A diagram showing an example of the extraction result screen.
A diagram showing an example of the extraction targets.
Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings.
FIG. 1 is a block diagram showing an example of the configuration of a target user feature extraction system according to an embodiment of the present invention.
The target user feature extraction system includes a web server 200 that manages a website including content 210 and advertisements 220; user terminals 100-1 to 100-3 that access information on the website; posting terminals 300-1 to 300-3 that supply information to the web server 200; and a target user feature extraction server 1 that extracts, from the access history (log 230) of the web server 200, the users (target types) that the posters providing information from the posting terminals 300-1 to 300-3 want to acquire.
When the user terminals 100-1 to 100-3 are not individually specified, the reference numeral "100", with the "-" and the subsequent part omitted, is used. The same applies to the reference numerals of the other components.
The posting terminals 300-1 to 300-3 are operated by posters A, B, and C, who belong to different industries. Each of posters A to C also acts as an advertiser and provides content 210 and advertisements 220.
In this embodiment, the poster who operates a posting terminal 300 both provides the content 210 and acts as the advertiser, but the invention is not limited to this, and the poster of the content 210 and the advertiser may be different. The user terminals 100-1 to 100-3 are operated by users in different industries a, b, and c, who browse the content 210 and advertisements 220 on the web server 200.
The web server 200 is implemented as a computer and transmits the access history (history information) of the user terminals 100, information on the posters who use the posting terminals 300, and attribute data of the content 210 to the target user feature extraction server 1. The web server 200 may be connected to a database server, an application server, or the like to build the website.
The target user feature extraction server 1 extracts, as user features, the users (users of the user terminals 100) that posters A to C, who provide information to the website offered by the web server 200, want to acquire, from the history (session data) of users who accessed the web server 200. The target user feature extraction server 1 also analyzes the content (pages) 210 provided from the posting terminals 300 and extracts page features.
The target user feature extraction server 1 collects the access history of the user terminals 100 at a predetermined cycle (for example, one month), extracts access feature amounts including user features and page features for the poster to be extracted, and notifies the posting terminal 300 of the result.
The posting terminal 300 notifies the target user feature extraction server 1 in advance of information on the users the poster wants to acquire, as a target type. Alternatively, the poster may send the target type from the posting terminal 300 to the web server 200, and the target user feature extraction server 1 may acquire the target type from the web server 200.
FIG. 2 is a block diagram showing an example of the configuration of the target user feature extraction server 1. The target user feature extraction server 1 is a computer including a processor 11, a memory 12, a storage device 13, an input device 14, an output device 15, and a communication device 16.
The communication device 16 is connected to a network 400 and communicates with the web server 200 and the posting terminals 300. The output device 15 is a display or the like. The input device 14 is a keyboard, a mouse, or a touch panel.
A processing target selection unit 21, a session feature calculation unit 22, a target calculation unit 23, an access feature extraction unit 27, a target determination item processing unit 28, a data processing unit 30, and a learning unit 31 are loaded into the memory 12 as programs and executed by the processor 11.
The processor 11 operates as a functional unit that provides a predetermined function by executing processing according to the program of each functional unit. For example, the processor 11 functions as the session feature calculation unit 22 by executing processing according to the session feature extraction program. The same applies to the other programs. The processor 11 also operates as functional units that provide the respective functions of the multiple processes executed by each program. A computer and a computer system are a device and a system that include these functional units.
The storage device 13 stores, as data used by the above programs, session data 41, user attribute data 42, page attribute data 43, poster attribute data 44, poster target data 45, and range conversion information 46.
The session data 41 is the part of the log 230 collected by the web server 200 that records the access history of the user terminals 100 that accessed the content 210 (or the advertisements 220). The user attribute data 42 indicates the attributes of the users of the user terminals 100. The page attribute data 43 indicates the attributes of the content 210. The poster attribute data 44 indicates the attributes of the posters. The poster target data 45 holds, as qualitative information, the user group (target type) that each of posters A to C wants to acquire. The poster target data 45 can also specify items and value ranges (or thresholds) directly. The range conversion information 46 defines, for each target type, the items of the analysis target data that identify the user group and the ranges (or thresholds) of the values of those items. The details of each data set are described later.
Next, an outline of each program running on the target user feature extraction server 1 will be described.
The processing target selection unit 21 accepts, from the input device 14 or the like, the period of the session data 41 to be used for analysis, out of the access history (session data 41) of the user terminals 100 acquired from the web server 200, and the poster to be analyzed. Alternatively, only the period of the session data 41 may be specified without specifying a poster, and the analysis may be performed for the targets of all posters.
The target calculation unit 23 accepts the poster to be analyzed, acquires from the poster target data 45, as target information, the range of user features that the poster wants to acquire, and determines the items and value ranges used to analyze the session data based on the target information.
As described later, the items and ranges for analyzing the session data are set according to the target information (acquisition goal) and preferences of each of posters A to C. For example, the number of views and the stay time on the web server 200 can be used as items for determining a poster's target, and the ranges can be specified as numerical ranges or thresholds for those values.
The items and ranges for analyzing the session data 41 are determined either by the range calculation unit 26 referring to the range conversion information 46 or by the target determination model 25.
When range conversion information 46 corresponding to the target information in the poster target data 45 exists, the range conversion unit 24 has the range calculation unit 26 refer to the range conversion information 46 to determine the items and ranges. When no corresponding range conversion information 46 exists, the range conversion unit 24 inputs the session data 41 for the specified period, the user attribute data 42, the page attribute data 43, the poster attribute data 44, and the target information into the target determination model 25, which generates the items and ranges to be analyzed.
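The branching described for the range conversion unit 24 can be sketched as a table lookup with a model fallback. This is only an illustration under stated assumptions: the names `RANGE_CONVERSION`, `resolve_item_and_range`, and `dummy_model` are hypothetical, the table rows are sample values, and the placeholder model returns a fixed answer rather than representing the target determination model 25.

```python
from typing import Callable, Dict, Optional, Tuple

# (item name, lower bound, upper bound); None means unbounded on that side.
ItemRange = Tuple[str, Optional[float], Optional[float]]

# Hypothetical stand-in for the range conversion information 46.
RANGE_CONVERSION: Dict[str, ItemRange] = {
    "new": ("views", None, 50),
    "long reader": ("avg_stay_seconds", 500, None),
}


def dummy_model(target_type: str) -> ItemRange:
    # Placeholder for the target determination model 25. A real model would
    # also receive the session, user, page, and poster attribute data.
    return ("views", 10, None)


def resolve_item_and_range(
    target_type: str,
    table: Dict[str, ItemRange],
    model: Callable[[str], ItemRange],
) -> ItemRange:
    """Use the table when the target type is registered, else ask the model."""
    if target_type in table:
        return table[target_type]
    return model(target_type)


registered = resolve_item_and_range("new", RANGE_CONVERSION, dummy_model)
unregistered = resolve_item_and_range("night owl", RANGE_CONVERSION, dummy_model)
```

Passing the model as a callable keeps the lookup logic independent of how the fallback is implemented.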
The session feature calculation unit 22 acquires the session data 41 for the period accepted by the processing target selection unit 21, together with the user attribute data 42 of the users appearing in that session data 41, the page attribute data 43, and the poster attribute data 44, and generates the data for the items determined by the target calculation unit 23 as extraction target data 50 indicating the features of the sessions. When the determined items already exist in the session data 41, the session data 41 for the specified period is used as the extraction target data 50.
As described later, the session feature calculation unit 22 uses the target determination item processing unit 28 and the data processing unit 30 to generate the extraction target data 50 corresponding to the determined items. The target determination item processing unit 28 uses the similarity calculation unit 29 depending on the target item.
The session feature calculation unit 22 can also calculate the history of visits by the user terminals 100 to the web server 200 as data indicating session features, for each page of the content 210 accessed by a user terminal 100, for each tag in the page attribute data 43, or for each poster who provides content 210.
The access feature extraction unit 27 receives the extraction target data from the session feature calculation unit 22 and the items and ranges from the target calculation unit 23, and extracts user features and the features of the accessed content 210 (page features).
The access feature extraction unit 27 first calculates the users' feature amounts as user features, based on the extraction target data from the session feature calculation unit 22 and on the items and the ranges of their values from the target calculation unit 23. The access feature extraction unit 27 also receives the attribute data of the poster (poster attribute data 44) and the attribute data of the content 210 that the poster provided to the web server 200 (page attribute data 43), and extracts the feature amounts of the content 210 relating to user access and the like as page features.
The user features and page features extracted by the access feature extraction unit 27 are sent to the posting terminal 300. The access feature extraction unit 27 can also display the extracted user features and page features on the output device 15.
The user features extracted by the access feature extraction unit 27 can include, as feature amounts, for example, the ratio of industries among the users who accessed the content 210 of the poster to be extracted, and session characteristics (such as how many repeat visits were made).
The page features extracted by the access feature extraction unit 27 can include, as feature amounts, for example, the ratio of tags among the accessed content 210 and the average stay time on each page.
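As an illustration of the ratio-type feature amounts mentioned above (industry ratio for user features, tag ratio for page features), the following sketch computes the share of each category in a list. The function name `ratios` and the sample values are assumptions for illustration only and do not come from the patent.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def ratios(values: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Return the share of each distinct value; empty input gives {}."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: count / total for key, count in counts.items()}


# Hypothetical industries of the users who accessed one poster's content.
industries = ["manufacturing", "manufacturing", "retail", "chemicals"]
industry_ratio = ratios(industries)

# Hypothetical tags of the pages those users accessed.
tags = ["cutting", "cutting", "cutting", "welding"]
tag_ratio = ratios(tags)
```

The same helper serves both feature types because each reduces to counting categories over a filtered set of accesses.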
The learning unit 31 takes the session data 41, the user attribute data 42, the poster attribute data 44, the page attribute data 43, and the poster target data 45 as input and performs machine learning to generate the target determination model 25. The target determination model 25 is generated in advance, before the user features 51 and page features 52 are extracted.
<Data>
Next, the data used by each program will be described. FIG. 4 is a diagram showing an example of the session data 41. The session data 41 is history information that the target user feature extraction server 1 collects from the web server 200 at a predetermined cycle or the like.
The session data 41 is a table in which each record contains an ID 411, an access time 412, a visited page 413, a repeat count 414, and a leave time 415.
The ID 411 stores the identifier of the user terminal 100. The ID 411 is a value assigned by the web server 200 and only needs to be unique within the target user feature extraction system.
The access time 412 stores the date and time when the user terminal 100 started accessing the page. The visited page 413 stores the URL of the content 210 accessed by the user terminal 100.
The repeat count 414 stores the cumulative number of times the page has been accessed. The leave time 415 stores the time when the user terminal 100 finished viewing the page.
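One record of the session data 41 could be modeled as follows. This is a minimal sketch: the class and field names are hypothetical, the sample values are invented, and deriving the stay time as leave time minus access time is an assumption rather than something the text states explicitly.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SessionRecord:
    user_id: str            # ID 411: identifier of the user terminal
    access_time: datetime   # access time 412: when access to the page began
    visited_page: str       # visited page 413: URL of the accessed content
    repeat_count: int       # repeat count 414: cumulative accesses to the page
    leave_time: datetime    # leave time 415: when viewing of the page ended

    def stay_seconds(self) -> float:
        # Assumed derivation: time spent on the page for this visit.
        return (self.leave_time - self.access_time).total_seconds()


record = SessionRecord(
    user_id="U001",
    access_time=datetime(2020, 1, 6, 9, 0, 0),
    visited_page="https://example.com/articles/1",
    repeat_count=3,
    leave_time=datetime(2020, 1, 6, 9, 8, 20),
)
```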
FIG. 5 is a diagram showing an example of the user attribute data 42. The user attribute data 42 is a table set up on the target user feature extraction server 1. Each record contains an ID 421, an IP 422, an industry 423, and sales 424.
The ID 421 stores the identifier of the user terminal 100 and has the same value as the ID 411 of the session data 41. The IP 422 stores the IP address of the user terminal 100.
The industry 423 stores the industry of the company (or organization) of the user who uses the user terminal 100. Because the company to which the user belongs can be identified from the IP address of the user terminal 100, the industry 423 can be determined from information on that company. The sales 424 stores the sales revenue of the company to which the user belongs.
The industry and sales revenue of the user who uses the user terminal 100 may be set by an administrator of the target user feature extraction server 1 or the like, or may be set from a database prepared in advance.
FIG. 6 is a diagram showing an example of the extraction target data 50. The extraction target data 50 is intermediate data calculated by the session feature calculation unit 22. In the illustrated example, the number of views and the average stay time are output from the target calculation unit 23 as the items of the extraction target data that identify the user group.
For the illustrated extraction target data 50, the session feature calculation unit 22 aggregates, for each poster, the pages viewed by each user from the session data 41 within the period accepted by the processing target selection unit 21, and joins the result with the user attribute data 42.
The extraction target data 50 is a table in which each record contains an ID 501, a poster 502, a number of views 503, an average stay time 504, and an industry 505.
The ID 501 stores the identifier of the user terminal 100 and has the same value as the ID 411 of the session data 41. The poster 502 stores the identifier of the poster of the content 210 viewed by the user with that ID 501. The poster identifier is information set in advance for each page making up the content 210 and is acquired from the page attribute data 43 sent from the web server 200.
The number of views 503 stores the total number of pages provided by the poster 502 that the user with that ID 501 viewed. The average stay time 504 stores the average time that the user with that ID 501 spent on (viewing) the pages provided by the poster 502. The industry 505 stores the industry 423 from the user attribute data 42.
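The aggregation that produces the extraction target data 50 can be sketched as a group-by over (user, poster) followed by a join with the user attributes. All names and sample rows below are hypothetical, and the per-visit stay time is assumed to have been derived already from the access and leave times.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical inputs: one (user ID, page URL, stay seconds) tuple per visit.
visits: List[Tuple[str, str, float]] = [
    ("U001", "/a/1", 600.0),
    ("U001", "/a/2", 400.0),
    ("U002", "/b/1", 120.0),
]
page_to_poster: Dict[str, str] = {"/a/1": "A", "/a/2": "A", "/b/1": "B"}
user_industry: Dict[str, str] = {"U001": "manufacturing", "U002": "retail"}


def build_extraction_target(visits, page_to_poster, user_industry) -> List[dict]:
    """Aggregate visits per (user, poster) and attach the user's industry."""
    stays = defaultdict(list)
    for user_id, url, seconds in visits:
        stays[(user_id, page_to_poster[url])].append(seconds)
    rows = []
    for (user_id, poster), secs in sorted(stays.items()):
        rows.append({
            "id": user_id,                             # ID 501
            "poster": poster,                          # poster 502
            "views": len(secs),                        # number of views 503
            "avg_stay_seconds": sum(secs) / len(secs), # average stay time 504
            "industry": user_industry.get(user_id),    # industry 505
        })
    return rows


rows = build_extraction_target(visits, page_to_poster, user_industry)
```

The poster for each page is looked up from a mapping that stands in for the page attribute data 43.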
FIG. 7 is a diagram showing an example of the range conversion information 46. The range conversion information 46 is a table for converting the qualitative information in the poster target data 45 into the items to be extracted and the ranges of their values.
The range conversion information 46 defines in advance, for each target type 461 that classifies the user group a poster wants to acquire, the items of the extraction target data 50 calculated from the session data 41, the user attribute data 42, and the like, together with a data range 462. The target type 461 is a value of the target information in the poster target data 45.
In the illustrated example, the target types 461 are "new", "existing", "people who take time to read", "repeater", "good customer", and "people interested in cutting".
The "new" target type 461 indicates that the poster provides content 210 and advertisements 220 to the website of the web server 200 in order to acquire new users. In this embodiment, a range 462 is set in advance that treats users who have viewed the poster's content 210 50 times or fewer as "new" users.
The "existing" target type 461 indicates that the poster provides information to the web server 200 in order to cultivate existing users. In this embodiment, a range 462 is set in advance that treats users who have viewed the poster's content 210 more than 50 times as "existing" users.
The "people who take time to read" target type 461 indicates that the poster provides content 210 to the web server 200 in order to acquire users who spend time viewing the poster's content 210. In this embodiment, a range 462 is set in advance that treats users whose average stay time 504 on the poster's content 210 is 500 seconds or more per page as matching users.
The "repeater" target type 461 indicates that the poster provides information to the web server 200 in order to acquire users who view the poster's content 210 repeatedly. In this embodiment, a range 462 is set in advance that treats users whose repeat count 414 for the poster's content 210 is 2 or more and whose visit interval is one week or less as matching users.
For the "good customer" target type 461, a range 462 is set in advance that treats, among the users who access the poster's content 210, those whose company has sales 424 of 1 billion yen or more as matching users.
For the "people interested in cutting" target type 461, a range 462 is set in advance that treats users who accessed pages of the poster's content 210 carrying the "cutting" tag as matching users.
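The example ranges above can be written as one predicate per target type. The thresholds are the ones given in this embodiment, while the dictionary name `RANGE_RULES`, the row field names, and the sample user are hypothetical. Expressing the visit interval in days is also an assumption.

```python
from typing import Any, Callable, Dict, List

UserRow = Dict[str, Any]

# One rule per target type 461, using the ranges 462 stated in the embodiment.
RANGE_RULES: Dict[str, Callable[[UserRow], bool]] = {
    "new": lambda r: r["views"] <= 50,
    "existing": lambda r: r["views"] > 50,
    "people who take time to read": lambda r: r["avg_stay_seconds"] >= 500,
    "repeater": lambda r: r["repeat_count"] >= 2 and r["visit_interval_days"] <= 7,
    "good customer": lambda r: r["sales_yen"] >= 1_000_000_000,
    "people interested in cutting": lambda r: "cutting" in r["tags"],
}


def matching_types(row: UserRow) -> List[str]:
    """Return every target type whose range the user row satisfies."""
    return [name for name, rule in RANGE_RULES.items() if rule(row)]


# Hypothetical aggregated row for one user and one poster.
sample_user: UserRow = {
    "views": 12,
    "avg_stay_seconds": 620.0,
    "repeat_count": 3,
    "visit_interval_days": 5,
    "sales_yen": 250_000_000,
    "tags": {"cutting", "welding"},
}
matched = matching_types(sample_user)
```

A user can satisfy several target types at once, so the helper returns a list rather than a single label.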
 なお、投稿者ターゲットデータ45のターゲット情報に対応するターゲットタイプ461が範囲変換情報46に存在しない場合、範囲変換部24は、後述するように、ターゲット判定モデル25にセッションデータ41とユーザ属性データ42とページ属性データ43及び投稿者属性データ44を入力して、項目と範囲を生成させる。 When the target type 461 corresponding to the target information of the poster target data 45 does not exist in the range conversion information 46, the range conversion unit 24 adds the session data 41 and the user attribute data 42 to the target determination model 25 as described later. And page attribute data 43 and poster attribute data 44 are input to generate items and ranges.
Although not illustrated, the page attribute data 43 is a table that holds, for each page of the content 210, a URL, a tag indicating the type of the content 210, and the identifier of the poster who provides the content 210. The page attribute data 43 may also include static information such as the words used in the content 210, or feature quantities of sentences and articles calculated with word2vec or a similar method.
Although not illustrated, the poster target data 45 holds the poster's identifier and the target information selected in advance by the poster. The target information in the poster target data 45 corresponds to the values of the target type 461 in the range conversion information 46 described above, but a value not included in the target type 461 of the range conversion information 46 can also be set. In addition to qualitative information, the poster target data 45 can be set with information that includes an item and a range of values. Although not illustrated, the poster attribute data 44 stores the poster's identifier, the poster's industry, and the department to which the poster belongs.
<Extraction process>
An example of the processing performed by the target user feature extraction server 1 is described below. FIG. 3 is a diagram showing an overview of that processing. The processing starts in response to a command from a user of the target user feature extraction server 1.
The processing target selection unit 21 accepts the period and the posters to be extracted. As described above, if no poster is entered, all posters on the web server 200 are subject to extraction.
First, the target calculation unit 23 receives the posters from the processing target selection unit 21, acquires the target type of each poster from the poster target data 45, and determines the item and value range corresponding to the target information from the range conversion information 46 or the target determination model 25.
Using the range conversion unit 24, the target calculation unit 23 determines the items and ranges of the extraction target data 50 for each poster, outputs the items to the session feature calculation unit 22, and outputs the ranges to the access feature extraction unit 27.
As described above, if no target type 461 corresponding to the target information exists in the range conversion information 46, the range conversion unit 24 inputs the session data 41, the user attribute data 42, the page attribute data 43, and the poster attribute data 44 into the target determination model 25 and has the model determine the item and range to be extracted.
Because the range conversion unit 24 generates the item and range with the target determination model 25 in this case, the access feature extraction unit 27 can still extract user features that match the target information even when the range conversion information 46 has no matching target type 461.
The target determination model 25 is a model generated in advance by machine learning. The learning unit 31 of the target user feature extraction server 1 generates the target determination model 25 by performing machine learning on the session data 41 and the user attribute data 42 of the user terminals 100 together with the poster attribute data 44 and the page attribute data 43.
The session feature calculation unit 22 acquires the session data 41 within the period received from the processing target selection unit 21, and acquires the user attribute data 42 corresponding to the ID 411 of the session data 41.
The session feature calculation unit 22 receives the items from the target calculation unit 23 and generates, from the session data 41 and the user attribute data 42 within the specified period, the extraction target data 50 containing the specified items.
The items of the extraction target data 50 are determined according to the content of the range 462 corresponding to the target type 461 in the range conversion information 46, or according to the output of the target determination model 25. The generated extraction target data 50 is output to the access feature extraction unit 27. The session feature calculation unit 22 may generate the extraction target data 50 separately for each poster to be extracted, or may generate a single set of extraction target data 50 containing all the items for all posters to be extracted.
The access feature extraction unit 27 receives the value ranges to be extracted from the target calculation unit 23 and the extraction target data 50 from the session feature calculation unit 22. Applying a well-known or publicly known analysis technique, the access feature extraction unit 27 extracts, for each poster, the user features in the extraction target data 50 that fall within the specified range 462, and outputs them as the user feature 51, a feature quantity of the sessions.
For example, when using a feature extraction model generated by machine learning, the access feature extraction unit 27 uses the poster's target type as the explanatory variable and the range of the number of views as the objective variable to estimate the user features included in the target information.
The access feature extraction unit 27 also acquires the poster attribute data 44 and the page attribute data 43, extracts the pages accessed by the users included in the extraction target data 50, and outputs them as the page feature 52, which indicates a feature quantity of the sessions. The page feature 52 can likewise be estimated by machine learning in the same manner as above. The access feature extraction unit 27 is not limited to a machine learning model and may instead apply statistical values such as the mean or the median.
FIG. 23 is a diagram showing an example of an extraction result screen 600 for the user feature 51 and the page feature 52 extracted by the access feature extraction unit 27. FIG. 24 is a diagram showing an example of the session data 41 analyzed by the access feature extraction unit 27.
FIG. 24 shows an example in which users 1 to 3 of the user terminals 100 access pages A1 and A2 of poster A and page B1 of poster B, and in which the page features of page D1 of poster D are similar to those of pages A1, A2, and B1.
FIG. 23 shows an example in which the extraction results of the user feature 51 and the page feature 52 are displayed for the users who matched the target type 461 of poster A, the poster selected for extraction.
In this example, users 1 to 3 shown in FIG. 24 matched the target type 461 of poster A. The user feature 51 shows that 67% of users 1 to 3 belong to the metal industry and 33% to material manufacturers, and that a high repeat count 414 is extracted as a characteristic of their access.
The page feature 52 for the pages accessed by users 1 to 3 shows that the tags in the page attribute data 43 include "metal" and "processing", and that a long average stay time 504 is a characteristic of the session data 41.
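The industry percentages in the user feature 51 can be reproduced with a simple count over the matched users. The sketch below assumes that two of the three users are in the metal industry and one is a material manufacturer, which is consistent with the 67% and 33% figures in the text. The assignment of industries to individual users and the function name `industry_share` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable

def industry_share(industries: Iterable[str]) -> Dict[str, int]:
    """Return the percentage of users per industry, rounded to an integer."""
    items = list(industries)
    counts = Counter(items)
    return {name: round(100 * n / len(items)) for name, n in counts.items()}

# Hypothetical assignment of users 1 to 3 to industries.
matched_users = {
    "user1": "metal industry",
    "user2": "metal industry",
    "user3": "material manufacturer",
}
share = industry_share(matched_users.values())
print(share)  # {'metal industry': 67, 'material manufacturer': 33}
```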
Through the above processing, the target user feature extraction server 1 can extract, from the extraction target data 50, user features that match the target information (the poster's preference), based on the ID 411, the visited page 413, and the time information (412, 415) of each session, the industry in the user attribute data 42, and the tags and posters in the page attribute data 43.
The poster's target information is, for example, a qualitative value such as "target new customers". The item and range obtained by converting this target information into quantitative terms are, for example, "industries for which the number of views of the poster's articles is 30 or more and less than 50, and whose distance (similarity) from the poster's attributes is 10 or more".
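The quantitative form of "target new customers" quoted above amounts to a half-open interval on the view count combined with a lower bound on the distance. A minimal sketch of that check follows. The class `IndustryStats` and the function `is_new_customer_target` are hypothetical names, and the unit of the distance is left unspecified, as in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IndustryStats:
    industry: str
    views: int        # views of the poster's articles by this industry
    distance: float   # distance (similarity) from the poster's attributes

def is_new_customer_target(stats: IndustryStats,
                           min_views: int = 30,
                           max_views: int = 50,
                           min_distance: float = 10.0) -> bool:
    """True when 30 <= views < 50 and distance >= 10, per the example range."""
    return min_views <= stats.views < max_views and stats.distance >= min_distance

candidates = [
    IndustryStats("industry a", views=30, distance=10.0),  # on both boundaries
    IndustryStats("industry b", views=50, distance=25.0),  # views out of range
    IndustryStats("industry c", views=40, distance=9.9),   # too close
]
selected = [c.industry for c in candidates if is_new_customer_target(c)]
```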
The session features (user features) to be extracted by the access feature extraction unit 27 are of two kinds. One is the number of visits (views) to the poster's content 210 for each user industry. The other is a feature indicating the distance between the users' industry attributes, for which the similarity calculated from the users' industries and the number of visits per tag of the page attribute data 43 can be used.
For this purpose, the data processing unit 30 calculates the page visit counts by totaling, for each item of the poster's content 210, the visits for each attribute (industry 423) linked to the user's ID 411. For the distance, the data processing unit 30 calculates the distance between feature quantities using a similarity calculation method such as multidimensional scaling. The extraction target data 50 is composed of these data.
For such extraction target data 50, the access feature extraction unit 27 can present the users' industries and visit counts as the user feature 51, that is, as the session features matching the poster's preference "target new customers". Furthermore, when the session data is narrowed down by the user industries included in the session features and by the link destinations of the poster's content 210, the access feature extraction unit 27 can extract the features of the pages visited by users in those industries and output them as the page feature 52.
As the range 462, in addition to the above, the distance between the industry feature quantities (similarity) of the users who accessed the visited page 413 may be calculated in advance using the industry 423 of the user attribute data 42. The access feature extraction unit 27 can then present, for each poster of the content 210, groups of users formed according to that distance as the user feature 51.
FIG. 8 is a flowchart showing an example of the processing performed by the session feature calculation unit 22 shown in FIG. 3. On receiving the period from the processing target selection unit 21 and the items from the target calculation unit 23, the session feature calculation unit 22 performs the following processing.
The session feature calculation unit 22 acquires the data within the received period from the session data 41 (S1). Next, it acquires the user attribute data 42 of the users (user terminals 100) included in the session data 41 within the specified period (S2).
The session feature calculation unit 22 joins the session data 41 acquired in step S1 with the user attribute data 42 whose user IDs 411 and 421 match, generating joined data (S3).
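Steps S1 to S3 describe a period filter followed by a join on the user ID. The sketch below is one way to express that with plain dictionaries. The record layout and the function `join_sessions_with_users` are hypothetical, and dropping sessions that have no matching user attributes (an inner join) is an assumption, since the text does not say how such sessions are handled.

```python
from datetime import datetime
from typing import Any, Dict, List

Record = Dict[str, Any]

def join_sessions_with_users(sessions: List[Record], users: List[Record],
                             start: datetime, end: datetime) -> List[Record]:
    """Filter sessions by period (S1), look up user attributes (S2), join (S3)."""
    users_by_id = {u["id"]: u for u in users}
    joined = []
    for s in sessions:
        if not (start <= s["access_time"] <= end):   # S1: keep the period only
            continue
        attrs = users_by_id.get(s["id"])             # S2: attributes for this ID
        if attrs is None:                            # assumption: inner join
            continue
        joined.append({**s, **attrs})                # S3: joined record
    return joined

sessions = [
    {"id": "u1", "page": "A1", "access_time": datetime(2021, 1, 10, 9, 0)},
    {"id": "u2", "page": "B1", "access_time": datetime(2021, 2, 1, 9, 0)},
    {"id": "u9", "page": "A2", "access_time": datetime(2021, 1, 15, 9, 0)},
]
users = [{"id": "u1", "industry": "metal"}, {"id": "u2", "industry": "materials"}]
joined = join_sessions_with_users(sessions, users,
                                  datetime(2021, 1, 1), datetime(2021, 1, 31))
```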
The session feature calculation unit 22 determines whether the items received from the target calculation unit 23 are included in the joined data generated in step S3 (S4). If the joined data contains all of the items to be extracted, the session feature calculation unit 22 outputs the joined data as the extraction target data 50 as is. If the joined data does not contain all of the items to be extracted, the processing proceeds to step S5, where the data processing unit 30 generates the data for the received items from the joined data.
The data processing unit 30 generates, for each user, the data for the items to be extracted that were determined by the target calculation unit 23, using the joined data.
For example, when the item is the average stay time, the data processing unit 30 calculates the difference between the leave time 415 and the access time 412 for the records of the session data 41 whose ID 411 and visited page 413 match, and takes the mean over the same visited page 413 as the average stay time. The data processing unit 30 may also refer to the page attribute data 43 to identify the poster (identifier) of each visited page 413 and calculate the average stay time for each poster.
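The average stay time calculation can be sketched as grouping records by user ID and visited page, then averaging the difference between leave time and access time. The record keys (`id`, `page`, `access_time`, `leave_time`) and the function `average_stay_seconds` are hypothetical names standing in for ID 411, visited page 413, access time 412, and leave time 415.

```python
from collections import defaultdict
from datetime import datetime
from typing import Any, Dict, List, Tuple

def average_stay_seconds(sessions: List[Dict[str, Any]]) -> Dict[Tuple[str, str], float]:
    """Mean of (leave_time - access_time) in seconds per (user ID, visited page)."""
    durations = defaultdict(list)
    for s in sessions:
        stay = (s["leave_time"] - s["access_time"]).total_seconds()
        durations[(s["id"], s["page"])].append(stay)
    return {key: sum(values) / len(values) for key, values in durations.items()}

records = [
    {"id": "u1", "page": "A1",
     "access_time": datetime(2021, 1, 10, 9, 0, 0),
     "leave_time": datetime(2021, 1, 10, 9, 6, 40)},    # 400 s
    {"id": "u1", "page": "A1",
     "access_time": datetime(2021, 1, 11, 10, 0, 0),
     "leave_time": datetime(2021, 1, 11, 10, 10, 0)},   # 600 s
]
avg = average_stay_seconds(records)
print(avg[("u1", "A1")])  # 500.0
```

Averaging per poster, as the text also allows, would only require replacing the page in the grouping key with the poster identifier looked up from the page attribute data.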
Next, in step S6, the session feature calculation unit 22 outputs the data generated for each item to the access feature extraction unit 27 as the extraction target data 50.
Through the above processing, the session feature calculation unit 22 calculates, from the session data 41 and the user attribute data 42 within the specified period, the data for the items used to evaluate the target information, and outputs it as the extraction target data 50.
FIG. 9 is a flowchart showing an example of the processing performed by the target calculation unit 23 shown in FIG. 3. The target calculation unit 23 starts the following processing on receiving a poster from the processing target selection unit 21.
The target calculation unit 23 acquires the target information for the received poster from the poster target data 45 (S11). It then determines whether the acquired target information includes an item and a range (or threshold) of values (S12). If it includes an item and a range, the processing proceeds to step S14; otherwise it proceeds to step S13.
Step S13 handles the case where the target information is qualitative. In this case, the target calculation unit 23 uses the range conversion unit 24 to convert the qualitative information into an item and a range. In step S14, the resulting item and value range are output to the session feature calculation unit 22 and the access feature extraction unit 27.
FIG. 10 is a flowchart showing an example of the processing performed by the range conversion unit 24 of the target calculation unit 23. The target calculation unit 23 determines whether range conversion information 46 corresponding to the target information exists (S21). If it exists, the processing proceeds to step S22; if not, it proceeds to step S23.
In step S22, the range conversion unit 24 refers to the range conversion information 46, acquires the range 462 for the target type 461 corresponding to the target information, and determines the item and value range set in the range 462.
In step S23, the range conversion unit 24 inputs the session data 41, the user attribute data 42, the page attribute data 43, and the poster attribute data 44 into the target determination model 25 and has the model determine the item and range to be extracted.
Through the above processing, when the target information in the poster target data 45 is qualitative, the item and value range to be extracted are determined by the range conversion information 46 or by the target determination model 25.
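The branching in FIGS. 9 and 10 can be summarized as a three-way decision: pass quantitative target information through, look qualitative information up in the conversion table, and otherwise ask the model. The sketch below illustrates only that control flow. The function `resolve_item_and_range` is hypothetical, qualitative information is assumed to be a string, and `model_fallback` is a stub standing in for the trained target determination model 25, whose real inputs are the four data sets named in the text.

```python
from typing import Callable, Dict, Tuple, Union

ItemRange = Tuple[str, Tuple[float, float]]   # (item name, (lower, upper))

def resolve_item_and_range(target_info: Union[str, ItemRange],
                           range_table: Dict[str, ItemRange],
                           model_fallback: Callable[[], ItemRange]) -> ItemRange:
    """Decide the item and value range for one poster's target information."""
    if not isinstance(target_info, str):      # S12: already item + range
        return target_info                    # S14: output as is
    if target_info in range_table:            # S21: conversion entry exists?
        return range_table[target_info]       # S22: use the preset range
    return model_fallback()                   # S23: let the model decide

INF = float("inf")
table = {"repeater": ("repeat_count", (2.0, INF))}
stub_model = lambda: ("avg_stay_sec", (500.0, INF))   # hypothetical model output

from_table = resolve_item_and_range("repeater", table, stub_model)
from_model = resolve_item_and_range("new customers", table, stub_model)
passthrough = resolve_item_and_range(("views", (30.0, 50.0)), table, stub_model)
```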
FIG. 11 is a diagram showing an example of the processing of the target determination item processing unit 28, which is performed within the range conversion unit 24 of the target calculation unit 23. When using the target determination model 25, the range conversion unit 24 has the target determination item processing unit 28 process the user data 510, which consists of the session data 41 and the user attribute data 42, and performs the statistical processing described later for each item (page) of the poster's content 210 (S231). The session data 41 used here is the data within the period received from the processing target selection unit 21.
Next, the range conversion unit 24 joins the result of the target determination item processing with the poster data 520, which includes the page attribute data 43, the poster attribute data 44, and the poster target data 45 (S232). The page attribute data 43 used here is the data for the visited pages 413 included in the session data 41 within the period received from the processing target selection unit 21.
The data obtained by joining the target determination item processing result for the user data 510 with the poster data 520 is then given to the target determination model 25, which determines the item and value range to be extracted.
FIG. 12 is a flowchart showing an example of the processing performed by the target determination item processing unit 28. This processing is executed in step S231 of FIG. 11.
The target determination item processing unit 28 acquires the user data 510 shown in FIG. 11 (S31). It then determines whether to use the user attribute data 42 (S32). Whether the user attribute data 42 is used can be set in advance, for example, for each poster identifier in the poster target data 45.
Referring to the poster target data 45, the target determination item processing unit 28 proceeds to step S33 if the user attribute data 42 is to be used, and to step S36 if it is not.
In step S33, the target determination item processing unit 28 acquires the industry 423 from the user attribute data 42, the visited page 413 from the session data 41, and the tags from the page attribute data 43, and calculates feature quantities for the industries 423. It then calculates the distances between the users' industries 423 in the space of the calculated feature quantities, using multidimensional scaling (MDS) or a similar method, and uses these distances as the similarity.
In this processing, as shown in FIG. 19, the number of views by each industry 423 of the user attribute data 42 is totaled for each tag of the page attribute data 43 and for each poster, generating view count data 530. The view count data 530 of FIG. 19 holds the total number of views by each industry 423 for each tag of the content 210.
In FIG. 19, the totals of the number of views by users in industries a to d are stored for the pages of poster A's content 210 that carry tag A. The view count data 530 of FIG. 19 can express, for each tag of poster A, how much interest users in each industry 423 have.
The target determination item processing unit 28 calculates feature quantity 1 and feature quantity 2 from the view count data 530 of FIG. 19 using multidimensional scaling, and the industries 423 are placed in the space of feature quantities 1 and 2 as shown in FIG. 20. FIG. 20 is a map that expresses the distances between the industries 423, represented by feature quantities 1 and 2, as the similarity. The illustrated example shows the similarity calculated from the view count data 530 for poster A's content 210.
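The aggregation behind the view count data 530 and the pairwise distances that a multidimensional scaling step would take as input can be sketched as follows. This block does not perform the MDS embedding itself. It only totals views per industry and tag and computes Euclidean distances between the resulting industry profiles. The choice of Euclidean distance, the function names, and the sample counts are assumptions.

```python
from collections import defaultdict
from math import dist
from typing import Dict, Iterable, Tuple

def view_counts(views: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, int]]:
    """Total page views per industry and tag from (tag, industry) pairs."""
    table: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for tag, industry in views:
        table[industry][tag] += 1
    return table

def industry_distances(table: Dict[str, Dict[str, int]]) -> Dict[Tuple[str, str], float]:
    """Euclidean distance between the per-tag view profiles of each industry pair."""
    tags = sorted({t for row in table.values() for t in row})
    vectors = {ind: [row.get(t, 0) for t in tags] for ind, row in table.items()}
    names = sorted(vectors)
    return {(a, b): dist(vectors[a], vectors[b])
            for i, a in enumerate(names) for b in names[i + 1:]}

# Made-up page views for one poster: (tag of the page, industry of the viewer).
views = ([("tag A", "industry a")] * 3 +
         [("tag B", "industry b")] * 4 +
         [("tag A", "industry c")] * 3)
distances = industry_distances(view_counts(views))
print(distances[("industry a", "industry b")])  # 5.0
```

A full distance matrix built this way could be handed to any MDS implementation to obtain two-dimensional coordinates comparable to feature quantities 1 and 2.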
Next, in step S34 of FIG. 12, the target determination item processing unit 28 determines whether to use the session features. Whether the session features are used can be set in advance, for example, for each poster identifier in the poster target data 45.
Referring to the poster target data 45, the target determination item processing unit 28 proceeds to step S35 if the session features are to be used, and ends the processing if they are not. In step S35, the target determination item processing unit 28 performs statistical processing on the user data 510 and the poster data 520 for each page of the poster. FIG. 21 is a diagram showing an example of statistical data 540 generated by the statistical processing.
The statistical data 540 shows the result of the target determination item processing unit 28 totaling the number of views of each poster's content 210 by user industry 423. In FIG. 21, the totals of the number of views by users in industries a to d are stored for poster A's content 210. The statistical data 540 of FIG. 21 can express, for each poster, how much interest users in each industry 423 have.
FIG. 22 is a diagram showing an example of a similarity map in which the results of the statistical processing are added to the map of FIG. 21. In the figure, the size of the circle for each industry is proportional to the number of views of poster A's content by users in that industry.
As described above, the target determination item processing unit 28 outputs information in which the distances between the industries 423, obtained by statistical processing of the user attribute data 42 and the page attribute data 43, are aggregated for each poster.
In step S36, which applies when the user attribute data 42 is not used, the data processing unit 30 used by the session feature calculation unit 22 performs data processing such as calculating the stay time for each visited page 413, and outputs the result to the target determination model 25.
As described above, when the target calculation unit 23 uses the target determination model 25, the data generated by the processing of FIGS. 10 to 12 is input to the target determination model 25, so that the item and value range can be determined even when there is no range conversion information 46.
<Learning process>
Next, the learning process performed by the learning unit 31 to build the target determination model 25 is described. FIG. 13 is a diagram showing an example of selection data 550, which defines the data used to train the target determination model 25 when the user attribute data 42 and the poster attribute data 44 are not used.
The selection data 550 is a table in which each record contains an ID 5501, a target customer 5502, an average stay time 5503, and a number of views 5504. The ID 5501 stores the poster's identifier. The target customer 5502 stores the target type selected by each poster. Each poster may select the target type from preset qualitative options.
The average stay time 5503 stores a condition on the average time that users stay on (browse) the pages provided by the poster with that ID 5501. The number of views 5504 stores a condition on the total number of times users viewed the pages provided by the poster with that ID 5501.
The selection data 550 may be generated by the administrator of the target user feature extraction server 1 based on the target types received from the posters, or may be entered from the posting terminal 300.
The illustrated example uses two target types for building the target determination model 25, "new", which prioritizes new customers, and "existing", which prioritizes existing customers, and uses the average stay time 5503 and the number of views 5504 as the items for the learning process.
Plotting the selection data 550 of FIG. 13 in the space of the average stay time 5503 and the number of views 5504 gives the regions selected by the posters, as shown in FIG. 14. FIG. 14 is a graph showing an example of the selection data 550.
In FIG. 14, the regions of user features targeted by posters A and C, who selected "existing", are shown with solid lines, and the regions targeted by posters B and D, who selected "new", are shown with broken lines.
The learning unit 31 generates training data that satisfies the conditions set in the selection data 550 and gives it to the target determination model 25 for training. The training data given to the target determination model 25 may be generated from the actual session data 41 and user attribute data 42, or dummy data may be used.
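When dummy data is used, training samples can be drawn from inside the region each poster selected and labeled with that poster's target type. The sketch below shows one way to do this. The numeric conditions in `SELECTION_DATA` are hypothetical, because the actual values of FIG. 13 are not given in the text, and uniform sampling inside each region is likewise an assumption.

```python
import random
from typing import Any, Dict, List, Tuple

# Hypothetical rows of selection data 550: (lower, upper) conditions per item.
SELECTION_DATA: List[Dict[str, Any]] = [
    {"id": "A", "target": "existing", "avg_stay": (300.0, 600.0), "views": (50, 100)},
    {"id": "B", "target": "new", "avg_stay": (0.0, 120.0), "views": (1, 20)},
]

Sample = Tuple[Tuple[float, int], str]   # ((avg stay seconds, views), label)

def make_training_data(selection: List[Dict[str, Any]],
                       n_per_row: int, seed: int = 0) -> List[Sample]:
    """Draw dummy samples uniformly inside each poster's selected region."""
    rng = random.Random(seed)            # seeded for reproducibility
    samples: List[Sample] = []
    for row in selection:
        for _ in range(n_per_row):
            stay = rng.uniform(*row["avg_stay"])
            views = rng.randint(*row["views"])   # inclusive on both ends
            samples.append(((stay, views), row["target"]))
    return samples

training = make_training_data(SELECTION_DATA, n_per_row=5)
```

The resulting labeled pairs could then be passed to whichever classifier is chosen for the target determination model.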
 なお、ターゲットタイプ(新規又は既存)のセッションの特徴は、実際のデータを加工したものである必要はなく、ダミーデータでいくつかセッションの特徴を示して、複数の投稿者にターゲットタイプを選択して試行させ、試行の結果を保持したデータを用いることができる。また、ターゲットタイプに応じた領域は、予めターゲットタイプの特徴を選択データ550の項目に変換したもので、ターゲットタイプ毎にどの項目が選択されたかを図14のグラフのように出力してもよい。 The characteristics of the session of the target type (new or existing) do not have to be processed from the actual data, and some characteristics of the session are shown by dummy data, and the target type is selected for multiple contributors. It is possible to use the data in which the results of the trials are retained. Further, the area corresponding to the target type is obtained by converting the characteristics of the target type into the items of the selection data 550 in advance, and which item is selected for each target type may be output as shown in the graph of FIG. ..
The learning data may also be defined by the category table 560 and the condition table 570 shown in FIGS. 15A and 15B. FIG. 15A shows an example of the category table 560, which reflects each poster's target type (preference). FIG. 15B shows an example of the condition table 570, which sets the items and value ranges for each category.
The category table 560 of FIG. 15A includes an ID 5601, a target customer 5602, and a category number 5603 in one record. The ID 5601 stores the identifier of the poster. The target customer 5602 stores the target type selected by each poster. The target type may be selected by each poster from preset qualitative information. The category number 5603 holds the number of the region that represents the session features selected by each poster; it stores the number that the poster selected from among preset numbers.
The condition table 570 of FIG. 15B includes a target customer 5701, an average stay time 5702, and a number of views 5703 in one record. The target customer 5701 stores a number corresponding to the category number 5603 of the category table 560.
The average stay time 5702 stores a condition on the average time that users stayed on (viewed) the pages provided by the poster. The number of views 5703 stores a condition on the total number of times users viewed the pages provided by the poster.
The selection data 550 may be generated by the administrator of the target user feature extraction server 1 based on the target type received from the poster, or it may be entered from the posting terminal 300.
In the example of FIG. 15A, posters A and C, who selected "existing" as the target type, selected category number 5603 = "1", and posters B and D, who selected "new", selected category number 5603 = "2" and "3", respectively.
The region corresponding to each category number 5603 is bounded by the average stay time 5702 and the number of views 5703 of the condition table 570, and is as shown in FIG. 16. In FIG. 16, data with an average stay time of less than 100 hours falls into category "2" regardless of the number of views 5703. Data with a number of views 5703 of less than 50 and an average stay time of less than 100 hours falls into category "3", and the remaining region is category "1".
As described above, the learning data may be generated from the category table 560, which stores the posters' preferences, and the condition table 570, which determines the ranges of the data.
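The following is a minimal sketch of how dummy learning data could be generated from a category table and a condition table. The poster-to-category assignments follow the FIG. 15A example in the text; the numeric bounds in `CONDITION_TABLE`, the field names, and the function `generate_learning_data` are hypothetical placeholders and are not the values of FIG. 15B.

```python
import random

# Category table 560: poster ID -> (target type, category number),
# following the FIG. 15A example described in the text.
CATEGORY_TABLE = {
    "A": ("existing", 1),
    "B": ("new", 2),
    "C": ("existing", 1),
    "D": ("new", 3),
}

# Condition table 570: category number -> (low, high) bounds per item.
# These bounds are hypothetical placeholders, not the values of FIG. 15B.
CONDITION_TABLE = {
    1: {"avg_stay_time": (100.0, 200.0), "views": (50, 200)},
    2: {"avg_stay_time": (0.0, 100.0), "views": (50, 200)},
    3: {"avg_stay_time": (0.0, 100.0), "views": (0, 50)},
}


def generate_learning_data(samples_per_poster, seed=0):
    """Draw dummy samples inside the region each poster selected."""
    rng = random.Random(seed)  # fixed seed so the dummy data is reproducible
    rows = []
    for poster_id, (target_type, category) in CATEGORY_TABLE.items():
        cond = CONDITION_TABLE[category]
        stay_lo, stay_hi = cond["avg_stay_time"]
        view_lo, view_hi = cond["views"]
        for _ in range(samples_per_poster):
            rows.append({
                "poster": poster_id,
                "target_type": target_type,  # label for the determination model
                "avg_stay_time": rng.uniform(stay_lo, stay_hi),
                "views": rng.randrange(view_lo, view_hi),
            })
    return rows
```

Each generated row pairs a target type label with item values that lie inside the corresponding category region, which is the form of data the text describes giving to the target determination model 25.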
Next, an example is described in which, when determining the learning data for the target determination model 25, the user attribute data 42 and the poster attribute data 44 are used to apply the distance between the industries of the poster and the users, as in FIGS. 21 and 22 above.
FIG. 17 is a diagram showing an example of selection data 580 that defines the learning data for the target determination model 25. The selection data 580 is a table that includes an ID 5801, a target customer 5802, an industry 5803, a selected industry 5804, a distance 5805, and a number of views 5806 in one record. The ID 5801 stores the identifier of the poster. The target customer 5802 stores the target type selected by each poster. The target type may be selected by each poster from preset qualitative information.
The industry 5803 stores the poster's industry as set in the poster attribute data 44. The selected industry 5804 stores the user industry selected by the poster. The distance 5805 stores the similarity distance between the industries of the poster and the user. The number of views 5806 stores the total number of times users viewed the pages provided by the poster identified by the ID 5801.
The selection data 580 may be generated by the administrator of the target user feature extraction server 1 based on the target type received from the poster, or it may be entered from the posting terminal 300.
In the illustrated example, the record with ID 5801 = poster A selects "existing" as the target customer 5802 and defines as learning data the data for which poster A's industry 5803 = a, the target user industry 5804 selected by poster A = b, the distance 5805 between the industries = Lab, and the number of views 5806 is 100 or more.
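As an illustration of the poster A record described above, the sketch below models one row of the selection data 580 and uses it to pick matching rows as learning data. The class `SelectionRecord`, the function `select_learning_rows`, the row field names, and the numeric stand-in for the distance Lab are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectionRecord:
    """One record of selection data 580 (field names are hypothetical)."""
    poster_id: str          # ID 5801
    target_customer: str    # target customer 5802 ("new" or "existing")
    industry: str           # industry 5803 of the poster
    selected_industry: str  # selected industry 5804 of the target users
    distance: float         # distance 5805 between the two industries
    min_views: int          # lower bound on the number of views 5806


def select_learning_rows(record, user_rows):
    """Keep rows for this poster whose industry and view count match the record."""
    return [
        row for row in user_rows
        if row["poster"] == record.poster_id
        and row["industry"] == record.selected_industry
        and row["views"] >= record.min_views
    ]


# Poster A: "existing", own industry a, selected industry b, views of 100 or more.
# 0.4 is a placeholder for the symbolic distance Lab in the text.
record_a = SelectionRecord("A", "existing", "a", "b", 0.4, 100)
```

Calling `select_learning_rows(record_a, rows)` on a list of per-user aggregates would return only the rows for poster A's pages viewed at least 100 times by users in industry b.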
FIG. 18 shows a map of the similarity between the industry 5803 = a of poster A in the selection data 580 shown in FIG. 17 and the selected industries 5804. In the figure, the size of the circle for each industry is proportional to the number of views of poster A's content 210 by users in each of the industries a to d.
In this example, the similarity between industries is calculated from the session data 41, the user attribute data 42, and the poster attribute data 44. With reference to that similarity, information on the distance to the selected industry and on the industry itself is extracted from the poster's attributes and target type.
In the illustrated example, the similarity between user attributes is applied as the similarity between poster attributes. This rests on the assumption that, whether a party is a user or a poster, parties in similar industries behave similarly toward the tags they are interested in.
The approach is not limited to using tag data; for example, the similarity may be calculated from the session data 41 (access history) for the users' search words. Using the data extracted in these ways, the target determination model 25 is trained on the selected items, with the attributes and the target type or preference as explanatory variables and the distance between attributes and the number of visits (number of views) as objective variables. A machine learning method such as Random Forest can be used as the learning method, for example.
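The text does not state which similarity measure is used between industries. As one possible sketch, the code below counts tag views per industry from session records and takes one minus the cosine similarity as the distance. The metric, the function names, and the `(user_id, tag)` session format are assumptions for illustration only.

```python
import math
from collections import defaultdict


def industry_tag_vectors(sessions, user_industry):
    """Count tag views per industry from (user_id, tag) session records."""
    vectors = defaultdict(lambda: defaultdict(int))
    for user_id, tag in sessions:
        vectors[user_industry[user_id]][tag] += 1
    return vectors


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two sparse tag-count vectors."""
    dot = sum(count * v.get(tag, 0) for tag, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_u * norm_v)
```

Under this assumed metric, two industries whose users view the same tags in the same proportions have a distance close to 0, and industries with no tags in common have a distance of 1.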
<Conclusion>
As described above, the target user feature extraction server 1 of this embodiment determines the items and value ranges of the extraction target data 50 based on the target type desired by the poster, and generates the extraction target data 50 from the session data 41, the user attribute data 42, the page attribute data 43, and the poster attribute data 44. By inputting the value ranges and the extraction target data 50 into the access feature extraction unit 27, the user features that a poster who provides content 210 to the web server 200 wants to acquire can be extracted from the history of the users (user terminals 100) who accessed the web server 200. The target user feature extraction server 1 can also extract new users that the poster did not intend to target, which makes it possible to create new business.
Furthermore, because the target user feature extraction server 1 can extract the features of the session data 41 for the extracted user features from the page attribute data 43, it can narrow down which parts (tags) of the poster's content 210 the users showed interest in, and can thereby support marketing.
The above embodiment showed an example in which the user's industry is used as the user attribute data 42 and the poster's industry is used as the poster attribute data 44, but the invention is not limited to this. For example, the hobbies and tastes of the users and of the posters can be used as attribute data, and target user features can be extracted from such attribute data.
When determining the items and value ranges of the extraction target data 50, even if no range conversion information 46 exists for the target type reflecting the poster's preference, using the target determination model 25 makes it possible to extract the user features of the target type that the poster wants to acquire from the session data 41 and the like.
As described above, the target user feature extraction server 1 of the above embodiment can be configured as follows.
(1) A target user feature extraction method in which a computer (1) having a processor 11 and a memory 12 extracts user features that a poster aims to acquire from history information (session data 41) of access to content (210) of a web server (200), the method comprising: a user data acquisition step in which the computer (1) acquires, as user data (510), session data (41) storing history information of user terminals (100) that accessed the content (210) of the web server (200) and user attribute data (42) storing attribute information of the users of the user terminals (100); a poster data acquisition step in which the computer (1) acquires, as poster data (520), page attribute data (43) storing attributes of the content (210) and poster attribute data (44) storing attributes of the poster who provided the content (210); a preference acquisition step in which the computer (1) accepts a poster subject to extraction and acquires, as a target type (461), information on the users that the poster aims to acquire; a target calculation step (target calculation unit 23) in which the computer (1) calculates, from the poster's target type (461), items of data to be extracted and ranges of values for the items; a session feature calculation step (session feature calculation unit 22) in which the computer (1) calculates extraction target data corresponding to the items from the user data (510) and the poster data (520); and an access feature extraction step (access feature extraction unit 27) in which the computer (1) calculates access feature amounts from the extraction target data and the poster data (520) based on the ranges of values for the items.
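The target calculation, session feature calculation, and access feature extraction steps of configuration (1) can be sketched roughly as follows. The lookup table `RANGE_CONVERSION`, its thresholds, the session record fields, and all function names are hypothetical; the patent does not prescribe this implementation.

```python
from statistics import mean

# Hypothetical range conversion information: target type -> {item: (low, high)}.
# None means the bound is open on that side.
RANGE_CONVERSION = {
    "existing": {"avg_stay_time": (100.0, None), "views": (50, None)},
    "new": {"avg_stay_time": (None, 100.0), "views": (None, 50)},
}


def calculate_target(target_type):
    """Target calculation step: target type -> items and value ranges."""
    return RANGE_CONVERSION[target_type]


def calculate_session_features(sessions, poster_pages):
    """Session feature calculation step: per-user aggregates on one poster's pages."""
    stays = {}
    for s in sessions:
        if s["page"] in poster_pages:
            stays.setdefault(s["user"], []).append(s["stay_time"])
    return {
        user: {"avg_stay_time": mean(times), "views": len(times)}
        for user, times in stays.items()
    }


def in_range(value, bounds):
    """True when value lies in [low, high), with None as an open bound."""
    low, high = bounds
    return (low is None or value >= low) and (high is None or value < high)


def extract_access_features(features, ranges):
    """Access feature extraction step: keep users whose items all fall in range."""
    return {
        user: f for user, f in features.items()
        if all(in_range(f[item], bounds) for item, bounds in ranges.items())
    }
```

For a poster who selected "existing", chaining `calculate_target("existing")`, `calculate_session_features(...)`, and `extract_access_features(...)` would return the users with long average stays and many views on that poster's pages.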
With the above configuration, the target user feature extraction server 1 of this embodiment determines the items and value ranges of the extraction target data 50 based on the target type desired by the poster, and generates the extraction target data 50 from the session data 41, the user attribute data 42, the page attribute data 43, and the poster attribute data 44. By inputting the value ranges and the extraction target data 50 into the access feature extraction unit 27, the user features that a poster who provides content 210 to the web server 200 wants to acquire can be extracted from the history of the users (user terminals 100) who accessed the web server 200. The target user feature extraction server 1 can also extract new users that the poster did not intend to target, which makes it possible to create new business.
Furthermore, because the target user feature extraction server 1 can extract the features of the session data 41 for the extracted user features from the page attribute data 43, it can narrow down which parts (tags) of the poster's content 210 the users showed interest in, and can thereby support marketing.
The above embodiment showed an example in which the user's industry is used as the user attribute data 42 and the poster's industry is used as the poster attribute data 44, but the invention is not limited to this. For example, the hobbies and tastes of the users and of the posters can be used as attribute data, and target user features can be extracted from such attribute data.
(2) The target user feature extraction method according to (1) above, wherein the target calculation step (23) includes a range conversion step (range conversion unit 24) of converting, when the target type (461) is qualitative information, the qualitative information into data items and ranges of values for the items.
With the above configuration, the items of data to be extracted and the ranges of values for the items can be calculated from qualitative information, making it possible to extract user features that match the preferences of the target poster.
(3) The target user feature extraction method according to (2) above, wherein the range conversion step (24) inputs the target type (461), the user data (510), and the poster data (520) into a preset determination model (target determination model 25) and outputs the items of data to be extracted and the ranges of values for the items.
With the above configuration, by inputting the target type (461), the user data (510), and the poster data (520) into the preset target determination model 25, the items of data to be extracted and the ranges of values for the items can be calculated from qualitative information.
(4) The target user feature extraction method according to (3) above, further comprising a learning step (learning unit 31) in which the computer (1) gives the user data (510), the poster data (520), and the target type (461) to the determination model (25) for training.
With the above configuration, the learning unit 31 can generate the target determination model 25 by machine learning on the target types together with the session data 41, the user attribute data 42, the page attribute data 43, and the poster attribute data 44 acquired from the web server 200.
(5) The target user feature extraction method according to (4) above, wherein the learning step (31) includes a similarity calculation step (similarity calculation unit 29) of calculating the similarity between user attributes using the user attribute data (42).
With the above configuration, the industry 423 of the user attribute data 42 is used to calculate in advance the distances between the industry feature amounts (similarities) of the multiple users who accessed the visited page 413, so that the access feature extraction unit 27 can present, for each poster of content 210, groups of users according to distance as the user features 51.
In addition, the similarity calculation unit 29 calculates the similarity between industries from the session data 41, the user attribute data 42, and the poster attribute data 44, and the access feature extraction unit 27 can refer to this similarity to extract, from the poster's attributes and target type, information on the distance to the selected industry and on the industry itself.
The present invention is not limited to the embodiments described above and includes various modifications. For example, the above embodiments are described in detail to explain the present invention clearly, and the invention is not necessarily limited to configurations that include every element described. Part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment. For part of the configuration of each embodiment, the addition, deletion, or replacement of other configurations can be applied either alone or in combination.
Each of the above configurations, functions, processing units, processing means, and the like may be realized partly or wholly in hardware, for example by designing them as an integrated circuit. Each of the above configurations, functions, and the like may also be realized in software, with a processor interpreting and executing programs that implement the respective functions. Information such as the programs, tables, and files that implement each function can be placed in a memory, in a storage device such as a hard disk or an SSD (Solid State Drive), or on a recording medium such as an IC card, an SD card, or a DVD.
The control lines and information lines shown are those considered necessary for the explanation, and not all control lines and information lines in a product are necessarily shown. In practice, almost all of the configurations may be considered to be interconnected.

Claims (15)

  1.  A target user feature extraction method in which a computer having a processor and a memory extracts, from history information of access to content of a web server, user features that a poster aims to acquire, the method comprising:
     a user data acquisition step in which the computer acquires, as user data, session data storing history information of user terminals that accessed the content of the web server and user attribute data storing attribute information of the users of the user terminals;
     a poster data acquisition step in which the computer acquires, as poster data, page attribute data storing attributes of the content and poster attribute data storing attributes of the poster who provided the content;
     a preference acquisition step in which the computer accepts a poster subject to extraction and acquires, as a target type, information on the users that the poster aims to acquire;
     a target calculation step in which the computer calculates, from the target type of the poster, items of data to be extracted and ranges of values for the items;
     a session feature calculation step in which the computer calculates extraction target data corresponding to the items from the user data and the poster data; and
     an access feature extraction step in which the computer calculates access feature amounts from the extraction target data and the poster data based on the ranges of values for the items.
  2.  The target user feature extraction method according to claim 1, wherein
     the target calculation step includes
     a range conversion step of converting, when the target type is qualitative information, the qualitative information into data items and ranges of values for the items.
  3.  The target user feature extraction method according to claim 2, wherein
     the range conversion step
     inputs the target type, the user data, and the poster data into a preset determination model and outputs the items of data to be extracted and the ranges of values for the items.
  4.  The target user feature extraction method according to claim 3, further comprising
     a learning step in which the computer gives the user data, the poster data, and the target type to the determination model for training.
  5.  The target user feature extraction method according to claim 4, wherein
     the learning step includes
     a similarity calculation step of calculating the similarity between user attributes using the user attribute data.
  6.  A target user feature extraction system comprising:
     an extraction server having a processor and a memory;
     a web server that provides content to user terminals; and
     a posting terminal that provides the content to the web server, wherein
     the web server collects history information of access to the content by the user terminals,
     the posting terminal notifies the extraction server, as a target type, of information on the users that the poster of the content aims to acquire, and
     the extraction server includes:
     a processing target selection unit that acquires, as user data, session data storing history information of the user terminals that accessed the content of the web server and user attribute data storing attribute information of the users of the user terminals, acquires, as poster data, page attribute data storing attributes of the content and poster attribute data storing attributes of the poster who provided the content, and acquires the target type of the poster subject to extraction;
     a target calculation unit that calculates, from the target type of the poster, items of data to be extracted and ranges of values for the items;
     a session feature calculation unit that calculates extraction target data corresponding to the items from the user data and the poster data; and
     an access feature extraction unit that calculates access feature amounts from the extraction target data and the poster data based on the ranges of values for the items.
  7.  The target user feature extraction system according to claim 6, wherein
     the target calculation unit includes
     a range conversion unit that converts, when the target type is qualitative information, the qualitative information into data items and ranges of values for the items.
  8.  The target user feature extraction system according to claim 7, wherein
     the range conversion unit
     inputs the target type, the user data, and the poster data into a preset determination model and outputs the items of data to be extracted and the ranges of values for the items.
  9.  The target user feature extraction system according to claim 8, further comprising
     a learning unit that gives the user data, the poster data, and the target type to the determination model for training.
  10.  The target user feature extraction system according to claim 9, wherein
     the learning unit includes
     a similarity calculation step of calculating the similarity between user attributes using the user attribute data.
  11.  A target user feature extraction server that has a processor and a memory and extracts, from history information on accesses to content of a web server, features of the users that a poster targets for acquisition, the server comprising:
    a processing target selection unit that acquires, as user data, session data storing history information of user terminals that accessed the content of the web server and user attribute data storing attribute information of the users of the user terminals; acquires, as poster data, page attribute data storing attributes of the content and poster attribute data storing attributes of the poster who provided the content; and acquires a poster to be extracted and, as a target type, information on the users that the poster of the content targets for acquisition;
    a target calculation unit that calculates, from the target type of the poster, an item of data to be extracted and a range of values of the item;
    a session feature calculation unit that calculates extraction target data corresponding to the item from the user data and the poster data; and
    an access feature extraction unit that calculates an access feature amount from the extraction target data and the poster data based on the range of values of the item.
  12.  The target user feature extraction server according to claim 11, wherein the target calculation unit has a range conversion unit that, when the target type is qualitative information, converts the qualitative information into an item of data and a range of values of the item.
  13.  The target user feature extraction server according to claim 12, wherein the range conversion unit inputs the target type, the user data, and the poster data into a preset determination model and outputs the item of data to be extracted and the range of values of the item.
  14.  The target user feature extraction server according to claim 13, further comprising a learning unit that trains the determination model by supplying it with the user data, the poster data, and the target type.
  15.  The target user feature extraction server according to claim 14, wherein the learning unit includes a similarity calculation step of calculating the similarity between user attributes using the user attribute data.
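As a rough illustration of how the units recited in claims 11 to 15 could fit together, the following is a minimal Python sketch and not the implementation disclosed in the application. Every name in it is hypothetical: the `Session` record, the `QUALITATIVE_TARGETS` lookup table (a stand-in for the range conversion unit, which the claims describe as using a determination model), the choice of session share and mean dwell time as the access feature amount, and the use of Jaccard similarity for the similarity between user attributes are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One access record (hypothetical schema for the session data)."""
    user_id: str
    page_id: str
    dwell_seconds: float


# Assumption: a fixed lookup table stands in for the range conversion unit.
# The claims describe a determination model here; this table is only a placeholder.
QUALITATIVE_TARGETS = {
    "young adults": ("age", (20, 34)),
    "seniors": ("age", (65, 120)),
}


def target_to_range(target_type):
    """Target calculation: map a qualitative target type to (item, (low, high))."""
    return QUALITATIVE_TARGETS[target_type]


def extract_target_sessions(sessions, user_attrs, page_owner, poster, item):
    """Session feature calculation: keep the poster's sessions and attach the item value."""
    rows = []
    for s in sessions:
        if page_owner.get(s.page_id) != poster:
            continue  # content provided by another poster
        value = user_attrs.get(s.user_id, {}).get(item)
        if value is not None:
            rows.append((s, value))
    return rows


def access_features(rows, value_range):
    """Access feature extraction: summarize sessions whose item value is in range."""
    low, high = value_range
    in_range = [s for s, v in rows if low <= v <= high]
    total = len(rows)
    return {
        "sessions": total,
        "target_sessions": len(in_range),
        "target_share": len(in_range) / total if total else 0.0,
        "mean_target_dwell": (
            sum(s.dwell_seconds for s in in_range) / len(in_range) if in_range else 0.0
        ),
    }


def attribute_similarity(a, b):
    """Jaccard similarity over (attribute, value) pairs (an assumed similarity measure)."""
    pa, pb = set(a.items()), set(b.items())
    union = pa | pb
    return len(pa & pb) / len(union) if union else 0.0


# Made-up example data, for illustration only.
sessions = [
    Session("u1", "p1", 30.0),
    Session("u2", "p1", 90.0),
    Session("u3", "p2", 10.0),
    Session("u4", "p1", 60.0),
]
user_attrs = {
    "u1": {"age": 25, "region": "Tokyo"},
    "u2": {"age": 40, "region": "Osaka"},
    "u3": {"age": 30, "region": "Tokyo"},
    "u4": {"age": 31, "region": "Tokyo"},
}
page_owner = {"p1": "posterA", "p2": "posterB"}

item, value_range = target_to_range("young adults")
rows = extract_target_sessions(sessions, user_attrs, page_owner, "posterA", item)
features = access_features(rows, value_range)
print(features)
print(attribute_similarity(user_attrs["u1"], user_attrs["u4"]))
```

In this sketch the qualitative target "young adults" is turned into the item `age` with the range 20 to 34, the sessions on posterA's content are joined with that item, and the resulting summary plays the role of the access feature amount. A trained model could replace the lookup table without changing the other functions.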
PCT/JP2021/001917 2020-03-09 2021-01-20 Target user feature extraction method, target user feature extraction system, and target user feature extraction server WO2021181900A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202180008023.5A CN114902196A (en) 2020-03-09 2021-01-20 Target user feature extraction method, target user feature extraction system and target user feature extraction server
DE112021000337.2T DE112021000337T5 (en) 2020-03-09 2021-01-20 Target user trait extraction method, target user trait extraction system and target user trait extraction server

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2020039825A JP2021140646A (en) 2020-03-09 2020-03-09 Target user feature extraction method, target user feature extraction system and target user feature extraction server
JP2020-039825 2020-03-09

Publications (1)

Publication Number Publication Date
WO2021181900A1 (en) 2021-09-16

Family

ID=77668773

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/001917 WO2021181900A1 (en) 2020-03-09 2021-01-20 Target user feature extraction method, target user feature extraction system, and target user feature extraction server

Country Status (4)

Country Link
JP (1) JP2021140646A (en)
CN (1) CN114902196A (en)
DE (1) DE112021000337T5 (en)
WO (1) WO2021181900A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2023141928A (en) * 2022-03-24 2023-10-05 株式会社博報堂Dyホールディングス Information processing system, computer program and information processing method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002082931A (en) * 2000-09-07 2002-03-22 Toppan Forms Co Ltd System for preparing on-demand document
JP2019020944A (en) * 2017-07-14 2019-02-07 富士ゼロックス株式会社 Display device and program
JP2019117671A (en) * 2019-05-07 2019-07-18 ヤフー株式会社 Information processing device, information processing method, and information processing program

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002041396A (en) * 2000-07-28 2002-02-08 Bell-Park Co Ltd Advertisement distributor system and advertisement distributing method
US20050240352A1 (en) * 2004-04-23 2005-10-27 Invitrogen Corporation Online procurement of biologically related products/services using interactive context searching of biological information
JP5089684B2 (en) * 2007-04-06 2012-12-05 インターナショナル・ビジネス・マシーンズ・コーポレーション Technology for generating service programs
JP2009157500A (en) * 2007-12-25 2009-07-16 Ntt Docomo Inc Distribution server and distribution method
CN101996215B (en) * 2009-08-27 2013-07-24 阿里巴巴集团控股有限公司 Information matching method and system applied to e-commerce website
JP5401261B2 (en) 2009-10-30 2014-01-29 株式会社日立製作所 Information recommendation method and apparatus
JP5124680B1 (en) * 2011-11-30 2013-01-23 楽天株式会社 Information processing apparatus, information processing method, information processing program, and recording medium
JP2014010627A (en) * 2012-06-29 2014-01-20 Konami Digital Entertainment Co Ltd Management device, service provision system, control method for management device and program for management device
CN104737209A (en) * 2012-11-22 2015-06-24 索尼公司 Information processing device, system, information processing method and program
JP6280381B2 (en) 2014-02-07 2018-02-14 Kddi株式会社 Item presentation apparatus, item presentation method, and program
JP6264946B2 (en) * 2014-03-03 2018-01-24 富士通株式会社 Data collection method and data collection apparatus
JP5878218B1 (en) * 2014-10-07 2016-03-08 株式会社マクロミル Advertising evaluation system
JP6822923B2 (en) * 2017-08-24 2021-01-27 ヤフー株式会社 Information processing equipment, information processing methods, and information processing programs
JP2020039825A (en) 2018-09-13 2020-03-19 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカPanasonic Intellectual Property Corporation of America Information processing method and washing machine

Also Published As

Publication number Publication date
JP2021140646A (en) 2021-09-16
DE112021000337T5 (en) 2022-09-15
CN114902196A (en) 2022-08-12

Similar Documents

Publication Publication Date Title
JP6356744B2 (en) Method and system for displaying cross-website information
EP2304619B1 (en) Correlated information recommendation
JP5913722B1 (en) Information processing system and program
US7813965B1 (en) Method, system, and computer readable medium for ranking and displaying a pool of links identified and aggregated from multiple customer reviews pertaining to an item in an electronic catalog
US8224823B1 (en) Browsing history restoration
US9734503B1 (en) Hosted product recommendations
US20200273054A1 (en) Digital receipts economy
US8341101B1 (en) Determining relationships between data items and individuals, and dynamically calculating a metric score based on groups of characteristics
WO2017158798A1 (en) Information processing device, information distribution system, information processing method, and information processing program
CN107562613B (en) Program testing method, device and system
JP2010224873A (en) Commodity retrieval server, commodity retrieval method, program, and recording medium
JP2009193465A (en) Information processor, information providing system, information processing method, and program
KR100987058B1 (en) Method and system for providing advertising service using the keywords of internet contents and program recording medium
US20240037545A1 (en) Systems and methods for associating a user's shopping experiences across multiple channels
US11494788B1 (en) Triggering supplemental channel communications based on data from non-transactional communication sessions
US10606832B2 (en) Search system, search method, and program
US20140372220A1 (en) Social Media Integration for Offer Searching
WO2021181900A1 (en) Target user feature extraction method, target user feature extraction system, and target user feature extraction server
KR20220026255A (en) Recommendation System for Health Supplement by Using Big Data
JP6567688B2 (en) Management device, management method, non-transitory recording medium, and program
US20070276720A1 (en) Indexing of a focused data set through a comparison technique method and apparatus
JP2017091054A (en) Advertising system and advertisement distributing method
KR102429104B1 (en) Product catalog automatic classification system based on artificial intelligence
JP6320353B2 (en) Digital marketing system
US11042896B1 (en) Content influencer scoring system and related methods

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21767069; Country of ref document: EP; Kind code of ref document: A1)
122 EP: PCT application non-entry in European phase (Ref document number: 21767069; Country of ref document: EP; Kind code of ref document: A1)