WO2020037917A1 - User behavior data recommendation method, server and computer readable medium - Google Patents

User behavior data recommendation method, server and computer readable medium

Info

Publication number
WO2020037917A1
WO2020037917A1 PCT/CN2018/123508 CN2018123508W WO2020037917A1 WO 2020037917 A1 WO2020037917 A1 WO 2020037917A1 CN 2018123508 W CN2018123508 W CN 2018123508W WO 2020037917 A1 WO2020037917 A1 WO 2020037917A1
Authority
WO
WIPO (PCT)
Prior art keywords
user behavior
data
recommendation
behavior information
rule base
Prior art date
Application number
PCT/CN2018/123508
Other languages
French (fr)
Chinese (zh)
Inventor
王翼
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020037917A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241Advertisements
    • G06Q30/0251Targeted advertisements
    • G06Q30/0255Targeted advertisements based on user history

Definitions

  • the present application relates to the field of data analysis technology, and in particular, to a method for recommending user behavior data, a server, and a computer-readable medium.
  • the embodiment of the present application provides a method for recommending user behavior data, which can implement accurate real-time recommendation of user behavior data, and make the recommended user behavior data free of dirty data.
  • an embodiment of the present application provides a method for recommending user behavior data.
  • the method includes:
  • an embodiment of the present application provides a server, where the server includes:
  • a data cleaning unit configured to read user behavior information, and perform data cleaning on the user behavior information to generate user behavior data
  • a rule base generating unit configured to perform content recognition processing on the user behavior data to form a rule base, where the rule base is used to classify and store the user behavior data;
  • An extraction unit configured to extract recommendation data from the rule base according to conditions of a front-end recommendation system
  • a recommendation unit configured to push the recommendation data to a front-end recommendation system.
  • an embodiment of the present application provides another server, including a processor, a memory, and a communication module.
  • the memory is used to store program code
  • the processor is used to call the program code to execute the method of the first aspect and any of its optional implementations.
  • an embodiment of the present application provides a computer-readable storage medium.
  • the computer storage medium stores a computer program that includes program instructions which, when executed by a processor, cause the processor to execute the method of the first aspect and any one of its optional implementations.
  • In the embodiments of the present application, user behavior information is obtained in real time and subjected to data cleaning to obtain user behavior data free of dirty data; content identification is then performed on the user behavior data, which is classified and sorted into the rule base; finally, the recommendation data required by the front-end recommendation system is extracted from the rule base according to the conditions of the front-end recommendation system and pushed to the corresponding recommendation system, so that the front-end recommendation system obtains the recommendation data it requires, free of dirty data, and can make accurate, real-time recommendations.
  • FIG. 1 is a schematic flowchart of a user behavior data recommendation method according to an embodiment of the present application
  • FIG. 2 is a schematic flowchart of another user behavior data recommendation method according to an embodiment of the present application.
  • FIG. 3 is a schematic block diagram of a server according to an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a server according to an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a method for recommending user behavior data according to an embodiment of the present application. As shown in the figure, the method may include:
  • the above-mentioned user behavior is an event consisting of time, place, person, interaction, and interactive content.
  • For example, a user search is an event: at what time, on which platform, from which Internet Protocol (IP) address, who performed the search, and what was searched for.
  • This is a complete event and a definition of user behavior; tens of thousands of such events can be defined in a website or an application (APP). With such events, user behaviors can be connected and observed.
  • For example, a user who enters the website for the first time is a new user and may want to register, so the registration behavior is also an event. Registration requires filling in personal information, after which the user may start searching for items to purchase; all of this is user behavior information.
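  • The event definition above (time, place, person, interaction, interactive content) can be sketched as a small record type. This is only an illustration: the class name `BehaviorEvent`, its field names, and the sample values are assumptions and do not appear in the application.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class BehaviorEvent:
    """One user behavior event: time, place, person, interaction, content."""
    timestamp: datetime  # time: when the behavior happened
    platform: str        # place: the website or app where it happened
    ip_address: str      # place: the client's IP address
    user_id: str         # person: who acted
    interaction: str     # interaction type, e.g. "search" or "register"
    content: str         # interactive content, e.g. the search keywords


# Hypothetical search event for illustration.
search_event = BehaviorEvent(
    timestamp=datetime(2018, 12, 25, 9, 30, tzinfo=timezone.utc),
    platform="web",
    ip_address="203.0.113.7",
    user_id="user-a",
    interaction="search",
    content="science fiction movies",
)
```

  • A registration or purchase would be another `BehaviorEvent` with a different `interaction` value, which is what allows a sequence of events to be connected into one user's behavior.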
  • the user behavior data is obtained by reading the user behavior information of the user in real time, and then performing data cleaning processing on the read user behavior information.
  • the above-mentioned data cleaning processing on the read user behavior information may include operations such as error data cleaning, missing value data cleaning, repeated value data cleaning, or inconsistent data cleaning on the user behavior information.
  • The data acquisition component spout of the Storm framework, a real-time processing framework for data streams, is used to pull the user behavior information data; the spout then distributes the pulled data to the data processor bolt in Storm according to preset rules, so that various kinds of processing, such as filtering and cleaning, can be performed on it. The user behavior information pulled by the spout may contain a lot of dirty data, for example duplicate data, erroneous data, and incomplete data. Therefore, after the user behavior information data is obtained, it is first subjected to a cleaning process.
  • the foregoing cleaning of user behavior information data includes cleaning of erroneous data, cleaning of missing values, cleaning of duplicate values, and cleaning of inconsistent data. After the user behavior information data is cleaned, the cleaned user behavior information data is passed to the next bolt component to continue processing.
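  • The four cleaning operations named above (erroneous data, missing values, duplicate values, inconsistent data) might look roughly like the following plain-Python sketch. The record schema (`user`, `behavior`, `url`, `timestamp`), the set of valid behaviors, and the specific rules are assumptions chosen for illustration; the application does not specify them.

```python
REQUIRED_FIELDS = ("user", "behavior", "url", "timestamp")  # assumed schema
KNOWN_BEHAVIORS = {"browse", "click", "input", "search"}    # assumed valid values


def clean(records):
    """Drop missing, erroneous and duplicate records; normalise inconsistent ones."""
    seen = set()
    cleaned = []
    for rec in records:
        # Missing values: skip records that lack any required field.
        if any(rec.get(name) in (None, "") for name in REQUIRED_FIELDS):
            continue
        # Erroneous data: skip unknown behaviors and non-positive timestamps.
        behavior = str(rec["behavior"]).strip().lower()
        if behavior not in KNOWN_BEHAVIORS:
            continue
        if not isinstance(rec["timestamp"], (int, float)) or rec["timestamp"] <= 0:
            continue
        # Inconsistent data: normalise case and whitespace so equal values match.
        normalised = {**rec, "behavior": behavior, "url": str(rec["url"]).strip()}
        # Duplicate values: keep only the first occurrence of each record.
        key = tuple(normalised[name] for name in REQUIRED_FIELDS)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(normalised)
    return cleaned
```

  • In a Storm deployment this logic would sit inside a bolt; it is written here as an ordinary function so that it can be run without a cluster.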
  • The above Storm is a free, open-source, distributed, highly fault-tolerant real-time processing framework.
  • Storm supports creating topologies to transform unbounded data streams (streams without end points).
  • Storm is often used in the fields of real-time analysis, online machine learning, continuous computing, distributed remote calls, and so on.
  • the above spout is a component that generates a source data stream in storm. Generally, spout reads data from an external data source and then converts it into internal source data.
  • the above bolt is a component that performs data processing in Storm, and can perform any operation such as filtering, function operations, merging, and writing to a database.
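  • The spout and bolt roles described above can be illustrated with a minimal stand-in written in plain Python. It deliberately does not use Storm's API: the classes `ListSpout`, `MapBolt`, `FilterBolt` and the function `run_topology` are hypothetical names, and a real topology would run distributed rather than in a single process.

```python
class ListSpout:
    """Spout stand-in: reads from an external source and emits a stream."""

    def __init__(self, source):
        self._source = list(source)

    def emit(self):
        yield from self._source


class MapBolt:
    """Bolt stand-in: applies a function to every item in the stream."""

    def __init__(self, func):
        self._func = func

    def process(self, stream):
        return (self._func(item) for item in stream)


class FilterBolt:
    """Bolt stand-in: keeps only the items that satisfy a predicate."""

    def __init__(self, predicate):
        self._predicate = predicate

    def process(self, stream):
        return (item for item in stream if self._predicate(item))


def run_topology(spout, bolts):
    """Chain the spout through each bolt in order and collect the output."""
    stream = spout.emit()
    for bolt in bolts:
        stream = bolt.process(stream)
    return list(stream)


# Strip whitespace from each path, then drop empty paths.
result = run_topology(
    ListSpout(["  /home ", "", "/movie/1"]),
    [MapBolt(str.strip), FilterBolt(bool)],
)
```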
  • the user behavior needs to be tracked, that is, the user behavior information of the user is collected.
  • the collection of user behavior of the user may include collecting user behavior information based on a web server log or a client.
  • the method of collecting user behavior information based on the Web server logs is relatively common.
  • the log file is automatically generated by the web server, and the cost is small. It is relatively easy to develop a data analysis tool based on the log files.
  • The client-based method directly obtains the behavior data of the user's interaction with the website from the client. Collecting user behavior information from the client reduces human interference, yields more truthful and accurate data, overcomes the shortcomings of server-side collection, and reduces server consumption.
  • the user behavior information is sent to a message queue, so that the real-time processing framework Storm pulls the user behavior information from the message queue.
  • A Tracker system is a professional, intelligent database management system for tooling, tools, fixtures and measuring tools. It can manage the overall process for tooling, tools, fixtures and measuring tools in an enterprise's production process, and track in real time their purchasing, loading and unloading, repair, scrapping, and calibration, helping warehouse managers, craftsmen, manufacturing engineers, and tooling and tool holder gauge supervisors improve the tool management process more effectively and reduce production costs.
  • a Tracker system may be adopted to collect the foregoing user behavior information.
  • When a website or app reaches a certain number of users, a tracker system is generally required to collect user behavior (such as user IP address, page source, city name, browser version, button location, etc.), page access performance, and abnormal errors, and then report them to the log server according to a certain policy. Development teams such as search, recommendation, and advertising analyze these logs to adjust and develop various functions; product managers, senior management, and others use these logs to optimize operations and make correct decisions in a timely manner.
  • the Tracker system plays an important role in a mature application. With the development of the business, the real-time requirements for it are getting higher and higher.
  • the Tracker system supports automatic dotted fields, automatic extension fields, and so on.
  • the Tracker system's Application Programming Interface is embedded in the event of each page of the website or application.
  • A policy is set for sending the collected data to the log server, from which it is then synchronized to the message queue, which serves as a data buffer.
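  • The flow just described (tracker embedded in page events, reporting to a log server under some policy, then synchronizing to a message queue used as a buffer) can be sketched as follows. The classes `Tracker` and `LogServer`, the batch-size policy, and the use of Python's in-process `queue.Queue` in place of a real message broker are all assumptions made for illustration.

```python
import queue


class LogServer:
    """Hypothetical log server that forwards received events to a message queue."""

    def __init__(self, message_queue):
        self._queue = message_queue

    def receive(self, events):
        for event in events:
            self._queue.put(event)


class Tracker:
    """Hypothetical tracker that buffers page events and reports them in batches."""

    def __init__(self, log_server, batch_size=2):
        self._log_server = log_server
        self._batch_size = batch_size
        self._pending = []

    def track(self, event):
        self._pending.append(event)
        # Assumed reporting policy: send once batch_size events have accumulated.
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self):
        if self._pending:
            self._log_server.receive(self._pending)
            self._pending = []


buffer_queue = queue.Queue()  # stands in for the message queue used as a buffer
tracker = Tracker(LogServer(buffer_queue), batch_size=2)
tracker.track({"page": "/home", "event": "click"})
queued_after_one = buffer_queue.qsize()
tracker.track({"page": "/search", "event": "input"})
queued_after_two = buffer_queue.qsize()
```

  • The queue decouples the rate at which events are reported from the rate at which the downstream spout consumes them, which is the buffering role the text assigns to it.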
  • HBase is a distributed, column-oriented open source database.
  • MySQL is a relational database.
  • User access will continuously generate data, which is either stored locally and sent to related applications when needed, or stored in a unified central storage area.
  • The generated data is captured, filtered, and processed between applications by the spout in Storm (for example, protocol analysis, format analysis, and data verification), and then sent to a bolt for data analysis.
  • The resulting usable data is stored in a persistent medium (such as a DB) for other applications to obtain.
  • After the user behavior information is subjected to cleaning processing, content recognition processing is performed on the cleaned user behavior information, so as to classify it according to its specific content and a preset rule; the user behavior information is then stored in a database, according to that classification, to form a rule base.
  • Classifying the user behavior information according to its specific content and the preset rules may include: identifying the cleaned user behavior information, classifying it as browsing behavior, click behavior, input behavior, or search behavior, and then sending it to the corresponding bolt for processing (such as structured processing). After the corresponding next-layer bolt processes the user behavior information, the processed user behavior information is persistently stored in a database to form a rule base.
  • the browsing behavior is that user A browses a movie-related webpage.
  • After the content of the browsing behavior information is identified, it is sent to the corresponding bolt that handles browsing behavior.
  • That bolt structures the information according to the content of the browsing behavior. Specifically, the browsing behavior is structured in the format: user, URL, theme, category, author, director, starring, release year, and is then stored in the above rule base.
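  • The classification and structuring step can be sketched as a dispatch table keyed by behavior type. The browsing fields follow the format given above (user, URL, theme, category, author, director, starring, release year); the function names, the dictionary-based rule base, and the pass-through handling of click, input and search behavior are assumptions, since the application only details the browsing case.

```python
BROWSE_FIELDS = (
    "user", "url", "theme", "category",
    "author", "director", "starring", "release_year",
)


def structure_browse(record):
    """Structure a browsing record into the fixed field order listed above."""
    return {name: record.get(name) for name in BROWSE_FIELDS}


def passthrough(record):
    """Placeholder handler for behavior types whose format is not specified."""
    return dict(record)


# One handler per behavior type, standing in for one bolt per type.
HANDLERS = {
    "browse": structure_browse,
    "click": passthrough,
    "input": passthrough,
    "search": passthrough,
}


def build_rule_base(cleaned_records):
    """Classify each record by behavior type and store its structured form."""
    rule_base = {behavior: [] for behavior in HANDLERS}
    for record in cleaned_records:
        handler = HANDLERS.get(record.get("behavior"))
        if handler is None:
            continue  # unrecognised behavior types are not stored
        rule_base[record["behavior"]].append(handler(record))
    return rule_base
```

  • For the example of user A browsing a movie-related page, the stored entry would carry the movie's director, starring and release year alongside the user and URL, with any field that could not be recognised left as `None`.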
  • A trajectory enhancement algorithm is used to extract eligible data from the rule base according to the conditions of the different front-end recommendation systems, and the extracted data is then pushed to the corresponding front-end recommendation system.
  • The function of the above-mentioned trajectory enhancement algorithm is to extract a large amount of data from the rule base, including the accessed Uniform Resource Locator (URL), the access traffic, and the keyword information of the visit, and to process it. Then, according to the conditions of the front-end recommendation system, for example, identical URLs are merged together, the upstream and downstream traffic of each URL is merged, and the URLs are sorted by total traffic; the URLs accounting for the first 80% of the total traffic are taken, because these are the URLs that users often visit.
  • the data in the rule base is further processed and saved to the database.
  • the front-end system recommends these data to the user from the database.
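  • The URL-selection condition described above (merge identical URLs, sum upstream and downstream traffic, sort by total, keep the URLs making up the first 80% of traffic) can be sketched as follows. The function name, the record keys `url`, `up` and `down`, and the boundary rule (the URL that crosses the threshold is included) are assumptions; the application does not define the trajectory enhancement algorithm beyond this description.

```python
from collections import defaultdict


def top_traffic_urls(visits, share=0.8):
    """Return the most-visited URLs that together cover `share` of all traffic."""
    # Merge identical URLs and combine upstream and downstream traffic.
    totals = defaultdict(int)
    for visit in visits:
        totals[visit["url"]] += visit["up"] + visit["down"]

    # Sort URLs by their summed traffic, highest first.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

    # Take URLs until the running total reaches the requested share.
    threshold = share * sum(totals.values())
    selected, running = [], 0
    for url, traffic in ranked:
        if running >= threshold:
            break
        selected.append(url)
        running += traffic
    return selected
```

  • With the default `share=0.8`, low-traffic URLs in the long tail are dropped, matching the stated rationale that the URLs carrying 80% of the traffic are the ones users visit often.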
  • The already extracted data is stored in the historical behavior trajectory database, and at the same time the extracted data is deleted from the rule base to save the storage space of the rule base.
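  • The push, archive and delete sequence can be sketched with in-memory lists standing in for the rule base and the historical behavior trajectory database. The function `push_and_archive` and its parameters are hypothetical; a real system would perform these steps against persistent storage, ideally so that a failed push does not lose data.

```python
def push_and_archive(rule_base, history, is_eligible, push):
    """Push eligible records, archive them, and remove them from the rule base."""
    extracted = [record for record in rule_base if is_eligible(record)]
    push(extracted)            # deliver to the front-end recommendation system
    history.extend(extracted)  # keep a copy in the historical trajectory store
    # Delete the extracted records in place to free rule base storage.
    rule_base[:] = [record for record in rule_base if not is_eligible(record)]
    return extracted
```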
  • In this embodiment, user behavior information is obtained in real time and subjected to data cleaning to obtain user behavior data free of dirty data; content identification is then performed on the user behavior data, which is classified and sorted into the rule base; finally, according to the conditions of the front-end recommendation system, the recommendation data required by the front-end recommendation system is extracted from the rule base and pushed to the corresponding recommendation system, so that the front-end recommendation system obtains the recommendation data it requires, free of dirty data, and can make accurate, real-time recommendations.
  • FIG. 2 is a schematic flowchart of another user behavior data recommendation method according to an embodiment of the present application. As shown in the figure, the method may include:
  • 201 Obtain user behavior information of a user from a web log file or from a user terminal.
  • The above user behavior information is sent to a message queue serving as a buffer area, for example a message queue such as Kafka or MetaQ, so that the spout of the Storm framework can subsequently pull it from the message queue.
  • Kafka is a high-throughput distributed publish-subscribe messaging system that can process all action stream data in consumer-scale websites.
  • MetaQ is a complete queue-model message middleware whose server is written in Java; it can be deployed on multiple software and hardware platforms.
  • the data acquisition component spout of the storm real-time processing framework reads the user behavior information from the message queue, and then distributes the user behavior information to the processing component bolt.
  • the user behavior information is distributed to different bolts for processing according to the type of the obtained user behavior information.
  • the processing component bolt of the real-time processing framework performs error data cleaning, missing value data cleaning, repeated value data cleaning, or inconsistent data cleaning on the user behavior information to obtain user behavior data.
  • the user behavior information data is first subjected to a cleaning process.
  • the foregoing cleaning of user behavior information data includes cleaning of erroneous data, cleaning of missing values, cleaning of duplicate values, and cleaning of inconsistent data. After the above user behavior information data is cleaned, the cleaned user behavior information data is passed to the next bolt to continue processing.
  • the user behavior data is classified according to browsing behavior, click behavior, input behavior, or search behavior.
  • Classifying the user behavior information according to its specific content and the preset rules may include: identifying the cleaned user behavior information, classifying it as browsing behavior, click behavior, input behavior, or search behavior, and then sending it to the corresponding bolt for processing (such as structured processing).
  • the classified user behavior data is structured and stored to form a rule base.
  • the processed user behavior information is persistently stored in a database to form a rule base.
  • the browsing behavior is that user A browses a movie-related webpage.
  • After the content of the browsing behavior information is identified, it is sent to the corresponding bolt that handles browsing behavior.
  • That bolt structures the information according to the content of the browsing behavior. Specifically, the browsing behavior is structured in the format: user, URL, theme, category, author, director, starring, release year, and is then stored in the above rule base.
  • 207 Use a trajectory enhancement algorithm to extract qualified recommendation data from the rule base according to the conditions of different front-end recommendation systems, and push the recommendation data to the corresponding front-end recommendation system.
  • A trajectory enhancement algorithm is used to extract eligible data from the rule base according to the conditions of the different front-end recommendation systems, and the extracted data is then pushed to the corresponding front-end recommendation system.
  • The function of the above trajectory enhancement algorithm is to extract a large amount of data from the rule base, including the accessed URL, the access traffic, and the keyword information of the access, and to process it. Then, based on certain conditions, for example, identical URLs are merged together, the upstream and downstream traffic of each URL is merged, and the URLs are sorted by total traffic; the URLs accounting for the first 80% of the total traffic are taken, because these are the URLs that users often visit. In this way the data in the rule base is further processed and saved to the database, and the front-end system recommends these data to the user from the database.
  • The previously recommended user behavior data may later be needed for analyzing and processing the user's historical behavior. Therefore, after the user behavior data is recommended to the corresponding front-end recommendation system, the recommended data is stored in the historical behavior trajectory database, so that it can subsequently be obtained from the historical behavior trajectory database for analysis and processing.
  • the extracted user behavior data is deleted from the above rule base.
  • The embodiment of the present application collects user behavior information of a user through a web log file or a user terminal, sends the user behavior information to a message queue, then reads the user behavior information from the message queue and performs data cleaning on it to generate user behavior data, so as to delete duplicate data and erroneous data and to complete incomplete data in the user behavior information; it then performs content identification, classification and sorting on the user behavior data to form a rule base, extracts the qualified recommendation data from the rule base, and pushes the recommendation data to the front-end recommendation system.
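  • The overall flow of this embodiment (message queue, cleaning, classification into a rule base, condition-based extraction) can be condensed into one short sketch. It is a single-process illustration only: `queue.Queue` stands in for Kafka or MetaQ, the cleaning rules are reduced to dropping incomplete and duplicate records, and `run_pipeline`, its parameters, and the record keys are assumed names.

```python
import queue


def run_pipeline(message_queue, condition):
    """Drain the queue, clean, classify, and extract records matching `condition`."""
    # Spout stand-in: read all buffered user behavior information.
    raw = []
    while not message_queue.empty():
        raw.append(message_queue.get())

    # Cleaning: drop incomplete records, then drop duplicates.
    seen, cleaned = set(), []
    for record in raw:
        if not record.get("user") or not record.get("behavior"):
            continue
        key = (record["user"], record["behavior"], record.get("url"))
        if key not in seen:
            seen.add(key)
            cleaned.append(record)

    # Rule base: group the cleaned records by behavior type.
    rule_base = {}
    for record in cleaned:
        rule_base.setdefault(record["behavior"], []).append(record)

    # Extraction: keep the records that satisfy the front-end system's condition.
    return [rec for recs in rule_base.values() for rec in recs if condition(rec)]
```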
  • FIG. 3 is a schematic block diagram of a server provided by an embodiment of the present application.
  • the server in this embodiment includes a data cleaning unit 310, a rule base generation unit 320, an extraction unit 330, and a recommendation unit 340.
  • the data cleaning unit 310 is configured to read user behavior information, and perform data cleaning on the user behavior information to generate user behavior data.
  • a rule base generating unit 320 is configured to perform content recognition processing on the user behavior data to form a rule base, and the rule base is used to classify and store the user behavior data;
  • An extraction unit 330 configured to extract qualified recommendation data from the above rule base according to the conditions of the front-end recommendation system
  • the recommendation unit 340 is configured to push the above recommendation data to a front-end recommendation system.
  • The embodiment of the present application generates user behavior data by reading the user behavior information and performing data cleaning on it, so as to delete duplicate data and erroneous data and to complete incomplete data in the user behavior information; it then performs content recognition processing on the user behavior data to form a rule base, extracts qualified recommendation data from the rule base, and pushes the recommendation data to a front-end recommendation system.
  • In this way, when user behavior data is recommended, real-time and accurate recommendation can be achieved, and the recommended user behavior data does not contain dirty data.
  • the server further includes:
  • An obtaining unit 350 configured to obtain user behavior information from a web log file, or directly obtain the user behavior information from a user terminal;
  • the sending unit 360 is configured to send the user behavior information to a message queue.
  • the data cleaning includes: erroneous data cleaning, missing value data cleaning, repeated value data cleaning, or inconsistent data cleaning.
  • the data cleaning unit 310 includes:
  • the reading unit 311 is configured to read the user behavior information from the message queue through the data acquisition component spout of the storm real-time processing framework, and then distribute the user behavior information to the processing component bolt;
  • the cleaning unit 313 is configured to use the processing component bolt of the real-time processing framework to perform error data cleaning, missing value data cleaning, repeated value data cleaning, or inconsistent data cleaning on the user behavior information to obtain user behavior data.
  • the reading unit is configured to pull the user behavior information from the message queue through a spout component of the Storm framework;
  • the distribution unit is configured to distribute the user behavior information to a Bolt component of a Storm framework.
  • the rule base generating unit 320 includes:
  • a classification unit 321, configured to classify the user behavior data according to a browsing behavior, a click behavior, an input behavior, or a search behavior after content recognition processing is performed on the user behavior data;
  • the first storage unit 323 stores the user behavior data that is structured and processed to form the rule base.
  • the above-mentioned extraction unit 330 is configured to use a trajectory enhancement algorithm to extract the recommendation data that meets the conditions from the rule base according to the conditions of different front-end recommendation systems;
  • the recommendation unit 340 is configured to push the recommendation data to a corresponding front-end recommendation system.
  • The conditions of the front-end recommendation system may include merging identical URLs together, merging the upstream and downstream traffic of each URL, sorting the URLs by total traffic, and taking the URLs that account for the first preset percentage of the total traffic.
  • the server further includes:
  • a second storage unit 370 configured to store the above-mentioned recommendation data in a historical behavior trajectory database
  • the deleting unit 380 is configured to delete the recommended data in the rule base.
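  • The unit structure of the server can be illustrated by composing the units as pluggable callables. The class `RecommendationServer` and its attribute names are hypothetical, and the reference numerals in the comments only indicate which unit each part loosely corresponds to; the application does not prescribe this decomposition in code.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RecommendationServer:
    clean: Callable[[List[dict]], List[dict]]            # data cleaning unit 310
    build_rule_base: Callable[[List[dict]], List[dict]]  # rule base generating unit 320
    extract: Callable[[List[dict]], List[dict]]          # extraction unit 330
    push: Callable[[List[dict]], None]                   # recommendation unit 340
    history: List[dict] = field(default_factory=list)    # store of second storage unit 370

    def handle(self, behavior_info):
        """Run one pass and return what is left in the rule base afterwards."""
        rule_base = self.build_rule_base(self.clean(behavior_info))
        recommended = self.extract(rule_base)
        self.push(recommended)
        self.history.extend(recommended)  # second storage unit 370
        # Deleting unit 380: remove the recommended data from the rule base.
        return [record for record in rule_base if record not in recommended]
```

  • Keeping each unit behind a plain callable mirrors the text's remark that the division into units is only logical; units could be merged or replaced without changing `handle`.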
  • The embodiment of the present application collects user behavior information of a user through a web log file or a user terminal, sends the user behavior information to a message queue, then reads the user behavior information from the message queue and performs data cleaning on it to generate user behavior data, so as to delete duplicate data and erroneous data and to complete incomplete data in the user behavior information; it then performs content identification, classification and sorting on the user behavior data to form a rule base, extracts the qualified recommendation data from the rule base, and pushes the recommendation data to the front-end recommendation system.
  • FIG. 4 shows a device provided by an embodiment of the present application.
  • the device may be a server.
  • the device includes: one or more processors 401, one or more input devices 402, one or more output devices 403, and a memory 404.
  • the processor 401, the input device 402, the output device 403, and the memory 404 are connected through a bus 405.
  • the memory 404 is configured to store instructions, and the processor 401 is configured to execute the instructions stored in the memory 404.
  • the processor 401 is configured to: read user behavior information, and perform data cleaning on the user behavior information to generate user behavior data; perform content recognition processing on the user behavior data to form a rule base, where the rule base is used to classify and store the user behavior data; and extract recommendation data from the rule base according to the conditions of the front-end recommendation system, and push the recommendation data to the front-end recommendation system.
  • the processor 401 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), or the like.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the input device 402 may include a touchpad, a fingerprint sensor (for collecting user's fingerprint information and fingerprint orientation information), a microphone, etc.
  • the output device 403 may include a display (for example, a Liquid Crystal Display (LCD)), speakers, etc.
  • the memory 404 may include a read-only memory and a random access memory, and provide instructions and data to the processor 401. A portion of the memory 404 may also include non-volatile random access memory. For example, the memory 404 may also store device type information.
  • the processor 401, the input device 402, and the output device 403 described in the embodiments of the present application may execute the implementation described in the first embodiment of the user behavior data recommendation method provided in the embodiments of the present application, as well as the implementations in the second and third embodiments, and may also execute the server implementation described in the embodiments of the present application; details are not described herein again.
  • a computer-readable storage medium stores a computer program which, when executed by a processor, implements: reading user behavior information, and performing data cleaning on the user behavior information to generate user behavior data; performing content recognition processing on the user behavior data to form a rule base, where the rule base is used to classify and store the user behavior data; and extracting recommendation data from the rule base according to the conditions of the front-end recommendation system, and pushing the recommendation data to the front-end recommendation system.
  • the computer-readable storage medium may be an internal storage unit of the terminal described in any one of the foregoing embodiments, such as a hard disk or a memory of the terminal.
  • the computer-readable storage medium may also be an external storage device of the terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the terminal.
  • the computer-readable storage medium may further include both an internal storage unit of the terminal and an external storage device.
  • the computer-readable storage medium is used to store the computer program and other programs and data required by the terminal.
  • the computer-readable storage medium described above may also be used to temporarily store data that has been or will be output.
  • FIG. 5 is a schematic diagram of a server structure provided by an embodiment of the present application.
  • the server 500 may vary greatly depending on its configuration or performance, and may include one or more central processing units (CPUs) 522 (for example, one or more processors), a memory 532, and one or more storage media 530 (for example, one or more storage devices) storing application programs 542 or data 544.
  • the memory 532 and the storage medium 530 may be temporary storage or persistent storage.
  • the program stored in the storage medium 530 may include one or more modules (not shown), and each module may include a series of instruction operations on the server.
  • the central processing unit 522 may be configured to communicate with the storage medium 530, and execute a series of instruction operations in the storage medium 530 on the server 500.
  • the server 500 may also include one or more power sources 526, one or more wired or wireless network interfaces 550, one or more input-output interfaces 558, and/or one or more operating systems 541, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™ and more.
  • the steps performed by the server in the above embodiment may be based on the server structure shown in FIG. 5.
  • the disclosed systems, servers, and methods may be implemented in other ways.
  • the device embodiments described above are merely schematic.
  • the division of the above units is only a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may also be electrical, mechanical or other forms of connection.
  • the units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, which may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions in the embodiments of the present application.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above integrated unit may be implemented in the form of hardware or in the form of software functional unit.
  • the technical solution of this application, in essence, or the part of it that contributes to the existing technology, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium
  • a computer device which may be a personal computer, a server, or a network device, etc.
  • the foregoing storage media include: a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Disclosed in the embodiments of the present application are a user behavior data recommendation method, a server and a computer readable medium, relating to the analysis and sorting of user behavior data and implementing intelligent recommendation of user behavior data. The method comprises: reading user behavior information, and performing data cleaning on the user behavior information, so as to generate user behavior data; performing content recognition processing on the user behavior data to form a rule library, the rule library being configured to classify and store the user behavior data; extracting recommendation data from the rule library according to conditions of a front-end recommendation system, and pushing the recommendation data to the front-end recommendation system. The embodiments of the present application enable real-time and accurate recommendation when recommending user behavior data, so that no dirty data exists in the recommended user behavior data.

Description

User behavior data recommendation method, server and computer-readable medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on August 22, 2018, with application number 2018109655825 and entitled "User behavior data recommendation method, server and computer-readable medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of data analysis technology, and in particular to a user behavior data recommendation method, a server, and a computer-readable medium.
Background
With the rapid development of information technology, the volume of information has grown explosively, the ways in which people obtain information have become more diverse and convenient, and the demand for timely information keeps rising. For example, a user who bought a piece of clothing on Taobao yesterday and wants to buy a pair of swimming goggles today may find that the system keeps recommending trousers and clothes while ignoring today's search for goggles. This happens because the system bases its recommendations on the user's behavior track from yesterday. In other words, existing recommendation algorithms capture the user's historical track records and analyze them algorithmically, so the user's trend can only be derived after T+1 days, which makes the recommendations inaccurate.
Summary of the Invention
The embodiments of the present application provide a user behavior data recommendation method that enables accurate, real-time recommendation of user behavior data and keeps the recommended user behavior data free of dirty data.
In a first aspect, an embodiment of the present application provides a user behavior data recommendation method, the method including:
reading user behavior information, and performing data cleaning on the user behavior information to generate user behavior data;
performing content recognition processing on the user behavior data to form a rule base, where the rule base is used to classify and store the user behavior data;
extracting qualifying recommendation data from the rule base according to the conditions of a front-end recommendation system, and pushing the recommendation data to the front-end recommendation system.
In a second aspect, an embodiment of the present application provides a server, the server including:
a data cleaning unit, configured to read user behavior information and perform data cleaning on the user behavior information to generate user behavior data;
a rule base generating unit, configured to perform content recognition processing on the user behavior data to form a rule base, where the rule base is used to classify and store the user behavior data;
an extraction unit, configured to extract recommendation data from the rule base according to the conditions of a front-end recommendation system;
a recommendation unit, configured to push the recommendation data to the front-end recommendation system.
In a third aspect, an embodiment of the present application provides another server, including a processor, a memory, and a communication module, where the memory is used to store program code and the processor is used to call the program code to execute the method of the first aspect or any of its optional implementations.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program, where the computer program includes program instructions that, when executed by a processor, cause the processor to execute the method of the first aspect or any of its optional implementations.
In the embodiments of the present application, user behavior information is obtained in real time and cleaned to yield user behavior data free of dirty data. Content recognition is then performed on the user behavior data, which is sorted and classified to obtain a rule base. Finally, the recommendation data required by a front-end recommendation system is extracted from the rule base according to that system's conditions and pushed to the corresponding recommendation system, so that the front-end recommendation system receives the recommendation data it needs, free of dirty data, and can make accurate recommendations in real time.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below.
FIG. 1 is a schematic flowchart of a user behavior data recommendation method provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of another user behavior data recommendation method provided by an embodiment of the present application;
FIG. 3 is a schematic block diagram of a server provided by an embodiment of the present application;
FIG. 4 shows a device provided by an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a server provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
It should be understood that, when used in this specification and the appended claims, the terms "including" and "comprising" indicate the presence of the described features, wholes, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components, and/or sets thereof.
It should also be understood that the terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the present application. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" used in this specification and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.
Refer to FIG. 1, which is a schematic flowchart of a user behavior data recommendation method provided by an embodiment of the present application. As shown in the figure, the method may include:
101: Read user behavior information, and perform data cleaning on the user behavior information to generate user behavior data.
In the embodiments of the present application, a user behavior is an event composed of five elements: time, place, person, interaction, and interaction content. A user search, for example, is an event: at what time, on which platform, and from which Internet Protocol (IP) address a search was performed, and what was searched for. This is a complete event and also a definition of user behavior; countless such events can be defined in a website or an application (APP). Once such events exist, user behaviors can be linked together and observed. A user who enters the website for the first time is a new user and may register, so the registration behavior is also an event. Registration requires filling in personal information, after which the user may start searching and buying things. All of this is user behavior information.
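The five-element event described above can be sketched as a small data type. This is only an illustration: the class name `BehaviorEvent`, its field encodings, the helper `session_of`, and the sample addresses are assumptions and do not come from the application.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BehaviorEvent:
    """One user behavior event with the five elements named in the text."""
    time: datetime     # when the event happened
    place: str         # platform or site where it happened (assumed encoding)
    person: str        # who acted, here identified by IP address (assumed)
    interaction: str   # kind of interaction: "search", "register", "click", ...
    content: str       # what was searched, entered, or clicked


def session_of(events, person):
    """Link one person's events in time order so they can be observed together."""
    return sorted((e for e in events if e.person == person), key=lambda e: e.time)


events = [
    BehaviorEvent(datetime(2018, 8, 22, 10, 5), "web", "203.0.113.7", "search", "swimming goggles"),
    BehaviorEvent(datetime(2018, 8, 22, 10, 0), "web", "203.0.113.7", "register", "new account"),
    BehaviorEvent(datetime(2018, 8, 22, 10, 1), "app", "198.51.100.2", "click", "banner"),
]
timeline = [e.interaction for e in session_of(events, "203.0.113.7")]
```

Sorting by time is what links the registration and the later search into one observable sequence for the same user.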
User behavior information needs to be analyzed and processed in real time in order to obtain useful user behavior data and to make corresponding recommendations to the user based on that data. In the embodiments of the present application, the user's behavior information is read in real time, and data cleaning is then performed on it to obtain user behavior data. The data cleaning may include operations such as erroneous data cleaning, missing value cleaning, duplicate value cleaning, or inconsistent data cleaning.
Specifically, the user behavior information is pulled mainly by the spout, the data acquisition component of Storm, a real-time data stream processing framework. The spout then distributes the pulled user behavior information, according to preset rules, to bolts, the data processing components of Storm, which perform various kinds of processing on it, such as filtering and cleaning. The user behavior information pulled by the spout may contain a lot of dirty data, for example duplicate data, erroneous data, and incomplete data. Therefore, once the user behavior information has been obtained, it must first be cleaned. Specifically, the cleaning includes erroneous data cleaning, missing value cleaning, duplicate value cleaning, and inconsistent data cleaning. After cleaning, the cleaned user behavior information is passed to the next bolt component for further processing.
Storm is a free, open-source, distributed, highly fault-tolerant real-time processing framework. Storm supports creating topologies to transform unbounded data streams, and is often used for real-time analysis, online machine learning, continuous computation, distributed remote calls, and similar tasks. A spout is the Storm component that produces the source data stream; typically a spout reads data from an external data source and converts it into internal source data. A bolt is the Storm component that performs data processing, and it can carry out any operation such as filtering, applying functions, merging, and writing to a database.
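The four cleaning operations named above can be sketched as a single function of the kind a cleaning bolt might apply. This is a plain-Python sketch, not Storm code: the record fields (`user`, `action`, `url`, `time`), the allowed action vocabulary, and the specific rules for what counts as erroneous or inconsistent are all assumptions made for illustration.

```python
def clean(records, required=("user", "action", "url")):
    """Drop or normalize dirty records: missing, inconsistent, erroneous, duplicate."""
    allowed_actions = {"browse", "click", "input", "search"}  # assumed vocabulary
    seen = set()
    cleaned = []
    for rec in records:
        # Missing-value cleaning: drop records lacking a required field.
        if any(not rec.get(field) for field in required):
            continue
        # Inconsistency cleaning: normalize case and surrounding whitespace.
        rec = {**rec, "action": rec["action"].strip().lower(), "url": rec["url"].strip()}
        # Erroneous-data cleaning: drop records whose action is not recognized.
        if rec["action"] not in allowed_actions:
            continue
        # Duplicate cleaning: keep only the first copy of each record.
        key = (rec["user"], rec["action"], rec["url"], rec.get("time"))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


raw = [
    {"user": "A", "action": "Browse ", "url": "https://example.com/movie", "time": 1},
    {"user": "A", "action": "browse", "url": "https://example.com/movie", "time": 1},  # duplicate
    {"user": "B", "action": "search", "url": "", "time": 2},                           # missing url
    {"user": "C", "action": "teleport", "url": "https://example.com/", "time": 3},     # erroneous
    {"user": "D", "action": "click", "url": "https://example.com/buy", "time": 4},
]
cleaned = clean(raw)
```

Normalizing before the duplicate check matters here: the first two records only become recognizable as duplicates once `"Browse "` has been normalized to `"browse"`.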
As an optional implementation, before the user behavior information is read, user behavior needs to be tracked, that is, the user's behavior information needs to be collected. The collection may be based on Web server logs or on the client. Collecting user behavior information from Web server logs is the more common approach: log files are generated automatically by the Web server, the cost is low, and developing data analysis tools based on log files is relatively easy. Client-side collection means using certain technical methods to obtain behavior data about the user's interaction with the website directly from the client. Collecting from the client reduces human interference, yields more authentic and accurate data, overcomes the shortcomings of server-side collection, and reduces the load on the server. After the user behavior information is collected, it is sent to a message queue so that the real-time processing framework Storm can pull it from the message queue.
The Tracker system is a specialized intelligent database management system for tooling, cutting tools, fixtures, and measuring tools. It provides end-to-end process management of the tooling, cutting tools, fixtures, and measuring tools used in an enterprise's production process. By tracking in real time their procurement, warehouse entry and exit, regrinding, scrapping, calibration, and other processes, it helps warehouse keepers, process engineers, manufacturing engineers, and tooling supervisors improve tool management more effectively and reduce production costs.
As an optional implementation, a Tracker system may be used to collect the user behavior information. Once a website or APP reaches a certain number of users, a Tracker system is generally needed to collect information such as user behavior (for example, user IP address, page source, city name, browser version, and button position), page access performance, and exceptions and errors, and to report it to a log server according to a certain policy. Development teams such as search, recommendation, and the advertising center analyze these logs to adjust and develop features; product managers and senior management use the logs to optimize operations promptly and make sound decisions. The Tracker system plays an important role in a mature application, and as the business grows, the real-time requirements placed on it keep rising.
The Tracker system supports automatically instrumented fields, automatically extended fields, and so on. The application programming interface (API) of the Tracker system is embedded in the events of each page of the website or application, and a policy is set for sending the data to the log server, after which it is synchronized to a message queue that serves as a data buffer. The Storm framework pulls messages from the message queue, completes the relevant filtering and computation, and finally stores the results in a database (for example, HBase or MySQL). HBase is a distributed, column-oriented open-source database; MySQL is a relational database.
User visits generate data continuously. The data is either stored locally and sent to the relevant applications when needed, or stored in a unified central storage area. The generated data is captured and filtered by the spout in Storm, which also performs related processing (for example, parsing of protocols between applications, format analysis, and data validation), and is then sent to bolts for data analysis. The result is usable data that is stored in a persistent medium (such as a DB) for other applications to retrieve.
102: Perform content recognition processing on the user behavior data to form a rule base, where the rule base is used to classify and store the user behavior data.
In the embodiments of the present application, after the user behavior information has been cleaned, content recognition processing is performed on the cleaned user behavior information so that it can be classified according to its specific content and preset rules. The user behavior information is then stored in a database according to the classification given by the preset rules, forming the rule base.
Specifically, classifying the user behavior information according to its specific content and preset rules may include: performing content recognition on the cleaned user behavior information, classifying it as browsing behavior, click behavior, input behavior, or search behavior, and then sending it to the corresponding bolt in the next layer for processing (for example, structuring). After the corresponding next-layer bolt has processed the user behavior information, the processed information is persisted to a database to form the rule base.
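The classify-then-route step described above can be sketched as follows. Plain Python functions stand in for the next-layer bolts, and a list stands in for the rule base; the names `classify`, `route`, and `handlers`, and the record fields, are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict

CATEGORIES = ("browse", "click", "input", "search")  # the four behaviors in the text


def classify(records):
    """Group cleaned records by behavior category."""
    groups = defaultdict(list)
    for rec in records:
        category = rec["action"] if rec["action"] in CATEGORIES else "other"
        groups[category].append(rec)
    return dict(groups)


def route(records, handlers):
    """Send each group to the handler registered for its category (bolt stand-in)."""
    rule_base = []
    for category, group in classify(records).items():
        handler = handlers.get(category)
        if handler is None:
            continue  # no next-layer handler registered for this category
        rule_base.extend(handler(rec) for rec in group)
    return rule_base


handlers = {
    "browse": lambda r: {"type": "browse", "user": r["user"], "url": r["url"]},
    "search": lambda r: {"type": "search", "user": r["user"], "keyword": r["keyword"]},
}
records = [
    {"user": "A", "action": "browse", "url": "https://example.com/movie"},
    {"user": "B", "action": "search", "keyword": "swimming goggles"},
    {"user": "C", "action": "click", "url": "https://example.com/buy"},
]
rule_base = route(records, handlers)
```

Keeping one handler per category mirrors the idea of a dedicated bolt per behavior type, so each handler only needs to know the fields of its own kind of record.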
For example, suppose the cleaned user behavior information contains a browsing behavior in which user A browsed a movie-related web page. After content recognition, this browsing record is sent to the bolt that handles browsing behavior, which structures it according to its content. Specifically, the browsing behavior is structured in the format: user, URL, topic, category, author, director, starring, release year, and is then stored in the rule base.
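The structured format in this example can be sketched as a record type with the eight listed fields. The class name `BrowseRecord`, the helper `structure_browse`, the `page_metadata` dictionary, and the sample values are assumptions; how the page's topic, director, and so on are actually recognized is not specified in the text.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class BrowseRecord:
    """Structured browsing behavior: the eight fields listed in the example."""
    user: str
    url: str
    topic: Optional[str] = None
    category: Optional[str] = None
    author: Optional[str] = None
    director: Optional[str] = None
    starring: Optional[str] = None
    release_year: Optional[int] = None


def structure_browse(user, url, page_metadata):
    """Build a structured record; fields absent from the metadata stay None."""
    fields = ("topic", "category", "author", "director", "starring", "release_year")
    return BrowseRecord(user=user, url=url, **{f: page_metadata.get(f) for f in fields})


record = structure_browse(
    "A",
    "https://example.com/movies/42",
    {"topic": "Example Movie", "category": "film", "director": "Jane Doe", "release_year": 2018},
)
row = asdict(record)  # a flat dict, ready to be written to a database row
```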
103: Extract recommendation data from the rule base according to the conditions of the front-end recommendation system, and push the recommendation data to the front-end recommendation system.
In the embodiments of the present application, after a large amount of user behavior information has been stored in the rule base, a trajectory enhancement algorithm is used to extract from the rule base the data that meets the conditions of each front-end recommendation system, and the extracted data is pushed to the corresponding front-end recommendation system.
Because different front-end recommendation systems recommend different content to users, the recommendation data they need from the rule base also differs. The recommendation data must therefore be extracted from the rule base according to the conditions of each front-end recommendation system. For example, if the condition of a front-end recommendation system is the top 10 URLs by total traffic, the recommendation data is the information on the current top 10 URLs computed from the rule base by the algorithm.
Specifically, the trajectory enhancement algorithm extracts a large amount of data from the rule base, including the visited Uniform Resource Locators (URLs), the access traffic, and the keywords of the visits. The data is then processed according to the conditions of the front-end recommendation system: for example, identical URLs are merged, the upstream and downstream traffic of the visits is combined, the URLs are sorted by total traffic, and the URLs accounting for the top 80% of total traffic are taken, because the URLs that carry 80% of the traffic are the ones users visit frequently. The data processed in this way is saved to a database, from which the front-end system recommends it to users.
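The URL aggregation described above can be sketched as follows. The text does not define the trajectory enhancement algorithm precisely, so this is one possible reading: "the top 80% of total traffic" is interpreted as taking the highest-traffic URLs until their cumulative traffic reaches 80% of the total. The function name, the `share` parameter, and the input tuple layout are assumptions.

```python
from collections import defaultdict


def top_urls_by_traffic(visits, share=0.8):
    """Merge visits by URL, sum upstream and downstream traffic, sort descending,
    and keep URLs until their cumulative traffic reaches `share` of the total."""
    totals = defaultdict(int)
    for url, upstream, downstream in visits:
        totals[url] += upstream + downstream  # merge identical URLs
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    grand_total = sum(totals.values())
    selected, running = [], 0
    for url, traffic in ranked:
        if grand_total == 0 or running >= share * grand_total:
            break
        selected.append((url, traffic))
        running += traffic
    return selected


visits = [
    ("https://example.com/a", 10, 40),
    ("https://example.com/b", 5, 25),
    ("https://example.com/a", 0, 10),
    ("https://example.com/c", 2, 8),
]
recommended = top_urls_by_traffic(visits)
```

In this sample the three URLs total 60, 30, and 10 units of traffic, so the first two are kept and the low-traffic third one is dropped. A fixed-count condition such as "top 10 URLs" would instead slice the ranked list.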
Optionally, in the embodiments of the present application, after the qualifying data has been extracted from the rule base and pushed to the corresponding front-end system, the extracted data is stored in a historical behavior track database and at the same time deleted from the rule base, in order to save storage space in the rule base.
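The archive-and-delete step can be sketched with an in-memory SQLite database. The table names `rule_base` and `history`, their columns, and the choice to select extracted rows by user are hypothetical; the text names neither a schema nor a database for this step.

```python
import sqlite3


def archive_extracted(conn, user):
    """Copy one user's rows to the history table and delete them from the rule base.
    Both statements run in one transaction, so a failure leaves the data in place."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO history (user, url) SELECT user, url FROM rule_base WHERE user = ?",
            (user,),
        )
        conn.execute("DELETE FROM rule_base WHERE user = ?", (user,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rule_base (user TEXT, url TEXT)")
conn.execute("CREATE TABLE history (user TEXT, url TEXT)")
conn.executemany(
    "INSERT INTO rule_base VALUES (?, ?)",
    [("A", "https://example.com/a"), ("B", "https://example.com/b")],
)
archive_extracted(conn, "A")
remaining = conn.execute("SELECT user FROM rule_base").fetchall()
archived = conn.execute("SELECT user FROM history").fetchall()
```

Doing the copy and the delete in a single transaction avoids losing extracted data if the process stops between the two statements.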
As can be seen, in the embodiments of the present application, user behavior information is obtained in real time and cleaned to yield user behavior data free of dirty data. Content recognition is then performed on the user behavior data, which is sorted and classified to obtain a rule base. Finally, the recommendation data required by a front-end recommendation system is extracted from the rule base according to that system's conditions and pushed to the corresponding recommendation system, so that the front-end recommendation system receives the recommendation data it needs, free of dirty data, and can make accurate recommendations in real time.
Refer to FIG. 2, which is a schematic flowchart of another user behavior data recommendation method provided by an embodiment of the present application. As shown in the figure, the method may include:
201: Obtain the user's behavior information from a web log file or from a user terminal.
202: Send the user behavior information to a message queue.
In the embodiments of the present application, after the user's behavior information has been collected, it is sent to a message queue that serves as a buffer, for example Kafka or MetaQ, so that the spout of the Storm framework can later obtain the user behavior information from the message queue. Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all the action stream data of a consumer-scale website; MetaQ is a message middleware based entirely on a queue model, whose server is written in Java and can be deployed on a variety of software and hardware platforms.
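The buffering role of the message queue can be sketched with Python's standard `queue.Queue` standing in for Kafka or MetaQ. This is only an in-process stand-in to show the producer and reader decoupling; it does not use or imitate either system's API, and the `drain` helper and message fields are hypothetical.

```python
import queue


def drain(buffer, max_items):
    """Pull up to max_items messages without blocking, as a spout-like reader might."""
    batch = []
    for _ in range(max_items):
        try:
            batch.append(buffer.get_nowait())
        except queue.Empty:
            break  # nothing left in the buffer for now
    return batch


buffer = queue.Queue()  # in-process stand-in for the message queue
for message in (
    {"user": "A", "action": "search"},
    {"user": "B", "action": "click"},
    {"user": "C", "action": "browse"},
):
    buffer.put(message)  # the collector side publishes behavior information

first = drain(buffer, 2)
rest = drain(buffer, 10)
```

The collector can keep publishing while the reader pulls at its own pace, which is the point of placing a buffer between collection and real-time processing.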
203:通过storm实时处理框架的数据获取组件spout从上述消息队列中读取上述用户行为信息,然后将上述用户行为信息分发给处理组件bolt。203: The data acquisition component spout of the storm real-time processing framework reads the user behavior information from the message queue, and then distributes the user behavior information to the processing component bolt.
在本申请实施例中,当上述获取组件spout从上述消息队列中读取上述用户行为信息后,根据获取到的用户行为信息的类型将上述用户行为信息分发到不同的bolt进行处理。In the embodiment of the present application, after the obtaining component spout reads the user behavior information from the message queue, the user behavior information is distributed to different bolts for processing according to the type of the obtained user behavior information.
204:所述实时处理框架的上述处理组件bolt对上述用户行为信息进行错误数据清洗、缺失值数据清洗、重复值数据清洗或不一致性数据清洗得到用户行为数据。204: The processing component bolt of the real-time processing framework performs error data cleaning, missing value data cleaning, repeated value data cleaning, or inconsistent data cleaning on the user behavior information to obtain user behavior data.
在本申请实施例中,由于上述spout拉取的用户行为数据中可能会存在很多脏数据,例如上述用户行为数据中可能会存在一些重复数据、错误数据、残缺数据等。因此,当获取到上述用户行为信息数据之后首先要对上述用户行为信息数据进行清洗加工处理。具体的,上述对用户行为信息数据的清洗包括错误数据清洗、缺失值清洗、重复值清洗以及不一致性数据清洗。当对上述用户行为信息数据清洗完之后,将清洗之后的用户行为信息数据传递给下一个bolt继续处理。In the embodiment of the present application, there may be a lot of dirty data in the user behavior data pulled by the spout, for example, there may be some duplicate data, erroneous data, and incomplete data in the user behavior data. Therefore, after the user behavior information data is obtained, the user behavior information data is first subjected to a cleaning process. Specifically, the foregoing cleaning of user behavior information data includes cleaning of erroneous data, cleaning of missing values, cleaning of duplicate values, and cleaning of inconsistent data. After the above user behavior information data is cleaned, the cleaned user behavior information data is passed to the next bolt to continue processing.
205: After content recognition processing is performed on the user behavior data, the user behavior data is classified by browsing behavior, click behavior, input behavior, or search behavior.
In this embodiment of the present application, after the user behavior information has been cleaned, content recognition processing is performed on it so that the user behavior information can be classified according to its specific content and a preset rule. The user behavior information is then stored in a database according to the classification defined by the preset rule, forming a rule base.
Specifically, classifying the user behavior information according to its specific content and the preset rule may include: performing content recognition on the cleaned user behavior information, classifying it as browsing behavior, click behavior, input behavior, or search behavior, and then sending it to the corresponding bolt in the next layer for processing (for example, structuring).
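The classification step can be sketched as a small router that inspects each event and places it in one of the four behavior categories. The recognition heuristics below (the presence of a `query`, `text`, or `element` field) are hypothetical stand-ins for the content recognition that the application leaves unspecified.

```python
def recognize(event):
    """Infer the behavior type from the event's content (assumed heuristics)."""
    if event.get("query"):
        return "search"
    if event.get("text"):
        return "input"
    if event.get("element"):
        return "click"
    return "browse"


def route(events):
    """Group events by behavior type, one bucket per downstream bolt."""
    routed = {"browse": [], "click": [], "input": [], "search": []}
    for event in events:
        routed[recognize(event)].append(event)
    return routed
```

Each bucket corresponds to the input of one downstream processing bolt.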
206: The classified user behavior data is structured and then stored to form a rule base.
In this embodiment of the present application, after the corresponding bolt has processed the user behavior information, the processed user behavior information is persisted to a database to form the rule base.
For example, suppose the cleaned user behavior information contains a browsing behavior in which user A browsed a movie-related web page. After content recognition, this browsing record is sent to the bolt that handles browsing behavior, and the bolt structures it according to its content. Specifically, the browsing behavior is structured in the format user, URL, theme, category, author, director, starring actors, release year, and is then stored in the rule base.
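The structured format listed above can be written down as a record type. The sketch below uses a Python dataclass; the field names are direct renderings of the listed attributes, while `page_meta` (metadata recognized from the page content) and the function name `structure_browse` are assumptions for illustration.

```python
from dataclasses import dataclass, asdict

META_FIELDS = ("theme", "category", "author", "director", "starring", "release_year")


@dataclass(frozen=True)
class BrowsingRecord:
    user: str
    url: str
    theme: str = ""
    category: str = ""
    author: str = ""
    director: str = ""
    starring: str = ""
    release_year: str = ""


def structure_browse(event, page_meta):
    """Build a structured record; unknown metadata keys are ignored."""
    meta = {name: page_meta.get(name, "") for name in META_FIELDS}
    return BrowsingRecord(user=event["user"], url=event["url"], **meta)
```

Calling `asdict(record)` yields a flat row that could be written to whichever database backs the rule base.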
207: A trajectory enhancement algorithm is used to extract qualifying recommendation data from the rule base according to the conditions of the different front-end recommendation systems, and the recommendation data is pushed to the corresponding front-end recommendation system.
In this embodiment of the present application, once a large amount of user behavior information has been stored in the rule base, the trajectory enhancement algorithm extracts the data that satisfies the conditions of each front-end recommendation system from the rule base, and the extracted data is then pushed to the corresponding front-end recommendation system.
Specifically, the trajectory enhancement algorithm extracts a large volume of data from the rule base, including the URLs visited, the traffic of each visit, and the keywords used. The algorithm then processes this data according to certain conditions. For example, it merges records that share the same URL, combines the upstream and downstream traffic of each visit, sorts the URLs by their aggregated traffic, and takes the URLs that make up the top 80% of total traffic, because those are the URLs that users visit most often. The processed data is saved to a database, from which the front-end system recommends it to users.
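The example condition above (merge by URL, sum upstream and downstream traffic, sort, keep the top 80%) can be sketched as follows. This is one possible reading: "top 80% of total traffic" is interpreted as the smallest prefix of the ranked URLs whose cumulative traffic reaches 80% of the grand total. The field names `up_bytes` and `down_bytes` and the function name are assumptions.

```python
from collections import defaultdict


def top_urls_by_traffic(records, share=0.8):
    """Return (url, traffic) pairs covering the top `share` of total traffic."""
    totals = defaultdict(int)
    for record in records:
        # Merge identical URLs and combine upstream and downstream traffic.
        totals[record["url"]] += record["up_bytes"] + record["down_bytes"]
    # Sort URLs by aggregated traffic, largest first.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    threshold = share * sum(totals.values())
    selected, running = [], 0
    for url, traffic in ranked:
        if running >= threshold:
            break
        selected.append((url, traffic))
        running += traffic
    return selected
```

The claims generalize the 80% figure to a preset percentage, which is why `share` is a parameter here rather than a constant.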
208: The recommendation data is stored in a historical behavior trajectory database.
In this embodiment of the present application, user behavior data that has already been recommended may later be needed for analysis of the user's historical behavior. Therefore, after the user behavior data has been pushed to the corresponding front-end recommendation system, the recommendation data is stored in the historical behavior trajectory database so that it can later be retrieved from that database for analysis.
209: The recommendation data is deleted from the rule base.
In this embodiment of the present application, to save storage space in the rule base, user behavior data that has already been extracted from the rule base is deleted from it.
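Steps 208 and 209 together amount to moving pushed records from the rule base into the history store. A minimal sketch, assuming the rule base is modeled as a dict keyed by a hypothetical record id and the history database as a list:

```python
def archive_and_purge(rule_base, history, pushed_ids):
    """Copy pushed records to history, then delete them from the rule base."""
    for record_id in pushed_ids:
        if record_id in rule_base:
            history.append(rule_base[record_id])  # step 208: archive
            del rule_base[record_id]              # step 209: free space
```

Archiving before deleting mirrors the order of the two steps, so a record is never removed from the rule base without first being kept in the history store.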
As can be seen, this embodiment of the present application collects user behavior information from a web log file or a user terminal and sends it to a message queue. The user behavior information is then read from the message queue and cleaned to generate user behavior data, so that duplicate and erroneous data are deleted and incomplete data is completed. Next, content recognition, classification, and sorting are performed on the user behavior data to form a rule base. Finally, qualifying recommendation data is extracted from the rule base and pushed to the front-end recommendation system. This embodiment thus enables accurate, real-time recommendation of user behavior data and keeps the recommended user behavior data free of dirty data.
An embodiment of the present application further provides a server that includes units for performing any of the methods described above. Specifically, refer to FIG. 3, which is a schematic block diagram of a server provided by an embodiment of the present application. The server of this embodiment includes a data cleaning unit 310, a rule base generation unit 320, an extraction unit 330, and a recommendation unit 340.
The data cleaning unit 310 is configured to read user behavior information and perform data cleaning on the user behavior information to generate user behavior data;
the rule base generation unit 320 is configured to perform content recognition processing on the user behavior data to form a rule base, the rule base being used to classify and store the user behavior data;
the extraction unit 330 is configured to extract qualifying recommendation data from the rule base according to the conditions of the front-end recommendation system;
the recommendation unit 340 is configured to push the recommendation data to the front-end recommendation system.
As can be seen, this embodiment of the present application reads user behavior information and cleans it to generate user behavior data, so that duplicate and erroneous data are deleted and incomplete data is completed. Content recognition processing is then performed on the user behavior data to form a rule base, and qualifying recommendation data is extracted from the rule base and pushed to the front-end recommendation system. This embodiment thus enables accurate, real-time recommendation of user behavior data and keeps the recommended user behavior data free of dirty data.
Optionally, the server further includes:
an obtaining unit 350, configured to obtain user behavior information from a web log file, or to obtain the user behavior information directly from a user terminal;
a sending unit 360, configured to send the user behavior information to a message queue.
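The obtaining and sending units can be sketched with Python's standard `queue` module standing in for the message queue. The three-field log line format (`user action url`) is an assumption for illustration; real web logs and a production message broker would differ.

```python
import queue


def collect_from_log(lines, message_queue):
    """Parse log lines and enqueue one event per well-formed line."""
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed format
        user, action, url = parts
        message_queue.put({"user": user, "action": action, "url": url})


def read_all(message_queue):
    """Drain the queue, as a reader such as a spout would."""
    events = []
    while True:
        try:
            events.append(message_queue.get_nowait())
        except queue.Empty:
            return events
```

Decoupling collection from processing through a queue lets the reader consume events at its own pace.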
Optionally, the data cleaning includes erroneous-data cleaning, missing-value cleaning, duplicate-value cleaning, or inconsistent-data cleaning.
Optionally, the data cleaning unit 310 includes:
a reading unit 311, configured to read the user behavior information from the message queue through the data acquisition component (spout) of the Storm real-time processing framework, and then distribute the user behavior information to the processing component (bolt);
a distribution unit 312, configured to distribute the user behavior information to the processing component of the real-time processing framework;
a cleaning unit 313, configured to have the processing component (bolt) of the real-time processing framework perform erroneous-data cleaning, missing-value cleaning, duplicate-value cleaning, or inconsistent-data cleaning on the user behavior information to obtain user behavior data.
Optionally, the reading unit is configured to pull the user behavior information from the message queue through the spout component of the Storm framework;
the distribution unit is configured to distribute the user behavior information to the bolt component of the Storm framework.
Optionally, the rule base generation unit 320 includes:
a classification unit 321, configured to classify the user behavior data by browsing behavior, click behavior, input behavior, or search behavior after content recognition processing has been performed on the user behavior data;
a processing unit 322, configured to structure the classified user behavior data;
a first storage unit 323, configured to store the structured user behavior data to form the rule base.
Optionally, the extraction unit 330 is configured to use a trajectory enhancement algorithm to extract qualifying recommendation data from the rule base according to the conditions of the different front-end recommendation systems;
the recommendation unit 340 is configured to push the recommendation data to the corresponding front-end recommendation system.
Optionally, the conditions of the front-end recommendation system may include merging records that share the same URL, combining and sorting the upstream and downstream traffic of the visits, ranking the URLs by aggregated traffic, and taking the URLs that fall within a preset top percentage of total traffic.
Optionally, the server further includes:
a second storage unit 370, configured to store the recommendation data in a historical behavior trajectory database;
a deletion unit 380, configured to delete the recommendation data from the rule base.
As can be seen, this embodiment of the present application collects user behavior information from a web log file or a user terminal and sends it to a message queue. The user behavior information is then read from the message queue and cleaned to generate user behavior data, so that duplicate and erroneous data are deleted and incomplete data is completed. Next, content recognition, classification, and sorting are performed on the user behavior data to form a rule base. Finally, qualifying recommendation data is extracted from the rule base and pushed to the front-end recommendation system. This embodiment thus enables accurate, real-time recommendation of user behavior data and keeps the recommended user behavior data free of dirty data.
Refer to FIG. 4, which shows a device provided by an embodiment of the present application. The device may be a server. As shown in FIG. 4, the device includes one or more processors 401, one or more input devices 402, one or more output devices 403, and a memory 404. The processor 401, the input device 402, the output device 403, and the memory 404 are connected through a bus 405. The memory 404 is configured to store instructions, and the processor 401 is configured to execute the instructions stored in the memory 404.
When the device is used as a server, the processor 401 is configured to: read user behavior information and perform data cleaning on it to generate user behavior data; perform content recognition processing on the user behavior data to form a rule base, the rule base being used to classify and store the user behavior data; and extract recommendation data from the rule base according to the conditions of the front-end recommendation system and push the recommendation data to the front-end recommendation system.
It should be understood that, in this embodiment of the present application, the processor 401 may be a central processing unit (CPU), or it may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
The input device 402 may include a touchpad, a fingerprint sensor (for collecting the user's fingerprint information and fingerprint orientation information), a microphone, and the like. The output device 403 may include a display (for example, a liquid crystal display (LCD)), a speaker, and the like.
The memory 404 may include read-only memory and random access memory, and provides instructions and data to the processor 401. A portion of the memory 404 may also include non-volatile random access memory. For example, the memory 404 may also store device type information.
In a specific implementation, the processor 401, the input device 402, and the output device 403 described in this embodiment of the present application may carry out the implementations described in the first, second, and third embodiments of the user behavior data recommendation method provided by the embodiments of the present application, and may also carry out the implementation of the server described in the embodiments of the present application. Details are not repeated here.
Another embodiment of the present application provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements the following: reading user behavior information and performing data cleaning on it to generate user behavior data; performing content recognition processing on the user behavior data to form a rule base, the rule base being used to classify and store the user behavior data; and extracting recommendation data from the rule base according to the conditions of the front-end recommendation system and pushing the recommendation data to the front-end recommendation system.
The computer-readable storage medium may be an internal storage unit of the terminal described in any of the foregoing embodiments, such as a hard disk or memory of the terminal. The computer-readable storage medium may also be an external storage device of the terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the terminal. Further, the computer-readable storage medium may include both an internal storage unit of the terminal and an external storage device. The computer-readable storage medium is used to store the computer program and other programs and data required by the terminal. It may also be used to temporarily store data that has been output or is about to be output.
FIG. 5 is a schematic structural diagram of a server provided by an embodiment of the present application. The server 500 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 522 (for example, one or more processors), a memory 532, and one or more storage media 530 (for example, one or more mass storage devices) storing application programs 542 or data 544. The memory 532 and the storage medium 530 may provide transient or persistent storage. The program stored in the storage medium 530 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations for the server. Further, the central processing unit 522 may be configured to communicate with the storage medium 530 and to execute, on the server 500, the series of instruction operations in the storage medium 530.
The server 500 may also include one or more power supplies 526, one or more wired or wireless network interfaces 550, one or more input/output interfaces 558, and/or one or more operating systems 541, such as Windows Server™, Mac OS X™, Unix™, Linux™, or FreeBSD™.
The steps performed by the server in the foregoing embodiments may be based on the server structure shown in FIG. 5.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present application.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, servers, terminal devices, and units described above may be found in the corresponding processes of the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, servers, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, and some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods of the embodiments of the present application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing is merely a specific implementation of the present application, but the scope of protection of the present application is not limited to it. Any person familiar with the technical field can readily conceive of equivalent modifications or replacements within the technical scope disclosed in the present application, and such modifications or replacements shall fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.

Claims (20)

  1. A method for recommending user behavior data, characterized by comprising:
    reading user behavior information, and performing data cleaning on the user behavior information to generate user behavior data;
    performing content recognition processing on the user behavior data to form a rule base, the rule base being used to classify and store the user behavior data;
    extracting recommendation data from the rule base according to conditions of a front-end recommendation system, and pushing the recommendation data to the front-end recommendation system.
  2. The method according to claim 1, characterized in that, before the reading of user behavior information, the method further comprises:
    obtaining the user behavior information from a log file, or obtaining the user behavior information directly from a user terminal;
    sending the user behavior information to a message queue.
  3. The method according to claim 2, characterized in that the data cleaning comprises: erroneous-data cleaning, missing-value cleaning, duplicate-value cleaning, or inconsistent-data cleaning.
  4. The method according to claim 3, characterized in that the reading of user behavior information comprises:
    reading the user behavior information from the message queue through a data acquisition component of a real-time processing framework;
    and the performing of data cleaning on the user behavior information comprises:
    distributing the user behavior information to a processing component of the real-time processing framework;
    performing, by the processing component of the real-time processing framework, data cleaning on the user behavior information to obtain the user behavior data.
  5. The method according to claim 3, characterized in that the real-time processing framework may be a Storm framework;
    the reading of the user behavior information from the message queue through the data acquisition component of the real-time processing framework comprises:
    pulling the user behavior information from the message queue through a spout component of the Storm framework;
    the distributing of the user behavior information to the processing component of the real-time processing framework comprises:
    distributing the user behavior information to a bolt component of the Storm framework.
  6. The method according to any one of claims 1 to 5, characterized in that the performing of content recognition processing on the user behavior data to form a rule base comprises:
    after content recognition processing has been performed on the user behavior data, classifying the user behavior data by browsing behavior, click behavior, input behavior, or search behavior;
    structuring the classified user behavior data;
    storing the structured user behavior data to form the rule base.
  7. The method according to any one of claims 1 to 5, characterized in that the extracting of qualifying recommendation data from the rule base and the pushing of the recommendation data to the front-end recommendation system comprise:
    using a trajectory enhancement algorithm to extract recommendation data from the rule base according to the conditions of different front-end recommendation systems;
    pushing the recommendation data to the corresponding front-end recommendation system.
  8. The method according to claim 7, characterized in that the conditions of the front-end recommendation system may include merging records that share the same URL, combining and sorting the upstream and downstream traffic of the visits, ranking the URLs by aggregated traffic, and taking the URLs that fall within a preset top percentage of total traffic.
  9. The method according to claim 8, characterized in that, after the pushing of the recommendation data to the corresponding front-end recommendation system, the method further comprises:
    storing the recommendation data in a historical behavior trajectory database;
    deleting the recommendation data from the rule base.
  10. 一种服务器,其特征在于,包括:A server is characterized in that it includes:
    数据清洗单元,用于读取用户行为信息,对上述用户行为信息进行数据清洗生成用户行为数据;A data cleaning unit, configured to read user behavior information, and perform data cleaning on the user behavior information to generate user behavior data;
    规则库生成单元,用于对上述用户行为数据进行内容识别处理形成规则库,上述规则库用于将上述用户行为数据进行分类存储;A rule base generating unit, configured to perform content recognition processing on the user behavior data to form a rule base, and the rule base is used to classify and store the user behavior data;
    提取单元,用于根据前端推荐系统的条件从上述规则库中提取符合条件推荐数据;An extraction unit, configured to extract qualified recommendation data from the above rule base according to the conditions of the front-end recommendation system;
    推荐单元,用于将上述推荐数据推送到前端推荐系统。A recommendation unit is configured to push the above recommendation data to a front-end recommendation system.
  11. The server according to claim 10, wherein the server further comprises:
    an obtaining unit, configured to obtain the user behavior information from a log file, or to obtain the user behavior information directly from a user terminal;
    a sending unit, configured to send the user behavior information to a message queue.
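As a rough illustration of the obtaining and sending units in claim 11, the sketch below reads lines from a log file and places them on a queue. Python's standard `queue.Queue` stands in for the message queue and `io.StringIO` for the log file; both substitutions, and the name `send_log_lines`, are assumptions made to keep the example self-contained.

```python
import io
import queue
from typing import TextIO


def send_log_lines(log_file: TextIO, message_queue: "queue.Queue[str]") -> int:
    """Read user behavior information line by line and send each non-empty line to the queue."""
    sent = 0
    for line in log_file:
        text = line.rstrip("\n")
        if text:
            message_queue.put(text)
            sent += 1
    return sent


mq: "queue.Queue[str]" = queue.Queue()
log = io.StringIO("u1 click https://a.com\n\nu2 search https://b.com\n")
sent_count = send_log_lines(log, mq)
```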
  12. The server according to claim 11, wherein the data cleaning comprises: erroneous data cleaning, missing-value data cleaning, duplicate-value data cleaning, or inconsistent data cleaning.
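The four kinds of cleaning named in claim 12 can be sketched in one pass over the records. The field names (`user_id`, `url`, `action`, `ts`), the set of valid actions, and the specific validity checks are hypothetical; the claim lists the cleaning types as alternatives and does not define the rules, so this sketch simply applies all four.

```python
from typing import Dict, Iterable, List

REQUIRED_FIELDS = ("user_id", "url", "action")           # hypothetical schema
VALID_ACTIONS = {"browse", "click", "input", "search"}   # hypothetical vocabulary


def clean(records: Iterable[Dict]) -> List[Dict]:
    """Apply missing-value, inconsistency, error, and duplicate cleaning in that order."""
    seen = set()
    cleaned = []
    for record in records:
        # Missing-value cleaning: drop records lacking a required field.
        if any(record.get(field) in (None, "") for field in REQUIRED_FIELDS):
            continue
        # Inconsistent data cleaning: normalize case and surrounding whitespace.
        action = str(record["action"]).strip().lower()
        url = str(record["url"]).strip()
        # Erroneous data cleaning: drop unknown actions and non-HTTP URLs.
        if action not in VALID_ACTIONS or not url.startswith(("http://", "https://")):
            continue
        # Duplicate-value cleaning: keep only the first occurrence of each record.
        key = (record["user_id"], url, action, record.get("ts"))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"user_id": record["user_id"], "url": url, "action": action, "ts": record.get("ts")})
    return cleaned


raw = [
    {"user_id": "u1", "url": "https://a.com", "action": "Click", "ts": 1},
    {"user_id": "u1", "url": "https://a.com", "action": "click", "ts": 1},  # duplicate once normalized
    {"user_id": "u2", "url": "", "action": "browse", "ts": 2},             # missing value
    {"user_id": "u3", "url": "ftp://x", "action": "browse", "ts": 3},      # erroneous URL
    {"user_id": "u4", "url": "https://b.com", "action": "search", "ts": 4},
]
result = clean(raw)
```

Normalizing before the duplicate check lets records that differ only in letter case or whitespace be recognized as duplicates.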
  13. The server according to claim 12, wherein the data cleaning unit comprises:
    a reading unit, configured to read the user behavior information from the message queue through a data acquisition component of a real-time processing framework, and then distribute the user behavior information to a processing component;
    a distribution unit, configured to distribute the user behavior information to the processing component of the real-time processing framework;
    a cleaning unit, configured to have the processing component of the real-time processing framework perform data cleaning on the user behavior information to obtain the user behavior data.
  14. The server according to claim 13, wherein the real-time processing framework may be a Storm framework;
    the reading unit is configured to pull the user behavior information from the message queue through a Spout component of the Storm framework;
    the distribution unit is configured to distribute the user behavior information to a Bolt component of the Storm framework.
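The spout-to-bolt data flow of claims 13 and 14 can be mimicked with plain Python classes. This is not the Storm API: `QueueSpout`, `CleaningBolt`, and `run_topology` are hypothetical stand-ins that only show a source component pulling from a queue and handing each message to a processing component.

```python
import queue
from typing import List, Optional


class QueueSpout:
    """Stand-in for a spout: pulls one message at a time from the message queue."""

    def __init__(self, message_queue: "queue.Queue[str]") -> None:
        self._queue = message_queue

    def next_tuple(self) -> Optional[str]:
        try:
            return self._queue.get_nowait()
        except queue.Empty:
            return None  # nothing left to emit


class CleaningBolt:
    """Stand-in for a bolt: cleans each message it receives and collects the result."""

    def __init__(self) -> None:
        self.output: List[str] = []

    def process(self, message: str) -> None:
        text = message.strip()
        if text:  # drop messages that are empty after trimming
            self.output.append(text)


def run_topology(spout: QueueSpout, bolt: CleaningBolt) -> int:
    """Distribute every message from the spout to the bolt until the queue is drained."""
    processed = 0
    while (message := spout.next_tuple()) is not None:
        bolt.process(message)
        processed += 1
    return processed


mq: "queue.Queue[str]" = queue.Queue()
for item in ("  u1 click  ", "", "u2 search"):
    mq.put(item)
bolt = CleaningBolt()
processed = run_topology(QueueSpout(mq), bolt)
```

In a real Storm deployment the framework, not a loop like `run_topology`, schedules the spout and routes tuples to bolts, typically across several workers.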
  15. The server according to any one of claims 10 to 14, wherein the rule base generating unit comprises:
    a classification unit, configured to classify the user behavior data as browsing behavior, click behavior, input behavior, or search behavior after content recognition processing has been performed on the user behavior data;
    a processing unit, configured to structure the classified user behavior data;
    a first storage unit, configured to store the structured user behavior data to form the rule base.
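The classify, structure, and store steps of claim 15 might look like the sketch below. The four category keys follow the behaviors named in the claim; the `BehaviorRecord` fields, the input dictionary shape, and the use of a dictionary of lists as the rule base are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

CATEGORIES = ("browse", "click", "input", "search")


@dataclass(frozen=True)
class BehaviorRecord:
    """Hypothetical structured form of one piece of user behavior data."""
    user_id: str
    category: str
    target: str


def build_rule_base(cleaned: Iterable[dict]) -> Dict[str, List[BehaviorRecord]]:
    """Classify each item by behavior type, structure it, and store it under its category."""
    rule_base: Dict[str, List[BehaviorRecord]] = {category: [] for category in CATEGORIES}
    for item in cleaned:
        category = item.get("action")
        if category not in rule_base:
            continue  # unrecognized behavior types are left out of the rule base
        rule_base[category].append(BehaviorRecord(item["user_id"], category, item["url"]))
    return rule_base


cleaned = [
    {"user_id": "u1", "action": "click", "url": "https://a.com"},
    {"user_id": "u2", "action": "search", "url": "https://b.com"},
    {"user_id": "u3", "action": "scroll", "url": "https://c.com"},
]
rules = build_rule_base(cleaned)
```

Keying the store by category lets an extraction step read only the behavior types a given front-end recommendation system asks for.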
  16. The server according to any one of claims 10 to 14, wherein the extraction unit is configured to use a trajectory enhancement algorithm to extract the recommendation data from the rule base according to the conditions of different front-end recommendation systems.
  17. The server according to claim 16, wherein the conditions of the front-end recommendation system may include: merging identical URLs; combining the upstream traffic and downstream traffic of the accesses and sorting them; and then ranking the URLs by aggregated traffic and taking the URLs in the top preset percentage by total traffic.
  18. The server according to claim 17, wherein the server further comprises:
    a second storage unit 370, configured to store the recommendation data in a historical behavior trajectory database;
    a deleting unit 380, configured to delete the recommendation data from the rule base.
  19. A server, comprising a processor, a memory, and a communication module, wherein the memory is used to store program code, and the processor is used to call the program code to execute the method according to any one of claims 1 to 9.
  20. A computer-readable storage medium, wherein the computer storage medium stores a computer program, the computer program comprises program instructions, and the program instructions, when executed by a processor, cause the processor to execute the method according to any one of claims 1 to 9.
PCT/CN2018/123508 2018-08-22 2018-12-25 User behavior data recommendation method, server and computer readable medium WO2020037917A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810965582.5A CN109242553A (en) 2018-08-22 2018-08-22 A kind of user behavior data recommended method, server and computer-readable medium
CN201810965582.5 2018-08-22

Publications (1)

Publication Number Publication Date
WO2020037917A1 (en) 2020-02-27

Family

ID=65069108

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/123508 WO2020037917A1 (en) 2018-08-22 2018-12-25 User behavior data recommendation method, server and computer readable medium

Country Status (2)

Country Link
CN (1) CN109242553A (en)
WO (1) WO2020037917A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111127077A (en) * 2019-11-29 2020-05-08 中国建设银行股份有限公司 Recommendation method and device based on stream computing
CN113032587B (en) * 2019-12-25 2023-07-28 北京达佳互联信息技术有限公司 Multimedia information recommendation method, system, device, terminal and server
CN111274278A (en) * 2020-01-19 2020-06-12 托普朗宁(北京)教育科技有限公司 Method and device for assisting learning and readable storage medium
CN111427878B (en) * 2020-03-20 2024-02-27 深圳乐信软件技术有限公司 Data monitoring alarm method, device, server and storage medium
CN111753214A (en) * 2020-06-24 2020-10-09 平安科技(深圳)有限公司 Data pushing method and system based on behavior track and computer equipment
CN112579902A (en) * 2020-12-24 2021-03-30 第四范式(北京)技术有限公司 Behavior data management method and device supporting multiple intelligent application scenes
CN112925815B (en) * 2021-02-23 2023-08-08 四川享宇金信金融科技有限公司 Push information automatic generation system with tracking function
CN113626539A (en) * 2021-08-13 2021-11-09 深圳墨世科技有限公司 User behavior data statistical method, server and client
CN113938919B (en) * 2021-09-03 2023-07-07 中国联合网络通信集团有限公司 Data analysis method and device
CN115186770B (en) * 2022-09-08 2023-04-25 北京邮电大学 Driver identity recognition method and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103810623A (en) * 2014-03-04 2014-05-21 深圳市远行科技有限公司 Real-time automatic marketing method and system
CN105468737A (en) * 2015-11-24 2016-04-06 湖北大学 Web service big data analysis method, cloud computing platform and mining system
CN106874522A (en) * 2017-03-29 2017-06-20 珠海习悦信息技术有限公司 Information recommendation method, device, storage medium and processor
CN107451269A (en) * 2017-07-28 2017-12-08 佛山市南方数据科学研究院 A kind of user behavior analysis method based on big data
CN107944059A (en) * 2017-12-29 2018-04-20 深圳市中润四方信息技术有限公司西安分公司 A kind of user behavior analysis method and system based on stream calculation

Also Published As

Publication number Publication date
CN109242553A (en) 2019-01-18

Similar Documents

Publication Publication Date Title
WO2020037917A1 (en) User behavior data recommendation method, server and computer readable medium
CN106557513B (en) Event information pushing method and event information pushing device
US10706094B2 (en) System and method for customizing a display of a user device based on multimedia content element signatures
WO2017084521A1 (en) Order clustering method and device, and malicious information rejecting method and device
US10725981B1 (en) Analyzing big data
CN103620601B (en) Joining tables in a mapreduce procedure
US8843429B2 (en) Action prediction and identification of user behavior
US10848590B2 (en) System and method for determining a contextual insight and providing recommendations based thereon
CN114840486B (en) User behavior data acquisition method and system and cloud platform
JP2013519941A (en) Method and system for e-commerce transaction data accounting
CN106991175B (en) Customer information mining method, device, equipment and storage medium
WO2019001429A1 (en) Multisource data fusion method and apparatus
TWI705411B (en) Method and device for identifying users with social business characteristics
WO2018205845A1 (en) Data processing method, server, and computer storage medium
CN111666492A (en) Information pushing method, device and equipment based on user behaviors and storage medium
US20160267498A1 (en) Systems and methods for identifying new users using trend analysis
CN111444424A (en) Information recommendation method and information recommendation system
CN111447081A (en) Data chain generation method, device, server and storage medium
CN110674404A (en) Link information generation method, device, system, storage medium and electronic equipment
US11620327B2 (en) System and method for determining a contextual insight and generating an interface with recommendations based thereon
CN110196849A (en) It realizes that user draws a portrait based on big data Treatment process and constructs the system and method for processing
WO2021114634A1 (en) Text annotation method, device, and storage medium
US20150032749A1 (en) Method of creating classification pattern, apparatus, and recording medium
Aggarwal Identification of quality parameters associated with 3V's of Big Data
CN106549914A (en) A kind of recognition methods of independent access person and device

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 18931274; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 18931274; Country of ref document: EP; Kind code of ref document: A1