CN113761102A - Data processing method, device, server, system and storage medium - Google Patents

Info

Publication number
CN113761102A
Authority
CN
China
Prior art keywords
user
data
attribute parameter
parameter
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011298991.8A
Other languages
Chinese (zh)
Inventor
杜标棋
申作军
王琼萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Shangke Information Technology Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Shangke Information Technology Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Application filed by Beijing Jingdong Shangke Information Technology Co Ltd and Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202011298991.8A
Publication of CN113761102A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/319 Inverted lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/325 Hash tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Abstract

The embodiment of the invention discloses a data processing method, a device, a server, a system and a storage medium. The data processing method comprises the following steps: receiving a query request from a terminal, wherein the query request comprises a user attribute parameter and a behavior attribute parameter, the user attribute parameter is a parameter corresponding to user basic information, and the behavior attribute parameter is a parameter corresponding to network operation information; acquiring, through a pre-established inverted index that maps attribute parameters to user identifiers, the user identifiers that match the query request, and creating a target crowd packet from the acquired user identifiers; and sending the target crowd packet to the terminal. In the embodiment of the invention, the target crowd can be selected based on both the parameters corresponding to user basic information and the parameters corresponding to network operation information, which broadens the basis for crowd selection and enables fine-grained crowd selection. In addition, because the target crowd is selected through the pre-established inverted index, selection efficiency and the response speed seen by the terminal are both improved.

Description

Data processing method, device, server, system and storage medium
Technical Field
The present invention relates to communications technologies, and in particular, to a data processing method, apparatus, server, system, and storage medium.
Background
For a goods supplier, a demand survey is often needed before a new product is released. The survey may target specific groups of people, in which case a target crowd must be selected as the audience for the survey and delivery. In the course of implementing the present invention, the inventors found that the currently common methods select the target crowd only on the basis of the users' basic information, such as gender, age, and education level. The selected target crowd is therefore not fine-grained enough, and selection efficiency is low.
Disclosure of Invention
Embodiments of the present invention provide a data processing method, apparatus, server, system, and storage medium, which can implement refined crowd selection and improve selection efficiency.
In a first aspect, an embodiment of the present invention provides a data processing method, including:
receiving a query request of a terminal, wherein the query request comprises a user attribute parameter and a behavior attribute parameter, the user attribute parameter is a parameter corresponding to user basic information, and the behavior attribute parameter is a parameter corresponding to network operation information;
acquiring, through a pre-established inverted index that maps attribute parameters to user identifiers, the user identifiers that match the query request, and creating a target crowd packet from the acquired user identifiers;
and sending the target crowd packet to the terminal.
In a second aspect, an embodiment of the present invention provides a data processing apparatus, including:
a receiving module, configured to receive a query request of a terminal, wherein the query request comprises a user attribute parameter and a behavior attribute parameter, the user attribute parameter is a parameter corresponding to user basic information, and the behavior attribute parameter is a parameter corresponding to network operation information;
a creating module, configured to acquire, through a pre-established inverted index that maps attribute parameters to user identifiers, the user identifiers that match the query request, and to create a target crowd packet from the acquired user identifiers;
and a sending module, configured to send the target crowd packet to the terminal.
In a third aspect, an embodiment of the present invention provides a server, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the data processing method according to the first aspect when executing the program.
In a fourth aspect, an embodiment of the present invention provides a data processing system, including a terminal and a server, configured to execute the data processing method according to the first aspect of the embodiment of the present invention.
In a fifth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the data processing method according to the first aspect of the embodiment of the present invention.
In the embodiment of the invention, a terminal can send a query request carrying user attribute parameters and behavior attribute parameters, where the user attribute parameters are parameters corresponding to user basic information and the behavior attribute parameters are parameters corresponding to network operation information. After the query request is received, the user identifiers that match it are obtained through a pre-established inverted index that maps attribute parameters to user identifiers, a target crowd packet is created, and the target crowd packet is sent to the terminal. Because the target crowd can be selected based on both kinds of parameters, the basis for crowd selection is broadened and fine-grained crowd selection can be achieved. In addition, because the target crowd is selected through the pre-established inverted index, selection efficiency and the response speed seen by the terminal are both improved.
Drawings
Fig. 1 is a schematic flow chart of a data processing method according to an embodiment of the present invention.
Fig. 2 is a flowchart illustrating a method for creating an inverted index according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a data storage method according to an embodiment of the present invention.
Fig. 4 is another schematic diagram of a data storage method according to an embodiment of the present invention.
Fig. 5 is a schematic flowchart of a data export method according to an embodiment of the present invention.
Fig. 6 is a schematic flow chart of a crowd selection method according to an embodiment of the present invention.
Fig. 7 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention.
Fig. 8 is a schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
The existing crowd selection method is only based on the basic information of the user, so that the selected crowd is not fine enough, and the problem of low selection efficiency exists. Therefore, the embodiment of the invention provides a data processing method, which can realize refined crowd selection and improve the selection efficiency.
Fig. 1 is a schematic flowchart of a data processing method according to an embodiment of the present invention. The method may be executed by a data processing apparatus according to an embodiment of the present invention, and the apparatus may be implemented in software and/or hardware. In a specific embodiment, the apparatus may be integrated in a server, and the following embodiments are described taking the apparatus integrated in the server as an example. As shown in Fig. 1, the method comprises the following steps:
Step 101, receiving a query request of a terminal, where the query request includes a user attribute parameter and a behavior attribute parameter, the user attribute parameter is a parameter corresponding to user basic information, and the behavior attribute parameter is a parameter corresponding to network operation information.
For example, when a target crowd needs to be selected as the survey and delivery audience for a new product, the relevant personnel can set tags related to the new product through a preset interface of the terminal. These tags are the user attribute parameters and behavior attribute parameters. The terminal acquires the set parameters through the preset interface, encapsulates them in a query request, and sends the request to the server, which receives it. The user attribute parameters are parameters corresponding to user basic information such as the user's gender, age, education level, occupation, marital status, and city; corresponding user attribute parameters are, for example: male, 30 years old, bachelor's degree, programmer, married, and Shenzhen. The behavior attribute parameters are parameters corresponding to network operation information, which may include operation type information and the item attribute information associated with it, such as the brand, category, name, and code of the item; corresponding behavior attribute parameters are, for example: browse/click/favorite/add to cart/purchase, and mobile phone/TV/rice cooker.
For example, if the manufacturer of a certain brand of mobile phone wants to release a new phone and needs to carry out a crowd survey, tags such as gender, age, education level, brand, category, and network operation behavior may be set on the preset interface of the terminal according to actual conditions. The terminal then sends a query request to the server according to these settings. The query request may ask for users with certain user characteristics and behavior characteristics, for example, users who have browsed that brand of mobile phone and whose gender is male.
Step 102, acquiring, through a pre-established inverted index that maps attribute parameters to user identifiers, the user identifiers that match the query request, and creating a target crowd packet from the acquired user identifiers.
In a specific implementation, the attribute parameters in the inverted index may include a user attribute parameter and a behavior attribute parameter, and the user identifier may be a PIN code of the user, a name and a code number of the user, and the like, which are used to uniquely identify the user.
For example, registration data of all users across the network may be collected, and their historical operation records may be collected according to a preset data collection period (for example, one day, one week, or ten days). The parameters corresponding to user basic information (i.e., the user attribute parameters) are extracted from the collected registration data, and the parameters corresponding to network operation information (i.e., the behavior attribute parameters) are extracted from the collected historical operation records. The user identifier, the user attribute parameters, and the behavior attribute parameters together constitute the user data.
After the user data of the network-wide users is obtained, it may be stored and a user data table established from it. For example, the table may use the user identifier as the primary key and the user attribute parameters and behavior attribute parameters as row content, so that each row stores the user data of one user. After the user data table is established, an inverted index that maps attribute parameters to user identifiers can be built from it. When a query request sent by a terminal is received, the user identifiers matching the user attribute parameters and behavior attribute parameters in the request are obtained from the inverted index, and a target crowd packet is created from them.
Step 103, sending the target crowd packet to the terminal.
The target crowd package comprises the user identification of the target crowd, and the terminal side can display the target crowd package so that related personnel can develop new product investigation according to the target crowd package.
In the embodiment of the invention, a terminal can send a query request carrying user attribute parameters and behavior attribute parameters, where the user attribute parameters are parameters corresponding to user basic information and the behavior attribute parameters are parameters corresponding to network operation information. After the query request is received, the user identifiers that match it are obtained through a pre-established inverted index that maps attribute parameters to user identifiers, a target crowd packet is created, and the target crowd packet is sent to the terminal. Because the target crowd can be selected based on both kinds of parameters, the basis for crowd selection is broadened and fine-grained crowd selection can be achieved. In addition, because the target crowd is selected through the pre-established inverted index, selection efficiency and the response speed seen by the terminal are both improved.
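Steps 101 to 103 can be illustrated with a minimal sketch. It assumes that a query matches a user only when every requested parameter maps to that user (AND semantics, as in the "browsed the brand and gender is male" example); the index contents, the `field:value` key format, and the function names are hypothetical and are not taken from the patent.

```python
# Hypothetical inverted index: attribute parameter -> set of user identifiers.
INVERTED_INDEX = {
    "gender:male": {"u1", "u2", "u4"},
    "age:30": {"u2", "u3"},
    "v15b:brandX": {"u2", "u4", "u5"},
}


def match_users(index, user_attrs, behavior_attrs):
    """Return the user identifiers mapped by every requested parameter."""
    params = list(user_attrs) + list(behavior_attrs)
    if not params:
        return set()
    # A parameter missing from the index maps to the empty set.
    postings = [index.get(p, set()) for p in params]
    return set.intersection(*postings)


def create_crowd_packet(index, query):
    """Build a target crowd packet (here: a dict holding sorted user identifiers)."""
    ids = match_users(index, query["user_attrs"], query["behavior_attrs"])
    return {"user_ids": sorted(ids)}


# Query: male users who browsed brand X in the last 15 days.
query = {"user_attrs": ["gender:male"], "behavior_attrs": ["v15b:brandX"]}
packet = create_crowd_packet(INVERTED_INDEX, query)
print(packet)  # {'user_ids': ['u2', 'u4']}
```

Because each parameter lookup is a dictionary access followed by a set intersection, the server does not need to scan the user data table for each query.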
In a specific embodiment, as shown in Fig. 2, the inverted index that maps attribute parameters to user identifiers may be established as follows:
Step 201, according to the stored user data of all users, creating a user data table that takes the user identifier as the primary key and the user attribute parameters and behavior attribute parameters as row content.
Illustratively, the stored user data of all users may include user identifiers, user attribute parameters, and behavior attribute parameters. The user identifier may be the user's PIN code, name, code number, or the like, and is used to uniquely identify the user; it may be obtained from the user's registration information or created for the user by the server. The user attribute parameters are parameters corresponding to user basic information, such as the user's gender, age, education level, occupation, marital status, and city, and can be obtained from the user's registration information. The behavior attribute parameters are parameters corresponding to network operation information, which may include operation type information and the item attribute information associated with it, such as the brand, category, name, and code of the item; they may be obtained from the user's historical operation records.
After the user data of all the users are acquired, all the user data can be stored on a plurality of preset storage nodes in a distributed storage mode, so that the data export speed is improved in the subsequent data export process. For example, the hash operation may be performed on the user identifier of each of all users to obtain a hash value of the user identifier of each user, a storage node of the user data of each user is determined according to the hash value of the user identifier of each user, and the user data of each user is stored in the corresponding storage node.
For example, the storage nodes may be numbered in advance, and a hash modulo operation may be performed on each user identifier with respect to the number of storage nodes to obtain a hash modulo value for that identifier. When the user data of a given user is stored, the storage node whose number equals the hash modulo value of that user's identifier is located, and the user data is stored on that node.
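The hash modulo placement described above can be sketched as follows. This is a minimal illustration under stated assumptions: CRC32 is used as the hash function only because it is deterministic across runs (the patent does not name a hash function), storage nodes are modeled as in-memory dictionaries numbered from 0, and the helper names are hypothetical.

```python
import zlib


def node_for_user(user_id: str, num_nodes: int) -> int:
    """Map a user identifier to a storage node number in [0, num_nodes)."""
    # CRC32 is an assumed stand-in for the unspecified hash function.
    return zlib.crc32(user_id.encode("utf-8")) % num_nodes


def store(nodes, user_id, user_data):
    """Store one user's data on the node selected by hash modulo."""
    nodes[node_for_user(user_id, len(nodes))][user_id] = user_data


# Four hypothetical storage nodes, numbered 0..3.
nodes = [dict() for _ in range(4)]
for uid in ["pin_001", "pin_002", "pin_003"]:
    store(nodes, uid, {"gender": "male"})

for number, node in enumerate(nodes):
    print(number, sorted(node))
```

The same identifier always lands on the same node, so a later read can compute the node number directly instead of searching all nodes.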
In the specific implementation, because the behavior attribute parameters of the user are acquired through the collected historical operation records of the user, and the historical operation records of the user are continuously generated along with the advance of time, in order to reduce the stored data volume and improve the data storage speed, the behavior attribute parameters of the user can be stored in an incremental storage mode. For example, for a certain user, when data is stored in the current data acquisition cycle, incremental data of the user may be determined first, where the incremental data is difference data between the behavior attribute parameter of the user acquired up to the current data acquisition cycle and the behavior attribute parameter of the user acquired up to the previous data acquisition cycle, and only the incremental data is stored in the current data acquisition cycle.
For example, suppose the preset data collection period is one day, that is, behavior attribute parameters are written to the storage nodes once a day, and the current day is the third day of writing. For a given user, the difference between the behavior attribute parameters obtained over the last three days and those obtained over the last two days is determined, and on the third day only this difference data is written to the corresponding storage node.
A specific example is shown in fig. 3, where the behavior attribute parameter (for example, the searched brand) of a certain user collected on the first day is 123, the behavior attribute parameter of the user collected on the last two days is 123, 456, and the behavior attribute parameter of the user collected on the last three days is 123, 456, 789, then 123 is written into the corresponding storage node when data is stored on the first day, only 456 is stored into the corresponding storage node when data is stored on the second day, and only 789 is stored into the corresponding storage node when data is stored on the third day. If the behavior attribute parameters of the user stored in the last three days are to be derived, the three stored data can be merged and derived. According to the above-described storage method, user information of a specific user can be stored as shown in fig. 4.
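The incremental storage in the Fig. 3 example can be sketched as below. It assumes that the behavior attribute parameters of a user are held as ordered lists of values and that "difference data" means the values present in the current cumulative list but absent from the previous one; the function names are hypothetical.

```python
def incremental_delta(cumulative_now, cumulative_prev):
    """Return the values collected up to now that were not collected before."""
    prev = set(cumulative_prev)
    return [value for value in cumulative_now if value not in prev]


def merge_deltas(deltas):
    """Merge stored increments back into the cumulative list for export."""
    merged, seen = [], set()
    for delta in deltas:
        for value in delta:
            if value not in seen:
                seen.add(value)
                merged.append(value)
    return merged


# Cumulative searched brands after day 1, day 2 and day 3 (values from Fig. 3).
day1 = ["123"]
day2 = ["123", "456"]
day3 = ["123", "456", "789"]

stored = [
    incremental_delta(day1, []),
    incremental_delta(day2, day1),
    incremental_delta(day3, day2),
]
print(stored)                # [['123'], ['456'], ['789']]
print(merge_deltas(stored))  # ['123', '456', '789']
```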
When creating a user data table from stored user data, the following method may be followed:
(1) Set the data types: the data type of the user attribute parameters is the exact-value type, and the data type of the behavior attribute parameters is the full-text type.
Data of the exact-value type does not need to be segmented into words when the inverted index is built, whereas data of the full-text type does.
(2) According to the set data types, create a user data table that takes the user identifier as the primary key and the user attribute parameters and behavior attribute parameters as row content.
That is, each row of the user data table stores the user data of one user.
In addition, when creating the user data table, fields included in the table may be set according to specific data included in the user data, for example, fields included in the table may be set according to a preset field naming rule.
Preset field naming rules are, for example:
Operation type: s denotes search, v denotes browse, b denotes purchase, and f denotes focus (follow).
Time (days): 15, 30, 60, 90, 120, 150, 180.
Item attribute: c denotes category, and b denotes brand.
Fields corresponding to the behavior attribute parameters, such as s15b, b15c, and v20b, may be set in the user data table according to the above naming rules. Here s15b indicates the brands searched for in the last 15 days, and the names of those brands may be stored in the s15b field; b15c indicates the categories purchased in the last 15 days, and the names of those categories may be stored in the b15c field; v20b indicates the brands browsed in the last 20 days, and the names of those brands may be stored in the v20b field. In addition, fields corresponding to the user attribute parameters, such as gender, age, and education level, may also be set in the user data table.
In a specific implementation, the user data table may be created in an Elasticsearch (ES) index library from the stored user data of all users. Creating the user data table is then equivalent to creating an ES index, and the fields included in the table and their data types can be set by creating a mapping file. The creation code of the mapping file may be as follows:
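The field naming rule can be made concrete with a small parser. This is a sketch under assumptions: the field name is taken to be operation code, then number of days, then item attribute code, and the number of days is not restricted to the listed windows because the text itself uses v20b. The function names are hypothetical.

```python
import re

# Codes taken from the naming rules in the text.
OPERATIONS = {"s": "search", "v": "browse", "b": "purchase", "f": "focus"}
ITEM_ATTRS = {"c": "category", "b": "brand"}

FIELD_RE = re.compile(r"^([svbf])(\d+)([cb])$")


def parse_field(name):
    """Split a field name such as 's15b' into (operation, days, item attribute)."""
    match = FIELD_RE.match(name)
    if match is None:
        raise ValueError(f"not a behavior field name: {name!r}")
    op_code, days, attr_code = match.groups()
    return OPERATIONS[op_code], int(days), ITEM_ATTRS[attr_code]


def build_field(op_code, days, attr_code):
    """Compose a field name from its parts, validating the codes."""
    if op_code not in OPERATIONS or attr_code not in ITEM_ATTRS:
        raise ValueError("unknown operation or item attribute code")
    return f"{op_code}{days}{attr_code}"


print(parse_field("s15b"))        # ('search', 15, 'brand')
print(parse_field("b15c"))        # ('purchase', 15, 'category')
print(build_field("v", 20, "b"))  # v20b
```

Note that the first letter is always the operation and the last letter is always the item attribute, so the letter b means "purchase" at the start of a name and "brand" at the end.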
[Figures BDA0002786273040000091 and BDA0002786273040000101: mapping file creation code]
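Since the original listing is not available as text, the following is only an illustrative sketch of what such a mapping might look like, not the authors' code. It assumes that the exact-value type corresponds to the Elasticsearch `keyword` field type and the full-text type to the `text` field type, and it uses the typeless mapping layout of Elasticsearch 7 and later; the field lists and the `build_mapping` helper are hypothetical. The mapping is built as a plain Python dictionary and printed as JSON, with no call to a cluster.

```python
import json

# Hypothetical field lists; the names follow the examples given in the text.
USER_ATTR_FIELDS = ["gender", "age", "education"]
BEHAVIOR_FIELDS = ["s15b", "b15c", "v20b"]


def build_mapping(user_fields, behavior_fields):
    """Build an index-creation body with one typed property per field."""
    # Assumption: exact-value fields are not analyzed ("keyword").
    properties = {field: {"type": "keyword"} for field in user_fields}
    # Assumption: full-text fields are analyzed into terms ("text").
    properties.update({field: {"type": "text"} for field in behavior_fields})
    return {"mappings": {"properties": properties}}


mapping = build_mapping(USER_ATTR_FIELDS, BEHAVIOR_FIELDS)
print(json.dumps(mapping, indent=2))
```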
Step 202, establishing, according to the user data table, the inverted index that maps attribute parameters to user identifiers.
Specifically, word segmentation can be performed on the behavior attribute parameters in the user data table, and an attribute parameter set is obtained according to word segmentation results and the user attribute parameters; and aiming at any attribute parameter in the attribute parameter set, searching a row containing the attribute parameter in a user data table, adding a user identifier corresponding to the searched row into a user identifier set mapped by the attribute parameter, and traversing each attribute parameter in the attribute parameter set to obtain an inverted index of the attribute parameter mapped user identifier.
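The index construction can be sketched as below. The text describes collecting the attribute parameter set first and then searching the table for each parameter; this sketch makes a single pass over the rows instead, which yields the same mapping. Whitespace splitting stands in for the unspecified word segmenter, index keys are (field, value) pairs, and the table contents and names are hypothetical.

```python
from collections import defaultdict

USER_ATTR_FIELDS = ("gender", "age")  # exact-value fields: indexed whole
BEHAVIOR_FIELDS = ("s15b",)           # full-text fields: segmented first


def tokenize(text):
    """Stand-in word segmenter: lowercase and split on whitespace."""
    return text.lower().split()


def build_inverted_index(table):
    """Map each (field, attribute parameter) pair to the set of user identifiers."""
    index = defaultdict(set)
    for user_id, row in table.items():
        for field in USER_ATTR_FIELDS:
            if field in row:
                index[(field, str(row[field]))].add(user_id)
        for field in BEHAVIOR_FIELDS:
            for token in tokenize(row.get(field, "")):
                index[(field, token)].add(user_id)
    return dict(index)


# Hypothetical user data table keyed by user identifier (the primary key).
table = {
    "u1": {"gender": "male", "age": 30, "s15b": "brandX brandY"},
    "u2": {"gender": "female", "age": 30, "s15b": "brandX"},
}
index = build_inverted_index(table)
print(sorted(index[("s15b", "brandx")]))  # ['u1', 'u2']
print(sorted(index[("gender", "male")]))  # ['u1']
```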
In a specific embodiment, the data of the target crowd packet may further be exported as needed. The export method may be as shown in Fig. 5 and includes the following steps:
Step 301, receiving a data packet export request sent by a terminal for a target crowd packet.
Step 302, performing hash operation on each user identifier included in the target crowd packet to obtain a hash value of each user identifier.
Step 303, dividing the user identifiers included in the target crowd packet into N groups according to the hash value of each user identifier, where N is an integer greater than 1.
For example, hash modulo operation may be performed on each user identifier included in the target crowd packet according to the number N of the called threads to obtain a hash modulo value of each user identifier, and the user identifiers included in the target crowd packet may be divided into N groups according to the hash modulo value of each user identifier.
Step 304, invoking N threads to export the user data of the users identified by the user identifiers in the N groups.
Since the target crowd packet usually involves exporting a large amount of user data, the data exporting efficiency can be improved by exporting the data packet of the target crowd through a plurality of threads in parallel.
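Steps 302 to 304 can be sketched with a thread pool. This is a minimal illustration under assumptions: CRC32 stands in for the unspecified hash function, the user data lives in an in-memory dictionary rather than on storage nodes, and the function names are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def group_by_hash(user_ids, n):
    """Divide user identifiers into n groups by hash modulo n."""
    groups = [[] for _ in range(n)]
    for uid in user_ids:
        groups[zlib.crc32(uid.encode("utf-8")) % n].append(uid)
    return groups


def export_group(group, user_table):
    """Export the user data of every user identifier in one group."""
    return [(uid, user_table[uid]) for uid in group]


def export_packet(user_ids, user_table, n=4):
    """Export a crowd packet with n worker threads, one group per thread."""
    groups = group_by_hash(user_ids, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        parts = pool.map(export_group, groups, [user_table] * n)
        return [record for part in parts for record in part]


# Hypothetical user data and target crowd packet.
user_table = {f"u{i}": {"age": 20 + i} for i in range(10)}
packet_ids = ["u1", "u3", "u5", "u7", "u9"]
exported = export_packet(packet_ids, user_table, n=3)
print(sorted(uid for uid, _ in exported))  # ['u1', 'u3', 'u5', 'u7', 'u9']
```

Grouping by hash gives each thread a disjoint share of the identifiers, so no coordination between threads is needed while exporting.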
Further, the user data of the users identified by the user identifiers in the N groups may be exported by means of scrolling (rolling) queries, so as to further improve data export efficiency.
For example, a query-size threshold may be set. Each query returns at most that threshold number of records, together with a marker identifying the position at which the next query should start; the next query then continues from that position.
In a specific embodiment, users may be further selected from a specified crowd packet as required. The selection method may be as shown in Fig. 6 and includes the following steps:
Step 401, receiving a crowd selection request sent by a terminal for a candidate crowd packet.
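The paging behavior described above can be sketched as follows. It is a simplified in-memory model, assuming the "next query location" is an integer offset into an ordered list of records; a real search engine would typically return an opaque cursor instead. The function names are hypothetical.

```python
def query_page(records, start, page_size):
    """Return up to page_size records from start, plus the next start position.

    The next position is None once the last record has been returned.
    """
    page = records[start:start + page_size]
    next_pos = start + len(page)
    return page, (next_pos if next_pos < len(records) else None)


def export_all(records, page_size):
    """Export every record by repeatedly continuing from the returned position."""
    exported = []
    position = 0
    while position is not None:
        page, position = query_page(records, position, page_size)
        exported.extend(page)
    return exported


records = [f"user_{i}" for i in range(7)]
print(query_page(records, 0, 3))  # (['user_0', 'user_1', 'user_2'], 3)
print(query_page(records, 6, 3))  # (['user_6'], None)
print(export_all(records, 3) == records)  # True
```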
The candidate crowd package may include user identifications of the candidate users, and there may be a plurality of candidate users.
Step 402, selecting user identifiers from the candidate crowd packet according to the inverted index and a preset rule to obtain a selected crowd packet.
The preset rule is a crowd selection rule, which may involve a plurality of attribute parameters, such as a first attribute parameter, a second attribute parameter, and a third attribute parameter that are different from one another.
For example, according to the inverted index, a first user identifier set mapped by the first attribute parameter, a second user identifier set mapped by the second attribute parameter, and a third user identifier set mapped by the third attribute parameter may be selected from the user identifiers of the candidate crowd packet. The union of the first and second user identifier sets is calculated, the intersection of that union with the third user identifier set is calculated, and the intersection is removed from the union to obtain the selected crowd packet.
For example, if the crowd selection request is to select, from the candidate crowd packet, the users with a potential demand for an item (for example, brand X), the first and second attribute parameters may be any two of browsing, searching, favoriting, and following brand X, and the third attribute parameter may be purchasing brand X. The users in the first and second user identifier sets are then candidate users who have browsed, searched, favorited, or followed brand X, the users in the third user identifier set are candidate users who have purchased brand X, and the users in the selected crowd packet are those with a potential demand for brand X.
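The set operations in this selection rule can be sketched directly with Python sets. The index contents, the `operation:brand` key format, and the function name are hypothetical; the logic follows the rule as stated: take the union of the first two sets, intersect it with the third, and remove that intersection from the union.

```python
def select_potential_users(candidates, index, first, second, third):
    """Apply the rule (S1 | S2) - ((S1 | S2) & S3) within the candidate packet."""
    pool = set(candidates)
    s1 = index.get(first, set()) & pool   # first user identifier set
    s2 = index.get(second, set()) & pool  # second user identifier set
    s3 = index.get(third, set()) & pool   # third user identifier set
    union = s1 | s2
    return union - (union & s3)


# Hypothetical inverted index entries for brand X.
index = {
    "browse:X": {"u1", "u2", "u9"},
    "search:X": {"u2", "u3"},
    "purchase:X": {"u2", "u4"},
}
candidates = {"u1", "u2", "u3", "u4"}

selected = select_potential_users(
    candidates, index, "browse:X", "search:X", "purchase:X"
)
print(sorted(selected))  # ['u1', 'u3']
```

In this example u2 is dropped because that user already purchased brand X, u4 is dropped because that user neither browsed nor searched, and u9 is ignored because it is not in the candidate crowd packet.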
Fig. 7 is a block diagram of a data processing apparatus according to an embodiment of the present invention, and as shown in fig. 7, the apparatus includes:
a receiving module 501, configured to receive a query request of a terminal, where the query request includes a user attribute parameter and a behavior attribute parameter, where the user attribute parameter is a parameter corresponding to user basic information, and the behavior attribute parameter is a parameter corresponding to network operation information;
a creating module 502, configured to acquire, through a pre-established inverted index that maps attribute parameters to user identifiers, the user identifiers that match the query request, and to create a target crowd packet from the acquired user identifiers;
a sending module 503, configured to send the target crowd packet to the terminal.
In one embodiment, the apparatus is further configured to establish the inverted index by:
creating, according to the stored user data of all users, a user data table that takes the user identifier as the primary key and the user attribute parameters and behavior attribute parameters as row content, wherein the user data of each user comprises the user identifier, user attribute parameters, and behavior attribute parameters of that user;
and establishing, according to the user data table, the inverted index that maps attribute parameters to user identifiers.
In one embodiment, the apparatus is further configured to store the user data of all users as follows:
performing hash operation on the user identifier of each user in all the users to obtain a hash value of the user identifier of each user;
determining a storage node of user data of each user according to the hash value of the user identifier of each user;
and storing the user data of each user on the corresponding storage node.
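The hash-based placement described above can be sketched in Python as follows. The node count, the use of MD5 as the hash function, the modulo mapping, and the names `storage_node_for` and `store_user_data` are all assumptions of this sketch; the embodiment only states that the storage node is determined from the hash value of the user identifier.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List

NUM_STORAGE_NODES = 4  # assumed cluster size, for illustration only


def storage_node_for(user_id: str, num_nodes: int = NUM_STORAGE_NODES) -> int:
    """Map a user identifier to a storage node index via a stable hash."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest[:8], "big")
    # Modulo placement is an assumption; other schemes (e.g. consistent hashing) would also fit.
    return hash_value % num_nodes


def store_user_data(
    users: List[dict], num_nodes: int = NUM_STORAGE_NODES
) -> Dict[int, List[dict]]:
    """Group user records by the storage node chosen for their identifier."""
    nodes: Dict[int, List[dict]] = defaultdict(list)
    for user in users:
        nodes[storage_node_for(user["user_id"], num_nodes)].append(user)
    return dict(nodes)


users = [{"user_id": f"u{i}", "gender": "female"} for i in range(8)]
placement = store_user_data(users)
for node, records in sorted(placement.items()):
    print(node, [r["user_id"] for r in records])
```

The sketch uses `hashlib` rather than Python's built-in `hash()` because the built-in string hash is randomized per process, which would send the same user to different nodes across restarts.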
In an embodiment, the apparatus is further configured to store the behavior attribute parameter of any one of the users as follows:
determining incremental data of the user, wherein the incremental data is the difference between the behavior attribute parameter of the user acquired in the current data acquisition cycle and the behavior attribute parameter of the user acquired in the previous data acquisition cycle;
and storing the incremental data of the user for the current data acquisition cycle.
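A minimal Python sketch of this incremental storage is given below. It assumes that a user's behavior attribute parameter can be modelled as a set of behavior records and that only newly added records count as incremental data; the record format, the cycle labels, and the names `incremental_data` and `store_increment` are hypothetical.

```python
from typing import Dict, Set

# Hypothetical store: user identifier -> acquisition cycle -> incremental records.
IncrementStore = Dict[str, Dict[str, Set[str]]]


def incremental_data(previous: Set[str], current: Set[str]) -> Set[str]:
    """Return the records present in the current cycle but not in the previous one."""
    return current - previous


def store_increment(
    store: IncrementStore,
    user_id: str,
    cycle: str,
    previous: Set[str],
    current: Set[str],
) -> Set[str]:
    """Store only the difference for this cycle instead of the full snapshot."""
    delta = incremental_data(previous, current)
    store.setdefault(user_id, {})[cycle] = delta
    return delta


store: IncrementStore = {}
previous_cycle = {"browse brand_x", "search brand_y"}
current_cycle = {"browse brand_x", "search brand_y", "purchase brand_x"}
delta = store_increment(store, "u1", "cycle-2", previous_cycle, current_cycle)
print(delta)
```

Storing only the difference keeps the per-cycle write volume proportional to what changed, at the cost of having to combine cycles when a full snapshot is needed.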
In one embodiment, the apparatus is further configured to create the user data table as follows:
setting data types, wherein the data type of the user attribute parameter is an exact value type, and the data type of the behavior attribute parameter is a full text type;
and creating, according to the set data types, the user data table that takes the user identifier as the primary key and the user attribute parameters and behavior attribute parameters as row contents.
In an embodiment, the apparatus is further configured to establish the inverted index that maps attribute parameters to user identifiers as follows:
performing word segmentation on the behavior attribute parameters in the user data table, and obtaining an attribute parameter set according to the word segmentation results and the user attribute parameters;
and for any attribute parameter in the attribute parameter set, searching the user data table for rows containing that attribute parameter, adding the user identifiers corresponding to the found rows to the user identifier set mapped by that attribute parameter, and traversing each attribute parameter in the attribute parameter set to obtain the inverted index that maps attribute parameters to user identifiers.
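The index construction described above can be illustrated with the following Python sketch. The table contents, the `field:value` term format for exact-value user attributes, and the whitespace-based `segment` function (a stand-in for a real word segmenter) are assumptions. The sketch also builds the index in a single pass over the rows rather than first collecting the attribute parameter set and then searching the table for each parameter; both orders yield the same mapping.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical user data table keyed by user identifier (the primary key).
USER_DATA_TABLE = {
    "u1": {
        "user_attrs": {"gender": "female", "city": "beijing"},
        "behavior": "browse brand_x search brand_y",
    },
    "u2": {
        "user_attrs": {"gender": "male", "city": "beijing"},
        "behavior": "purchase brand_x",
    },
    "u3": {
        "user_attrs": {"gender": "female", "city": "shanghai"},
        "behavior": "browse brand_y",
    },
}


def segment(text: str) -> List[str]:
    """Stand-in word segmenter: lowercases and splits on whitespace."""
    return text.lower().split()


def build_inverted_index(table: Dict[str, dict]) -> Dict[str, Set[str]]:
    """Build a mapping from attribute parameter to the set of user identifiers."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for user_id, row in table.items():
        # Exact value type: each user attribute is indexed whole, without segmentation.
        for field, value in row["user_attrs"].items():
            index[f"{field}:{value}"].add(user_id)
        # Full text type: the behavior text is segmented and each token is indexed.
        for token in segment(row["behavior"]):
            index[token].add(user_id)
    return dict(index)


index = build_inverted_index(USER_DATA_TABLE)
print(sorted(index["brand_x"]))
```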
In one embodiment, the apparatus is further configured to:
receiving a data packet export request sent by the terminal for the target crowd packet;
performing hash operation on each user identifier contained in the target crowd packet to obtain a hash value of each user identifier;
dividing the user identifications contained in the target crowd packet into N groups according to the hash value of each user identification, wherein N is an integer greater than 1;
and invoking N threads to export the user data of the user identified by each user identifier in the N groups.
In one embodiment, the apparatus is further configured to export the user data of the user identified by each user identifier in the N groups as follows:
invoking the N threads to export, in a rolling query mode, the user data of the user identified by each user identifier in the N groups.
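The export flow above (hash the identifiers, split them into N groups, and let N threads export each group in batches) might look like the following Python sketch. The in-memory `USER_STORE`, the batch size, the MD5-based hash, and all function names are assumptions; in particular, `scroll` merely imitates a rolling query by paging through a group in fixed-size batches and does not represent any specific storage engine's API.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterator, List

# Hypothetical user data store keyed by user identifier.
USER_STORE: Dict[str, dict] = {
    f"u{i}": {"user_id": f"u{i}", "city": "beijing"} for i in range(10)
}


def stable_hash(user_id: str) -> int:
    """Process-independent hash of a user identifier."""
    return int.from_bytes(hashlib.md5(user_id.encode("utf-8")).digest()[:8], "big")


def split_into_groups(user_ids: List[str], n: int) -> List[List[str]]:
    """Divide user identifiers into n groups according to their hash values."""
    groups: List[List[str]] = [[] for _ in range(n)]
    for user_id in user_ids:
        groups[stable_hash(user_id) % n].append(user_id)
    return groups


def scroll(
    user_ids: List[str], store: Dict[str, dict], batch_size: int
) -> Iterator[List[dict]]:
    """Yield user data batch by batch, imitating a rolling query."""
    for start in range(0, len(user_ids), batch_size):
        batch = user_ids[start:start + batch_size]
        yield [store[uid] for uid in batch if uid in store]


def export_group(
    user_ids: List[str], store: Dict[str, dict], batch_size: int = 2
) -> List[dict]:
    """Export one group by consuming its batches in order."""
    exported: List[dict] = []
    for batch in scroll(user_ids, store, batch_size):
        exported.extend(batch)
    return exported


def export_crowd_packet(
    packet: List[str], store: Dict[str, dict], n: int = 3
) -> List[dict]:
    """Export the user data of a crowd packet using n worker threads."""
    if n <= 1:
        raise ValueError("N must be an integer greater than 1")
    groups = split_into_groups(packet, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(lambda group: export_group(group, store), groups))
    return [row for group_rows in results for row in group_rows]


exported = export_crowd_packet(list(USER_STORE), USER_STORE, n=3)
print(len(exported))
```

Grouping by hash means each identifier belongs to exactly one group, so the threads never export the same user twice and need no coordination beyond collecting their results.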
In one embodiment, the apparatus is further configured to:
receiving a crowd selection request sent by the terminal for the candidate crowd packet;
and selecting the user identification in the candidate crowd packet according to the inverted index and a preset rule to obtain a selected crowd packet.
In one embodiment, the apparatus is further configured to obtain the selected crowd packet by:
according to the inverted index, selecting a first user identifier set mapped by a first attribute parameter, a second user identifier set mapped by a second attribute parameter and a third user identifier set mapped by a third attribute parameter from the user identifiers of the candidate crowd packet, wherein the first attribute parameter, the second attribute parameter and the third attribute parameter are different from each other;
and calculating a union of the first user identification set and the second user identification set, calculating an intersection of the union and the third user identification set, and removing the intersection from the union to obtain the selected crowd packet.
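The set operations in this embodiment can be sketched in Python as follows. The index contents, the attribute parameter names, and the function name `select_crowd` are hypothetical and chosen to mirror the brand X example; restricting each mapped set to the candidate crowd packet by intersection is one assumed way of "selecting from the user identifiers of the candidate crowd packet".

```python
from typing import Dict, Set


def select_crowd(
    candidate: Set[str],
    index: Dict[str, Set[str]],
    first: str,
    second: str,
    third: str,
) -> Set[str]:
    """Compute (first ∪ second) minus ((first ∪ second) ∩ third) within the candidate packet."""
    if len({first, second, third}) != 3:
        raise ValueError("the three attribute parameters must differ from each other")
    # Restrict each mapped identifier set to the candidate crowd packet.
    first_set = index.get(first, set()) & candidate
    second_set = index.get(second, set()) & candidate
    third_set = index.get(third, set()) & candidate
    union = first_set | second_set
    intersection = union & third_set
    return union - intersection


# Hypothetical inverted index entries for brand X.
index = {
    "browse_brand_x": {"u1", "u2", "u6"},
    "search_brand_x": {"u2", "u3"},
    "purchase_brand_x": {"u2", "u4"},
}
candidate_packet = {"u1", "u2", "u3", "u4", "u5"}

selected = select_crowd(
    candidate_packet, index, "browse_brand_x", "search_brand_x", "purchase_brand_x"
)
print(sorted(selected))
```

In this example, u2 is removed because it also appears in the purchase set, u6 is ignored because it is not in the candidate packet, and u4 and u5 never enter the union, leaving u1 and u3 as the selected crowd packet.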
It is obvious to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be performed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to perform all or part of the above described functions. For the specific working process of the functional module, reference may be made to the corresponding process in the foregoing method embodiment, which is not described herein again.
The apparatus of the embodiment of the present invention supports a terminal sending a query request that carries a user attribute parameter and a behavior attribute parameter, where the user attribute parameter is a parameter corresponding to user basic information and the behavior attribute parameter is a parameter corresponding to network operation information. After receiving the query request of the terminal, the apparatus obtains the user identifiers matching the query request through the pre-established inverted index that maps attribute parameters to user identifiers, creates the target crowd packet, and sends the target crowd packet to the terminal. Because the target crowd can be selected based on both the parameters corresponding to user basic information and the parameters corresponding to network operation information, the basis for crowd selection is expanded and fine-grained crowd selection can be achieved. In addition, because the target crowd is selected based on the pre-established inverted index, selection efficiency is improved, which in turn improves the response speed on the terminal side.
The embodiment of the present invention further provides a data processing system, which includes a terminal and a server, where the server is configured to execute the data processing method described in the foregoing embodiment, and the specific processing and interaction process may refer to the description of the foregoing embodiment, and is not described herein again.
Fig. 8 is a block diagram of an exemplary server 912 suitable for implementing an embodiment of the present invention. The server 912 shown in fig. 8 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present invention.
As shown in FIG. 8, the server 912 is implemented as a general purpose computing device. Components of server 912 may include, but are not limited to: one or more processors or processing units 916, a system memory 928, and a bus 918 that couples the various system components (including the system memory 928 and the processing unit 916).
Bus 918 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
The server 912 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by server 912 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 928 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 930 and/or cache memory 932. The server 912 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 934 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 8, and typically referred to as a "hard disk drive"). Although not shown in FIG. 8, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to the bus 918 through one or more data media interfaces. The system memory 928 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of embodiments of the invention.
A program/utility 940 having a set (at least one) of program modules 942, which may include, but is not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of these examples possibly including an implementation of a network environment, may be stored in, for example, the memory 928. The program modules 942 generally perform the functions and/or methodologies of the described embodiments of the invention.
The server 912 may also communicate with one or more external devices 914 (e.g., keyboard, pointing device, display 924, etc.), with one or more devices that enable a user to interact with the server 912, and/or with any devices (e.g., network card, modem, etc.) that enable the server 912 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 922. Also, the server 912 can communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the internet) through network adapter 920. As shown, the network adapter 920 communicates with the other modules of the server 912 via the bus 918. It should be appreciated that although not shown in FIG. 8, other hardware and/or software modules may be used in conjunction with the server 912, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 916 executes various functional applications and data processing by executing programs stored in the system memory 928, for example, implementing the data processing method provided by an embodiment of the present invention:
that is, the processing unit 916 implements, when executing the program: receiving a query request of a terminal, wherein the query request comprises a user attribute parameter and a behavior attribute parameter, the user attribute parameter is a parameter corresponding to user basic information, and the behavior attribute parameter is a parameter corresponding to network operation information; obtaining, through a pre-established inverted index that maps attribute parameters to user identifiers, the user identifiers matching the query request, and creating a target crowd packet according to the obtained user identifiers; and sending the target crowd packet to the terminal.
Of course, it will be understood by those skilled in the art that the processing unit may also implement the data processing method provided by any embodiment of the present invention.
Embodiments of the present invention further provide a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the data processing method provided by any embodiment of the present invention:
that is, the program when executed by the processor implements: receiving a query request of a terminal, wherein the query request comprises a user attribute parameter and a behavior attribute parameter, the user attribute parameter is a parameter corresponding to user basic information, and the behavior attribute parameter is a parameter corresponding to network operation information; obtaining, through a pre-established inverted index that maps attribute parameters to user identifiers, the user identifiers matching the query request, and creating a target crowd packet according to the obtained user identifiers; and sending the target crowd packet to the terminal.
Of course, the computer program stored on the computer-readable storage medium provided by the embodiment of the present invention is not limited to the method operations described above, and may also perform related operations of the data processing method provided by the embodiment of the present invention.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (14)

1. A data processing method, comprising:
receiving a query request of a terminal, wherein the query request comprises a user attribute parameter and a behavior attribute parameter, the user attribute parameter is a parameter corresponding to user basic information, and the behavior attribute parameter is a parameter corresponding to network operation information;
obtaining, through a pre-established inverted index that maps attribute parameters to user identifiers, the user identifiers matching the query request, and creating a target crowd packet according to the obtained user identifiers;
and sending the target crowd packet to the terminal.
2. The data processing method of claim 1, wherein the inverted index is established by:
creating, according to the stored user data of all users, a user data table that takes the user identifier as the primary key and the user attribute parameters and behavior attribute parameters as row contents, wherein the user data of each user comprises the user identifier, the user attribute parameters and the behavior attribute parameters of that user;
and establishing, according to the user data table, the inverted index that maps attribute parameters to user identifiers.
3. The data processing method according to claim 2, wherein the user data of all users is stored as follows:
performing hash operation on the user identifier of each user in all the users to obtain a hash value of the user identifier of each user;
determining a storage node of user data of each user according to the hash value of the user identifier of each user;
and storing the user data of each user on the corresponding storage node.
4. The data processing method according to claim 2, wherein the behavior attribute parameter of any one of the users is stored as follows:
determining incremental data of the user, wherein the incremental data is the difference between the behavior attribute parameter of the user acquired in the current data acquisition cycle and the behavior attribute parameter of the user acquired in the previous data acquisition cycle;
and storing the incremental data of the user for the current data acquisition cycle.
5. The data processing method of claim 2, wherein the creating a user data table with user identification as a primary key and user attribute parameters and behavior attribute parameters as row contents comprises:
setting data types, wherein the data type of the user attribute parameter is an exact value type, and the data type of the behavior attribute parameter is a full text type;
and creating, according to the set data types, the user data table that takes the user identifier as the primary key and the user attribute parameters and behavior attribute parameters as row contents.
6. The data processing method according to claim 5, wherein said establishing, according to the user data table, the inverted index that maps attribute parameters to user identifiers comprises:
performing word segmentation on the behavior attribute parameters in the user data table, and obtaining an attribute parameter set according to the word segmentation results and the user attribute parameters in the user data table;
and for any attribute parameter in the attribute parameter set, searching the user data table for rows containing that attribute parameter, adding the user identifiers corresponding to the found rows to the user identifier set mapped by that attribute parameter, and traversing each attribute parameter in the attribute parameter set to obtain the inverted index that maps attribute parameters to user identifiers.
7. The data processing method of claim 1, wherein the method further comprises:
receiving a data packet export request sent by the terminal for the target crowd packet;
performing hash operation on each user identifier contained in the target crowd packet to obtain a hash value of each user identifier;
dividing the user identifications contained in the target crowd packet into N groups according to the hash value of each user identification, wherein N is an integer greater than 1;
and invoking N threads to export the user data of the user identified by each user identifier in the N groups.
8. The data processing method of claim 7, wherein said invoking N threads to export the user data of the user identified by each user identifier in the N groups comprises:
invoking the N threads to query user data in a rolling query mode, and exporting the user data of the user identified by each user identifier in the N groups.
9. The data processing method of claim 1, wherein the method further comprises:
receiving a crowd selection request sent by the terminal for the candidate crowd packet;
and selecting the user identification in the candidate crowd packet according to the inverted index and a preset rule to obtain a selected crowd packet.
10. The data processing method of claim 9, wherein said selecting the user identification in the candidate crowd packet according to the inverted index and a preset rule to obtain a selected crowd packet comprises:
according to the inverted index, selecting a first user identifier set mapped by a first attribute parameter, a second user identifier set mapped by a second attribute parameter and a third user identifier set mapped by a third attribute parameter from the user identifiers of the candidate crowd packet, wherein the first attribute parameter, the second attribute parameter and the third attribute parameter are different from each other;
and calculating a union of the first user identification set and the second user identification set, calculating an intersection of the union and the third user identification set, and removing the intersection from the union to obtain the selected crowd packet.
11. A data processing apparatus, comprising:
a receiving module, configured to receive a query request of a terminal, wherein the query request comprises a user attribute parameter and a behavior attribute parameter, the user attribute parameter is a parameter corresponding to user basic information, and the behavior attribute parameter is a parameter corresponding to network operation information;
a creating module, configured to obtain, through a pre-established inverted index that maps attribute parameters to user identifiers, the user identifiers matching the query request, and to create a target crowd packet according to the obtained user identifiers;
and the sending module is used for sending the target crowd packet to the terminal.
12. A server comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the data processing method according to any one of claims 1 to 10 when executing the program.
13. A data processing system comprising a terminal and a server for performing the data processing method of any one of claims 1 to 10.
14. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the data processing method of any one of claims 1 to 10.
CN202011298991.8A 2020-11-18 2020-11-18 Data processing method, device, server, system and storage medium Pending CN113761102A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011298991.8A CN113761102A (en) 2020-11-18 2020-11-18 Data processing method, device, server, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011298991.8A CN113761102A (en) 2020-11-18 2020-11-18 Data processing method, device, server, system and storage medium

Publications (1)

Publication Number Publication Date
CN113761102A true CN113761102A (en) 2021-12-07

Family

ID=78786145

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011298991.8A Pending CN113761102A (en) 2020-11-18 2020-11-18 Data processing method, device, server, system and storage medium

Country Status (1)

Country Link
CN (1) CN113761102A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113722574A (en) * 2020-05-26 2021-11-30 武汉瓯越网视有限公司 User information determination method, device, equipment and storage medium based on android



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination