CN111177568B - Object pushing method based on multi-source data, electronic device and storage medium - Google Patents

Info

Publication number
CN111177568B
CN111177568B
Authority
CN
China
Prior art keywords
data
neural network
network model
user
attribute data
Prior art date
Legal status
Active
Application number
CN202010002173.2A
Other languages
Chinese (zh)
Other versions
CN111177568A (en)
Inventor
喻宁
陈克炎
朱艳乔
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202010002173.2A
Publication of CN111177568A
Priority to PCT/CN2020/098977
Application granted
Publication of CN111177568B
Active legal status (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Abstract

The invention relates to data pushing technology and provides an object pushing method based on multi-source data, an electronic device and a storage medium. The method preprocesses acquired object attribute data to obtain a training sample set and builds a neural network model; a value range for the number of hidden layer nodes is calculated by a first calculation rule, the learning error of the neural network model at each value in that range is calculated based on the training sample set and a second calculation rule, and the neural network model with the minimum learning error serves as the object grade recognition model. In response to a user's request to push objects, objects within a first distance range of the user's position information are acquired, graded and pushed; if no confirmation from the user is received within a preset duration, objects within a second distance range are acquired, graded and pushed. The method and the device improve the accuracy of object grade recognition and thereby achieve accurate pushing of the objects the user requires.

Description

Object pushing method based on multi-source data, electronic device and storage medium
Technical Field
The present invention relates to the field of data pushing, and in particular, to an object pushing method based on multi-source data, an electronic device, and a storage medium.
Background
At present, identification and pushing of objects generally compare various factors of the object to be identified with a reference object to determine whether to push the object to a user. However, the reference object is often chosen subjectively, so the identification accuracy of this approach is low and the pushed results deviate considerably from the user's actual demand. Although technical solutions for automatic identification and pushing have appeared on the market, they usually rely on some prediction algorithm and suffer from low accuracy, insufficient flexibility, or high demands on system performance.
Disclosure of Invention
In view of the above, the present invention provides an object pushing method based on multi-source data, an electronic device and a storage medium, and aims to solve the problem of low accuracy of pushed objects due to inaccurate recognition in the prior art.
In order to achieve the above object, the present invention provides an object pushing method based on multi-source data, including:
an acquisition step: acquiring attribute data of a preset number of objects from a plurality of preset data sources, preprocessing the attribute data, and taking the preprocessed attribute data as a training sample set;
the construction steps are as follows: establishing a neural network model and setting initial parameters of the neural network model, calculating a value range of the number of hidden layer nodes of the neural network model by using a first calculation rule, respectively calculating learning errors of the neural network model when the number of the hidden layer nodes is equal to each value in the value range according to a second calculation rule and the training sample set, and taking the neural network model determined when the value of the learning error is minimum as an object level identification model;
a first pushing step: responding to an object pushing request sent by a user, acquiring position information of the user, searching a first object set in a first distance range corresponding to the position information, reading attribute data of the first object set from a database, inputting the attribute data into an object grade identification model, obtaining grade information of each object in the first object set, recommending the grade information to the user and generating a timestamp; and
a second pushing step: whether confirmation information sent by the user is received within a preset time length is detected based on the timestamp, if the confirmation information is not received, a second object set in a second distance range corresponding to the position information is searched, the first object set is screened out from the second object set to obtain a third object set, attribute data of the third object set is read from the database and input into the object grade identification model, and grade information of all objects in the third object set is obtained and then pushed to the user.
Preferably, the preprocessing includes a data cleaning process and an aggregation process, and the aggregation process includes:
aggregating the repeated data in the attribute data to obtain aggregated data after the repeated data is removed;
determining the number of attribute data before aggregation processing to be A, determining the number of data in the aggregated data to be B, and establishing an index vector according to A and B, wherein the length of the index vector is A, and the value range of the index vector is the integers in [-B, -1] ∪ [1, B];
and randomly reading the value of the index vector, acquiring corresponding data from the aggregated data according to the value, and taking the acquired data as the preprocessed attribute data.
Preferably, the first calculation rule includes:
K = Σ_{i=0}^{n} C(n_1, i)
wherein K represents the number of samples in the training sample set, n_1 represents the number of hidden layer nodes of the neural network model, n represents the number of input layer nodes of the neural network model, and i is an integer in [0, n].
Preferably, the second calculation rule includes:
loss(θ) = (y_i - η(X_i, w, β))^2
wherein loss(θ) represents the error between the actual output and the target output of the output layer of the neural network model, y_i is the target output of the i-th sample, η(X_i, w, β) represents the actual output of the i-th sample, X_i represents the input values of the input layer nodes of the neural network model, w and β are weights, and η represents the learning rate.
Preferably, the value of the second distance is greater than the value of the first distance.
To achieve the above object, the present invention also provides an electronic device, which includes a memory and a processor, wherein the memory stores an object pushing program based on multi-source data, and when the object pushing program based on multi-source data is executed by the processor, the following steps are implemented:
an acquisition step: acquiring attribute data of a preset number of objects from a plurality of preset data sources, preprocessing the attribute data, and taking the preprocessed attribute data as a training sample set;
the construction steps are as follows: establishing a neural network model and setting initial parameters of the neural network model, calculating a value range of the number of hidden layer nodes of the neural network model by using a first calculation rule, respectively calculating learning errors of the neural network model when the number of the hidden layer nodes is equal to each value in the value range according to a second calculation rule and the training sample set, and taking the neural network model determined when the value of the learning error is minimum as an object level identification model;
a first pushing step: responding to an object pushing request sent by a user, acquiring position information of the user, searching a first object set in a first distance range corresponding to the position information, reading attribute data of the first object set from a database, inputting the attribute data into an object grade identification model, obtaining grade information of each object in the first object set, recommending the grade information to the user and generating a timestamp; and
a second pushing step: whether confirmation information sent by the user is received within a preset time length is detected based on the timestamp, if the confirmation information is not received, a second object set in a second distance range corresponding to the position information is searched, the first object set is screened out from the second object set to obtain a third object set, attribute data of the third object set is read from the database and input into the object grade identification model, and grade information of all objects in the third object set is obtained and then pushed to the user.
Preferably, the preprocessing includes a data cleaning process and an aggregation process, and the aggregation process includes:
aggregating the repeated data in the attribute data to obtain aggregated data after the repeated data is removed;
determining the number of attribute data before aggregation processing to be A, determining the number of data in the aggregated data to be B, and establishing an index vector according to A and B, wherein the length of the index vector is A, and the value range of the index vector is the integers in [-B, -1] ∪ [1, B];
and randomly reading the value of the index vector, acquiring corresponding data from the aggregated data according to the value, and taking the acquired data as the preprocessed attribute data.
Preferably, the first calculation rule includes:
K = Σ_{i=0}^{n} C(n_1, i)
wherein K represents the number of samples in the training sample set, n_1 represents the number of hidden layer nodes of the neural network model, n represents the number of input layer nodes of the neural network model, and i is an integer in [0, n].
Preferably, the second calculation rule includes:
loss(θ) = (y_i - η(X_i, w, β))^2
wherein loss(θ) represents the error between the actual output and the target output of the output layer of the neural network model, y_i is the target output of the i-th sample, η(X_i, w, β) represents the actual output of the i-th sample, X_i represents the input values of the input layer nodes of the neural network model, w and β are weights, and η represents the learning rate.
In order to achieve the above object, the present invention further provides a computer-readable storage medium, where the computer-readable storage medium includes an object pushing program based on multi-source data, and when the object pushing program based on multi-source data is executed by a processor, any step in the object pushing method based on multi-source data as described above is implemented.
According to the object pushing method based on multi-source data, the electronic device and the storage medium, the acquired object attribute data are preprocessed, the learning error of the neural network model is calculated for each value of the number of hidden layer nodes within the value range based on the training sample set and the preset calculation rules, and the neural network model with the minimum learning error is used as the object grade recognition model; this improves the accuracy of object grade recognition and thereby achieves accurate pushing of the objects the user requires.
Drawings
FIG. 1 is a diagram of an electronic device according to a preferred embodiment of the present invention;
FIG. 2 is a block diagram illustrating a preferred embodiment of the multi-source data based object pushing program shown in FIG. 1;
FIG. 3 is a flowchart of a preferred embodiment of a multi-source data-based object pushing method according to the present invention;
the implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a schematic diagram of an electronic device 1 according to a preferred embodiment of the invention is shown.
The electronic device 1 includes but is not limited to: a memory 11, a processor 12, a display 13, and a network interface 14. The electronic device 1 is connected to a network through the network interface 14 to obtain raw data. The network may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile Communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, Wi-Fi, or another communication network.
The memory 11 includes at least one type of readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the storage 11 may be an internal storage unit of the electronic device 1, such as a hard disk or a memory of the electronic device 1. In other embodiments, the memory 11 may also be an external storage device of the electronic apparatus 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided in the electronic apparatus 1. Of course, the memory 11 may also comprise both an internal memory unit of the electronic apparatus 1 and an external memory device thereof. In this embodiment, the memory 11 is generally used for storing an operating system and various application software installed in the electronic device 1, such as program codes of the object pushing program 10 based on multi-source data. Further, the memory 11 may also be used to temporarily store various types of data that have been output or are to be output.
Processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 12 is generally used for controlling the overall operation of the electronic device 1, such as performing data interaction or communication related control and processing. In this embodiment, the processor 12 is configured to run a program code or process data stored in the memory 11, for example, run a program code of the object pushing program 10 based on multi-source data.
The display 13 may be referred to as a display screen or display unit. In some embodiments, the display 13 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an Organic Light-emitting diode (OLED) touch screen, or the like. The display 13 is used for displaying information processed in the electronic device 1 and for displaying a visual work interface, for example, results of data statistics.
The network interface 14 may optionally comprise a standard wired interface, a wireless interface (e.g. WI-FI interface), the network interface 14 typically being used for establishing a communication connection between the electronic apparatus 1 and other electronic devices.
Fig. 1 shows only the electronic device 1 with components 11-14 and the multi-source data based object push program 10, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead.
Optionally, the electronic device 1 may further comprise a user interface, which may comprise a Display (Display), an input unit such as a keyboard, and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an Organic Light-Emitting Diode (OLED) touch screen, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the electronic apparatus 1 and for displaying a visualized user interface.
The electronic device 1 may further include a Radio Frequency (RF) circuit, a sensor, an audio circuit, and the like, which are not described in detail herein.
In the above embodiment, when the processor 12 executes the multi-source data based object pushing program 10 stored in the memory 11, the following steps may be implemented:
an acquisition step: acquiring attribute data of a preset number of objects from a plurality of preset data sources, preprocessing the attribute data, and taking the preprocessed attribute data as a training sample set;
the construction steps are as follows: establishing a neural network model and setting initial parameters of the neural network model, calculating a value range of the number of hidden layer nodes of the neural network model by using a first calculation rule, respectively calculating learning errors of the neural network model when the number of the hidden layer nodes is equal to each value in the value range according to a second calculation rule and the training sample set, and taking the neural network model determined when the value of the learning error is minimum as an object level identification model;
a first pushing step: responding to an object pushing request sent by a user, acquiring position information of the user, searching a first object set in a first distance range corresponding to the position information, reading attribute data of the first object set from a database, inputting the attribute data into an object grade identification model, obtaining grade information of each object in the first object set, recommending the grade information to the user and generating a timestamp; and
a second pushing step: whether confirmation information sent by the user is received within a preset time length is detected based on the timestamp, if the confirmation information is not received, a second object set in a second distance range corresponding to the position information is searched, the first object set is screened out from the second object set to obtain a third object set, attribute data of the third object set is read from the database and input into the object grade identification model, and grade information of all objects in the third object set is obtained and then pushed to the user.
The storage device may be the memory 11 of the electronic apparatus 1, or may be another storage device communicatively connected to the electronic apparatus 1.
For a detailed description of the above steps, please refer to the following description of fig. 2 regarding a flowchart of an embodiment of the object pushing program 10 based on multi-source data and fig. 3 regarding a flowchart of an embodiment of an object pushing method based on multi-source data.
In other embodiments, the multi-source data based object pushing program 10 may be divided into a plurality of modules, which are stored in the memory 11 and executed by the processor 12 to accomplish the present invention. The modules referred to herein are a series of computer program instruction segments capable of performing specified functions.
Referring to fig. 2, a block diagram of an embodiment of the object pushing program 10 based on multi-source data in fig. 1 is shown. In this embodiment, the object pushing program 10 based on multi-source data may be divided into: the system comprises an acquisition module 110, a construction module 120, a first pushing module 130 and a second pushing module 140.
The obtaining module 110 is configured to obtain attribute data of a preset number of objects from a plurality of preset data sources, perform preprocessing on the attribute data, and use the preprocessed attribute data as a training sample set.
In the present embodiment, the solution is described by taking a business district as an example of an object. The preset data sources may be third-party business district information search websites, websites of government public data, GIS information, and the like. The attribute data of a business district include the type of the business district (e.g., food, shopping, beauty, etc.), the number of shops in the business district, the per-capita consumption of the shops, the daily average foot traffic of the business district, the name of the business district, and so on; in addition, information of the city to which the business district belongs (e.g., the city's resident population, the city's GDP, the concentration of commercial resources, and the city's per-capita income) may be acquired. It can be understood that the attribute data of a business district may change greatly over time; to make the acquired data better reflect the current actual situation, the attribute data of the business districts may also be screened in the time dimension, retaining only the data of business districts within a preset period (for example, within two years before the current time).
Preprocessing is performed on the acquired business district attribute data, and the preprocessing includes data cleaning and aggregation. The data cleaning includes removing missing values and invalid values. Because the business district data acquired by web crawling may partially overlap, the repeated data in the business district attribute data are aggregated to obtain de-duplicated aggregated data. The number of attribute data before aggregation is determined to be A and the number of data in the aggregated data to be B, and an index vector is established according to A and B, wherein the length of the index vector is A and its value range is the integers in [-B, -1] ∪ [1, B]. The values of the index vector are read randomly, the corresponding data are acquired from the aggregated data according to each value, and the acquired data are taken as the preprocessed attribute data. Further, the attribute data may be normalized so that the numeric attribute values of the business districts fall in the range [0, 1].
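As an illustration only, the preprocessing described above can be sketched in Python roughly as follows; the column names (e.g. district_name), the pandas/numpy helpers, and the reading of an index value as a 1-based row position in the aggregated data are assumptions of the sketch, not the claimed implementation.

```python
import numpy as np
import pandas as pd

def preprocess(raw: pd.DataFrame, numeric_cols: list) -> pd.DataFrame:
    """Cleaning + aggregation + index-vector resampling + [0, 1] normalisation."""
    # Data cleaning: remove missing values and invalid (non-positive) values.
    cleaned = raw.dropna(subset=numeric_cols)
    cleaned = cleaned[(cleaned[numeric_cols] > 0).all(axis=1)]

    # Aggregation: collapse duplicate records of the same business district.
    aggregated = cleaned.drop_duplicates(subset="district_name").reset_index(drop=True)

    A = len(cleaned)      # number of attribute records before aggregation
    B = len(aggregated)   # number of records after de-duplication

    # Index vector of length A whose values are integers in [-B, -1] ∪ [1, B].
    rng = np.random.default_rng(0)
    index_vector = rng.integers(1, B + 1, size=A) * rng.choice([-1, 1], size=A)

    # Randomly read the index vector; |v| is treated here as a 1-based row index
    # into the aggregated data, and the fetched rows form the preprocessed set.
    rows = np.abs(index_vector[rng.permutation(A)]) - 1
    samples = aggregated.iloc[rows].reset_index(drop=True)

    # Normalise numeric attributes into [0, 1].
    mins, maxs = samples[numeric_cols].min(), samples[numeric_cols].max()
    samples[numeric_cols] = (samples[numeric_cols] - mins) / (maxs - mins)
    return samples
```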
The building module 120 is configured to build a neural network model and set initial parameters of the neural network model, calculate a value range of the number of hidden layer nodes of the neural network model by using a first calculation rule, respectively calculate learning errors of the neural network model when the number of hidden layer nodes is equal to each value in the value range according to a second calculation rule and the training sample set, and use the neural network model determined when the value of the learning error is minimum as an object level identification model.
In this embodiment, a three-layer neural network model having an input layer, a hidden layer and an output layer is established. The feature values of the business district attribute data are taken as the independent variables of the neural network; different grades are labeled for the business districts in advance, and the labeled grades are taken as the dependent variable of the neural network, so as to generate the training sample set. The feature values of the business district attribute data include the number of shops in the business district, the per-capita consumption of the shops, the daily average foot traffic of the business district, the resident population of the city to which the business district belongs, and the GDP of that city. The labeled grades of the business districts include: grade-A, grade-B, grade-C and grade-D business districts. That is, the number of input layer nodes is determined to be 5 and the number of output layer nodes to be 4. If the number of hidden layer nodes is too small, the neural network can acquire too little information and cannot generate enough combinations of connection weights to fit the samples; if the number is too large, the training time of the model increases, the fault tolerance of the neural network deteriorates, the test error grows, and the generalization ability of the model decreases. Therefore, in this embodiment, the value range of the number of hidden layer nodes is calculated according to the number of input layer nodes, the number of output layer nodes and a first calculation rule, where the first calculation rule includes:
n_1 = √(n + m) + a
wherein n_1 is the number of hidden layer nodes, n is the number of input layer nodes, m is the number of output layer nodes, and a is a constant with a value range of [1, 10]. That is, the number of hidden layer nodes is at most 13 and at least 4. With the number of hidden layer nodes varied between 4 and 13, the learning errors of the network training results are compared to select the optimal number of hidden layer nodes.
In one embodiment, the first calculation rule further comprises:
K = Σ_{i=0}^{n} C(n_1, i)
wherein K represents the number of samples in the training sample set, n_1 represents the number of hidden layer nodes of the neural network model, n represents the number of input layer nodes of the neural network model, and i is an integer in [0, n].
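For illustration, the short Python sketch below shows how the two formulas of the first calculation rule can be used to obtain the candidate range of hidden layer nodes; with n = 5 input nodes and m = 4 output nodes it yields the range 4 to 13 used below. Truncating √(n + m) to an integer and reading the second formula as a lower bound of Σ C(n_1, i) relative to the sample count K are assumptions of the sketch.

```python
from math import comb, sqrt

def hidden_node_candidates(n_in: int, n_out: int) -> list:
    """Candidate hidden-layer sizes from n1 = sqrt(n + m) + a, with a in [1, 10]."""
    base = int(sqrt(n_in + n_out))      # truncating to an integer is an assumption
    return [base + a for a in range(1, 11)]

def capacity_ok(n1: int, n_in: int, n_samples: int) -> bool:
    """One possible reading of the sample-capacity rule: sum_{i=0}^{n} C(n1, i) >= K."""
    return sum(comb(n1, i) for i in range(n_in + 1)) >= n_samples

print(hidden_node_candidates(5, 4))     # [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
```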
Setting the initial parameters of the neural network model includes: the initial weights of the neural network are randomly assigned values in [-1, 1], which effectively avoids the network entering the saturation region too early during calculation; the learning error is set to 0.0005, the learning rate η to 0.05, and the maximum number of cycles to 5000; the transfer functions of the hidden layer and the output layer are the tanh function, and using this activation function avoids the vanishing-gradient problem.
The specific training steps include: according to the preset parameters of the neural network, firstly setting the initial value of the number of nodes of the hidden layer as 4, calculating the output of each node of the hidden layer and the output layer, then calculating the error between the actual output of the output layer and the target output, comparing the target output error with the preset learning error, when the target output error is greater than the preset learning error, adjusting the weight of the neural network until the target output errors of all the nodes of the output layer are less than the learning error, then increasing the number of the nodes of the hidden layer one by one and repeating the network training, wherein the value range of the number of the nodes of the hidden layer is between 4 and 13 until the number of the nodes of the hidden layer is increased to 13, and stopping the operation. And respectively comparing the learning errors of the neural networks under different hidden layer node numbers, and taking the neural network model with the minimum learning error as an object (business circle) grade identification model.
The second calculation rule, which is the error formula of the actual output and the target output of the output layer, is:
loss(θ) = (y_i - η(X_i, w, β))^2
where loss(θ) represents the error between the actual output and the target output of the output layer of the neural network model, y_i is the target output of the i-th sample, η(X_i, w, β) represents the actual output of the i-th sample, X_i represents the input values of the input layer nodes of the neural network model, w and β are weights, and η represents the learning rate.
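Purely as an illustration of the selection procedure just described, the numpy sketch below trains a simple three-layer tanh network for every candidate number of hidden layer nodes (4 to 13) with a learning rate of 0.05, a target learning error of 0.0005 and at most 5000 cycles, and keeps the network whose learning error is smallest. The gradient update and the data layout are simplified assumptions, not the exact implementation of the object grade recognition model.

```python
import numpy as np

def train_bp(X, Y, n_hidden, lr=0.05, target_err=0.0005, max_epochs=5000, seed=0):
    """Train a minimal three-layer BP network (tanh hidden and output layers)
    and return (final learning error, weights). A simplified sketch."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], Y.shape[1]
    # Initial weights drawn uniformly from [-1, 1] to avoid early saturation.
    W1 = rng.uniform(-1, 1, (n_in, n_hidden))
    W2 = rng.uniform(-1, 1, (n_hidden, n_out))
    err = np.inf
    for _ in range(max_epochs):
        H = np.tanh(X @ W1)                  # hidden-layer outputs
        O = np.tanh(H @ W2)                  # output-layer outputs
        err = np.mean((Y - O) ** 2)          # learning error, loss = (y - output)^2
        if err < target_err:
            break
        # Back-propagate the error and adjust the weights.
        dO = (O - Y) * (1 - O ** 2)
        dH = (dO @ W2.T) * (1 - H ** 2)
        W2 -= lr * H.T @ dO / len(X)
        W1 -= lr * X.T @ dH / len(X)
    return err, (W1, W2)

def select_grade_model(X, Y, candidates=range(4, 14)):
    """Compare learning errors over the candidate hidden-node counts and keep
    the network with the minimum error as the grade recognition model."""
    results = {n1: train_bp(X, Y, n1) for n1 in candidates}
    best_n1 = min(results, key=lambda n1: results[n1][0])
    return best_n1, results[best_n1][1]
```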
The first pushing module 130 is configured to, in response to an object pushing request sent by a user, obtain location information of the user, search for a first object set within a first distance range corresponding to the location information, read attribute data of the first object set from a database, input the attribute data into the object rank identification model, obtain rank information of each object in the first object set, recommend the obtained rank information to the user, and generate a timestamp.
In this embodiment, the solution is described by taking a business district as an example of an object. There can be multiple conditions under which the user triggers the electronic device to acquire pushed business districts: when a request for pushing an object sent by the user is detected, the position information of the user is acquired; or, when the behavior of the user is detected to satisfy the conditions for business district recommendation, the request for pushing an object sent by the user is responded to and the position information of the user is acquired using a GPS signal. The request for pushing a business district sent by the user can be triggered through a preset recommendation entry, which may be, for example, an entry in the user interaction interface for obtaining recommended business districts by clicking, or an entry for obtaining recommended business districts by shaking the mobile phone.
After the position information of the user is obtained, a first business district set corresponding to the position information of the user within a first distance range (for example, 5 kilometers) is searched, attribute data of each business district in the first business district set are read from a preset database (for example, a related business district information searching website), the read attribute data are input into an object grade recognition model, grade information of each business district in the first business district set is obtained, the grade information is fed back to the user, and a timestamp is generated.
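As a sketch only, the first pushing step could be organised as below: districts within the first distance range (5 km here) of the user's GPS position are selected, their grades are obtained from the grade recognition model, and a timestamp is recorded together with the push. The district record structure and the grade_model.predict interface are assumptions of the sketch, not an API defined by the patent.

```python
import time
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two GPS points."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def push_first_set(user_pos, districts, grade_model, radius_km=5.0):
    """Return (grade info to push, the first district set, a timestamp).
    `districts` is assumed to be a list of dicts with 'name', 'lat', 'lon'
    and a feature vector under 'features'."""
    first_set = [d for d in districts
                 if haversine_km(user_pos[0], user_pos[1], d["lat"], d["lon"]) <= radius_km]
    grades = grade_model.predict([d["features"] for d in first_set])
    push = list(zip((d["name"] for d in first_set), grades))
    return push, first_set, time.time()   # the timestamp marks when the push was made
```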
The second pushing module 140 is configured to detect whether confirmation information sent by the user is received within a preset time period based on the timestamp, if the confirmation information is not received, search a second object set within a second distance range corresponding to the location information, screen the first object set from the second object set to obtain a third object set, read attribute data of the third object set from the database, input the attribute data into the object level identification model, and push the obtained level information of each object in the third object set to the user.
In this embodiment, after the business district grades within the first distance range are pushed to the user, whether confirmation information sent by the user is received within a preset time length (for example, 60 seconds) is detected according to the generated timestamp. The confirmation information may be feedback from the user in the interactive interface on whether a business district meeting the grade required by the user exists within the first distance range. When none of the business district grades within the first distance range meets the user's requirement (i.e., the confirmation information is not received), a second business district set corresponding to the position information of the user within a second distance range (for example, 10 kilometers) is searched, where the value of the second distance is greater than the value of the first distance. To avoid repeatedly pushing the business districts in the first business district set, the first business district set is screened out of the second business district set to obtain a third business district set; the attribute data of each business district in the third business district set are read from the preset database and input into the object grade recognition model, and the obtained grade information of the third business district set is fed back to the user.
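Continuing the same sketch (and reusing haversine_km and the district structure from the previous block), the second pushing step checks the 60-second timeout against the recorded timestamp, widens the radius to 10 km, removes the districts already pushed, and pushes the grades of the remainder. This is an assumed organisation of the step, not the exact implementation.

```python
import time

def push_second_set(user_pos, districts, grade_model, first_set, timestamp,
                    confirmed, timeout_s=60, radius_km=10.0):
    """If no confirmation arrived within `timeout_s` seconds of `timestamp`,
    grade and push the districts in the wider radius that were not pushed before."""
    if confirmed or time.time() - timestamp < timeout_s:
        return []                                    # nothing further to push
    pushed_names = {d["name"] for d in first_set}
    second_set = [d for d in districts
                  if haversine_km(user_pos[0], user_pos[1], d["lat"], d["lon"]) <= radius_km]
    third_set = [d for d in second_set if d["name"] not in pushed_names]
    grades = grade_model.predict([d["features"] for d in third_set])
    return list(zip((d["name"] for d in third_set), grades))
```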
In addition, the invention also provides an object pushing method based on the multi-source data. Fig. 3 is a schematic method flow diagram illustrating an embodiment of the object pushing method based on multi-source data according to the present invention. When the processor 12 of the electronic device 1 executes the multi-source data-based object pushing program 10 stored in the memory 11, the following steps are implemented for the multi-source data-based object pushing method:
step S10: acquiring attribute data of a preset number of objects from a plurality of preset data sources, preprocessing the attribute data, and taking the preprocessed attribute data as a training sample set.
In the present embodiment, the solution is described by taking a business district as an example of an object. The preset data sources may be third-party business district information search websites, websites of government public data, GIS information, and the like. The attribute data of a business district include the type of the business district (e.g., food, shopping, beauty, etc.), the number of shops in the business district, the per-capita consumption of the shops, the daily average foot traffic of the business district, the name of the business district, and so on; in addition, information of the city to which the business district belongs (e.g., the city's resident population, the city's GDP, the concentration of commercial resources, and the city's per-capita income) may be acquired. It can be understood that the attribute data of a business district may change greatly over time; to make the acquired data better reflect the current actual situation, the attribute data of the business districts may also be screened in the time dimension, retaining only the data of business districts within a preset period (for example, within two years before the current time).
Preprocessing is performed on the acquired business district attribute data, and the preprocessing includes data cleaning and aggregation. The data cleaning includes removing missing values and invalid values. Because the business district data acquired by web crawling may partially overlap, the repeated data in the business district attribute data are aggregated to obtain de-duplicated aggregated data. The number of attribute data before aggregation is determined to be A and the number of data in the aggregated data to be B, and an index vector is established according to A and B, wherein the length of the index vector is A and its value range is the integers in [-B, -1] ∪ [1, B]. The values of the index vector are read randomly, the corresponding data are acquired from the aggregated data according to each value, and the acquired data are taken as the preprocessed attribute data. Further, the attribute data may be normalized so that the numeric attribute values of the business districts fall in the range [0, 1].
Step S20: the method comprises the steps of constructing a neural network model, setting initial parameters of the neural network model, calculating a value range of the number of hidden layer nodes of the neural network model by using a first calculation rule, respectively calculating learning errors of the neural network model when the number of the hidden layer nodes is equal to each value in the value range according to a second calculation rule and a training sample set, and taking the neural network model determined when the value of the learning error is minimum as an object level recognition model.
In this embodiment, a three-layer neural network model having an input layer, a hidden layer and an output layer is established. The feature values of the business district attribute data are taken as the independent variables of the neural network; different grades are labeled for the business districts in advance, and the labeled grades are taken as the dependent variable of the neural network, so as to generate the training sample set. The feature values of the business district attribute data include the number of shops in the business district, the per-capita consumption of the shops, the daily average foot traffic of the business district, the resident population of the city to which the business district belongs, and the GDP of that city. The labeled grades of the business districts include: grade-A, grade-B, grade-C and grade-D business districts. That is, the number of input layer nodes is determined to be 5 and the number of output layer nodes to be 4. If the number of hidden layer nodes is too small, the neural network can acquire too little information and cannot generate enough combinations of connection weights to fit the samples; if the number is too large, the training time of the model increases, the fault tolerance of the neural network deteriorates, the test error grows, and the generalization ability of the model decreases. Therefore, in this embodiment, the value range of the number of hidden layer nodes is calculated according to the number of input layer nodes, the number of output layer nodes and a first calculation rule, where the first calculation rule includes:
n_1 = √(n + m) + a
wherein n_1 is the number of hidden layer nodes, n is the number of input layer nodes, m is the number of output layer nodes, and a is a constant with a value range of [1, 10]. That is, the number of hidden layer nodes is at most 13 and at least 4. With the number of hidden layer nodes varied between 4 and 13, the learning errors of the network training results are compared to select the optimal number of hidden layer nodes.
In one embodiment, the first calculation rule further comprises:
K = Σ_{i=0}^{n} C(n_1, i)
wherein K represents the number of samples in the training sample set, n_1 represents the number of hidden layer nodes of the neural network model, n represents the number of input layer nodes of the neural network model, and i is an integer in [0, n].
Setting the initial parameters of the neural network model includes: the initial weights of the neural network are randomly assigned values in [-1, 1], which effectively avoids the network entering the saturation region too early during calculation; the learning error is set to 0.0005, the learning rate η to 0.05, and the maximum number of cycles to 5000; the transfer functions of the hidden layer and the output layer are the tanh function, and using this activation function avoids the vanishing-gradient problem.
The specific training steps include: according to the preset parameters of the neural network, firstly setting the initial value of the number of nodes of the hidden layer as 4, calculating the output of each node of the hidden layer and the output layer, then calculating the error between the actual output of the output layer and the target output, comparing the target output error with the preset learning error, when the target output error is greater than the preset learning error, adjusting the weight of the neural network until the target output errors of all the nodes of the output layer are less than the learning error, then increasing the number of the nodes of the hidden layer one by one and repeating the network training, wherein the value range of the number of the nodes of the hidden layer is between 4 and 13 until the number of the nodes of the hidden layer is increased to 13, and stopping the operation. And respectively comparing the learning errors of the neural networks under different hidden layer node numbers, and taking the neural network model with the minimum learning error as an object (business circle) grade identification model.
The second calculation rule, which is the error formula of the actual output and the target output of the output layer, is:
loss(θ) = (y_i - η(X_i, w, β))^2
where loss(θ) represents the error between the actual output and the target output of the output layer of the neural network model, y_i is the target output of the i-th sample, η(X_i, w, β) represents the actual output of the i-th sample, X_i represents the input values of the input layer nodes of the neural network model, w and β are weights, and η represents the learning rate.
Step S30: responding to an object pushing request sent by a user, acquiring the position information of the user, searching a first object set in a first distance range corresponding to the position information, reading attribute data of the first object set from a database, inputting the attribute data into an object grade identification model, obtaining the grade information of each object in the first object set, recommending the grade information to the user, and generating a timestamp.
In this embodiment, the solution is described by taking a business district as an example of an object. There can be multiple conditions under which the user triggers the electronic device to acquire pushed business districts: when a request for pushing an object sent by the user is detected, the position information of the user is acquired; or, when the behavior of the user is detected to satisfy the conditions for business district recommendation, the request for pushing an object sent by the user is responded to and the position information of the user is acquired using a GPS signal. The request for pushing a business district sent by the user can be triggered through a preset recommendation entry, which may be, for example, an entry in the user interaction interface for obtaining recommended business districts by clicking, or an entry for obtaining recommended business districts by shaking the mobile phone.
After the position information of the user is obtained, a first business district set corresponding to the position information of the user within a first distance range (for example, 5 kilometers) is searched, attribute data of each business district in the first business district set are read from a preset database (for example, a related business district information searching website), the read attribute data are input into an object grade recognition model, grade information of each business district in the first business district set is obtained, the grade information is fed back to the user, and a timestamp is generated.
Step S40: whether confirmation information sent by the user is received within a preset time length is detected based on the timestamp, if the confirmation information is not received, a second object set in a second distance range corresponding to the position information is searched, the first object set is screened out from the second object set to obtain a third object set, attribute data of the third object set is read from the database and input into the object grade identification model, and grade information of all objects in the third object set is obtained and then pushed to the user.
In this embodiment, after the business district grades within the first distance range are pushed to the user, whether confirmation information sent by the user is received within a preset time length (for example, 60 seconds) is detected according to the generated timestamp. The confirmation information may be feedback from the user in the interactive interface on whether a business district meeting the grade required by the user exists within the first distance range. When none of the business district grades within the first distance range meets the user's requirement (i.e., the confirmation information is not received), a second business district set corresponding to the position information of the user within a second distance range (for example, 10 kilometers) is searched, where the value of the second distance is greater than the value of the first distance. To avoid repeatedly pushing the business districts in the first business district set, the first business district set is screened out of the second business district set to obtain a third business district set; the attribute data of each business district in the third business district set are read from the preset database and input into the object grade recognition model, and the obtained grade information of the third business district set is fed back to the user.
Furthermore, the embodiment of the present invention also provides a computer-readable storage medium, which may be any one or any combination of a hard disk, a multimedia card, an SD card, a flash memory card, an SMC, a Read Only Memory (ROM), an Erasable Programmable Read Only Memory (EPROM), a portable compact disc read only memory (CD-ROM), a USB memory, and the like. The computer readable storage medium includes a multi-source data-based object pushing program 10, and when executed by a processor, the multi-source data-based object pushing program 10 implements the following operations:
an acquisition step: acquiring attribute data of a preset number of objects from a plurality of preset data sources, preprocessing the attribute data, and taking the preprocessed attribute data as a training sample set;
the construction steps are as follows: establishing a neural network model and setting initial parameters of the neural network model, calculating a value range of the number of hidden layer nodes of the neural network model by using a first calculation rule, respectively calculating learning errors of the neural network model when the number of the hidden layer nodes is equal to each value in the value range according to a second calculation rule and the training sample set, and taking the neural network model determined when the value of the learning error is minimum as an object level identification model;
a first pushing step: responding to an object pushing request sent by a user, acquiring position information of the user, searching a first object set in a first distance range corresponding to the position information, reading attribute data of the first object set from a database, inputting the attribute data into an object grade identification model, obtaining grade information of each object in the first object set, recommending the grade information to the user and generating a timestamp; and
a second pushing step: whether confirmation information sent by the user is received within a preset time length is detected based on the timestamp, if the confirmation information is not received, a second object set in a second distance range corresponding to the position information is searched, the first object set is screened out from the second object set to obtain a third object set, attribute data of the third object set is read from the database and input into the object grade identification model, and grade information of all objects in the third object set is obtained and then pushed to the user.
The specific implementation of the computer-readable storage medium of the present invention is substantially the same as the above-mentioned specific implementation of the object pushing method based on multi-source data, and is not described herein again.
It should be noted that the above-mentioned numbers of the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments. And the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, apparatus, article, or method that includes the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention essentially or contributing to the prior art can be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) as described above and includes several instructions for enabling a terminal device (such as a mobile phone, a computer, an electronic device, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (6)

1. An object pushing method based on multi-source data is applied to an electronic device, and is characterized in that the method comprises the following steps:
an acquisition step: acquiring attribute data of a preset number of objects from a plurality of preset data sources, preprocessing the attribute data, and taking the preprocessed attribute data as a training sample set;
the construction steps are as follows: establishing a neural network model and setting initial parameters of the neural network model, calculating a value range of the number of hidden layer nodes of the neural network model by using a first calculation rule, respectively calculating learning errors of the neural network model when the number of the hidden layer nodes is equal to each value in the value range according to a second calculation rule and the training sample set, and taking the neural network model determined when the value of the learning error is minimum as an object level identification model;
a first pushing step: responding to an object pushing request sent by a user, acquiring position information of the user, searching a first object set in a first distance range corresponding to the position information, reading attribute data of the first object set from a database, inputting the attribute data into an object grade identification model, obtaining grade information of each object in the first object set, recommending the grade information to the user and generating a timestamp; and
a second pushing step: whether confirmation information sent by the user is received within a preset time length is detected based on the timestamp, if the confirmation information is not received, a second object set in a second distance range corresponding to the position information is searched, the first object set is screened out from the second object set to obtain a third object set, attribute data of the third object set is read from the database and input into the object grade identification model, and grade information of all objects in the third object set is obtained and then pushed to the user;
the first calculation rule includes:
K = Σ_{i=0}^{n} C(n_1, i)
wherein K represents the number of samples in the training sample set, n_1 represents the number of hidden layer nodes of the neural network model, n represents the number of input layer nodes of the neural network model, and i is an integer in [0, n];
the second calculation rule includes:
loss(θ) = (y_i - η(X_i, w, β))^2
wherein loss(θ) represents the error between the actual output and the target output of the output layer of the neural network model, y_i is the target output of the i-th sample, η(X_i, w, β) represents the actual output of the i-th sample, X_i represents the input values of the input layer nodes of the neural network model, w and β are weights, and η represents the learning rate.
2. The multi-source data-based object pushing method according to claim 1, wherein the preprocessing comprises a data cleaning process and an aggregation process, and the aggregation process comprises:
aggregating the repeated data in the attribute data to obtain aggregated data after the repeated data is removed;
determining the number of attribute data before aggregation processing to be A, determining the number of data in the aggregated data to be B, and establishing an index vector according to A and B, wherein the length of the index vector is A, and the value range of the index vector is the integers in [-B, -1] ∪ [1, B];
and randomly reading the value of the index vector, acquiring corresponding data from the aggregated data according to the value, and taking the acquired data as the preprocessed attribute data.
3. The multi-source-data-based object pushing method according to any one of claims 1 to 2, wherein the value of the second distance is greater than the value of the first distance.
4. An electronic device, comprising a memory and a processor, wherein an object pushing program based on multi-source data is stored in the memory, and the object pushing program based on multi-source data is executed by the processor, and the following steps are implemented:
an acquisition step: acquiring attribute data of a preset number of objects from a plurality of preset data sources, preprocessing the attribute data, and taking the preprocessed attribute data as a training sample set;
the construction steps are as follows: establishing a neural network model and setting initial parameters of the neural network model, calculating a value range of the number of hidden layer nodes of the neural network model by using a first calculation rule, respectively calculating learning errors of the neural network model when the number of the hidden layer nodes is equal to each value in the value range according to a second calculation rule and the training sample set, and taking the neural network model determined when the value of the learning error is minimum as an object level identification model;
a first pushing step: responding to an object pushing request sent by a user, acquiring position information of the user, searching a first object set in a first distance range corresponding to the position information, reading attribute data of the first object set from a database, inputting the attribute data into an object grade identification model, obtaining grade information of each object in the first object set, recommending the grade information to the user and generating a timestamp; and
a second pushing step: detecting, based on the timestamp, whether confirmation information sent by the user is received within a preset time length; if the confirmation information is not received, searching a second object set in a second distance range corresponding to the position information, removing the first object set from the second object set to obtain a third object set, reading attribute data of the third object set from the database and inputting the attribute data into the object grade identification model, and obtaining grade information of each object in the third object set and pushing the grade information to the user;
the first calculation rule includes:
∑_{i=0}^{n} C(n1, i) > K
wherein K represents the number of samples in the training sample set, n1 represents the number of hidden layer nodes of the neural network model, n represents the number of input layer nodes of the neural network model, and i represents an integer in [0, n];
the second calculation rule includes:
loss(θ) = (yi − η(Xi, w, β))²
wherein loss(θ) represents the error between the actual output and the target output of the neural network model output layer, yi represents the target output of the i-th sample, η(Xi, w, β) represents the actual output of the i-th sample, Xi represents the input values of the input layer nodes of the neural network model, w and β are weights, and η represents the learning rate.
5. The electronic device of claim 4, wherein the preprocessing comprises a data cleaning process and an aggregation process, and the aggregation process comprises:
aggregating the repeated data in the attribute data to obtain aggregated data after the repeated data is removed;
determining the number of attribute data before the aggregation processing to be A and the number of data in the aggregated data to be B, and establishing an index vector according to A and B, wherein the length of the index vector is A, and each value of the index vector is an integer in [-B, -1] ∪ [1, B];
and randomly reading the value of the index vector, acquiring corresponding data from the aggregated data according to the value, and taking the acquired data as the preprocessed attribute data.
6. A computer-readable storage medium, wherein the computer-readable storage medium includes a multi-source data-based object pushing program, and when the multi-source data-based object pushing program is executed by a processor, the steps of the multi-source data-based object pushing method according to any one of claims 1 to 3 are implemented.
CN202010002173.2A 2020-01-02 2020-01-02 Object pushing method based on multi-source data, electronic device and storage medium Active CN111177568B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010002173.2A CN111177568B (en) 2020-01-02 2020-01-02 Object pushing method based on multi-source data, electronic device and storage medium
PCT/CN2020/098977 WO2021135104A1 (en) 2020-01-02 2020-06-29 Multi-source data-based object pushing method and apparatus, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010002173.2A CN111177568B (en) 2020-01-02 2020-01-02 Object pushing method based on multi-source data, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN111177568A (en) 2020-05-19
CN111177568B (en) 2020-08-21

Family

ID=70657806

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010002173.2A Active CN111177568B (en) 2020-01-02 2020-01-02 Object pushing method based on multi-source data, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN111177568B (en)
WO (1) WO2021135104A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111177568B (en) * 2020-01-02 2020-08-21 平安科技(深圳)有限公司 Object pushing method based on multi-source data, electronic device and storage medium
CN112511632B (en) * 2020-12-03 2022-10-11 中国平安财产保险股份有限公司 Object pushing method, device and equipment based on multi-source data and storage medium
CN112364135B (en) * 2020-12-03 2023-11-07 中国平安财产保险股份有限公司 Object pushing method, device, equipment and storage medium based on multi-source data

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8170971B1 (en) * 2011-09-28 2012-05-01 Ava, Inc. Systems and methods for providing recommendations based on collaborative and/or content-based nodal interrelationships
CN103577679A (en) * 2012-08-10 2014-02-12 深圳市龙电电气有限公司 Real-time computing method for theoretical line loss of low-voltage distribution room
CN106101222A (en) * 2016-06-08 2016-11-09 腾讯科技(深圳)有限公司 The method for pushing of information and device
US10254935B2 (en) * 2016-06-29 2019-04-09 Google Llc Systems and methods of providing content selection
CN108256052B (en) * 2018-01-15 2023-07-11 成都达拓智通科技有限公司 Tri-tracking-based potential customer identification method for automobile industry
CN109447960A (en) * 2018-10-18 2019-03-08 神州数码医疗科技股份有限公司 A kind of object identifying method and device
CN109800325B (en) * 2018-12-26 2021-10-26 北京达佳互联信息技术有限公司 Video recommendation method and device and computer-readable storage medium
CN111177568B (en) * 2020-01-02 2020-08-21 平安科技(深圳)有限公司 Object pushing method based on multi-source data, electronic device and storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5596681A (en) * 1993-10-22 1997-01-21 Nippondenso Co., Ltd. Method of determining an optimal number of neurons contained in hidden layers of a neural network
CN104572937A (en) * 2014-12-30 2015-04-29 杭州云象网络技术有限公司 Offline friend recommendation method based on indoor living circle
CN107784372A (en) * 2016-08-24 2018-03-09 阿里巴巴集团控股有限公司 Forecasting Methodology, the device and system of destination object attribute
CN107578093A (en) * 2017-09-14 2018-01-12 长安大学 The Elman neural network dynamic Forecasting Methodologies of Landslide Deformation
CN109819002A (en) * 2017-11-22 2019-05-28 腾讯科技(深圳)有限公司 Data push method and device, storage medium and electronic device
CN108665007A (en) * 2018-05-22 2018-10-16 阿里巴巴集团控股有限公司 A kind of recommendation method, apparatus and electronic equipment based on multi-categorizer
CN108875821A (en) * 2018-06-08 2018-11-23 Oppo广东移动通信有限公司 The training method and device of disaggregated model, mobile terminal, readable storage medium storing program for executing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Sales Data Mining Technology and E-commerce Applications; Zhou Haoming; China Master's Theses Full-text Database, Information Science and Technology; 2014-10-15; I139-80 *

Also Published As

Publication number Publication date
WO2021135104A1 (en) 2021-07-08
CN111177568A (en) 2020-05-19

Similar Documents

Publication Publication Date Title
CN111177568B (en) Object pushing method based on multi-source data, electronic device and storage medium
CN111210269A (en) Object identification method based on big data, electronic device and storage medium
CN109993627B (en) Recommendation method, recommendation model training device and storage medium
CN112329847A (en) Abnormity detection method and device, electronic equipment and storage medium
CN112069276A (en) Address coding method and device, computer equipment and computer readable storage medium
CN109522923A (en) Customer address polymerization, device and computer readable storage medium
CN111369148A (en) Object index monitoring method, electronic device and storage medium
CN112115372B (en) Parking lot recommendation method and device
CN113434762A (en) Association pushing method, device and equipment based on user information and storage medium
CN112948526A (en) User portrait generation method and device, electronic equipment and storage medium
CN111950623A (en) Data stability monitoring method and device, computer equipment and medium
CN115564486A (en) Data pushing method, device, equipment and medium
CN114090601B (en) Data screening method, device, equipment and storage medium
CN113722437B (en) User tag identification method, device, equipment and medium based on artificial intelligence
CN113010788B (en) Information pushing method and device, electronic equipment and computer readable storage medium
CN115051863A (en) Abnormal flow detection method and device, electronic equipment and readable storage medium
KR20190018807A (en) Apparatus and method for providing information through analysis of movement patterns between stock prices
CN110879863B (en) Cross-domain search method and cross-domain search device
JP5538459B2 (en) Information processing apparatus and method
CN113076451A (en) Abnormal behavior recognition and risk model library establishing method and device and electronic equipment
CN112966085B (en) Man-machine conversation intelligent control method and device, electronic equipment and storage medium
CN113051475B (en) Content recommendation method, device, electronic equipment and readable storage medium
CN112328779B (en) Training sample construction method, device, terminal equipment and storage medium
CN112380418B (en) Data processing method and system based on web crawler and cloud platform
CN110909130B (en) Text theme extraction and analysis method and device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant