CN110427963A - A kind of vehicle source information processing method, system and equipment - Google Patents

A kind of vehicle source information processing method, system and equipment

Info

Publication number
CN110427963A
Authority
CN
China
Prior art keywords
information
user
vehicle source
source information
vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910544433.6A
Other languages
Chinese (zh)
Inventor
张少杰
高迪
林新亮
靳胜强
巩仔明
邱慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Best Faith Racket (beijing) Mdt Infotech Ltd
Original Assignee
Best Faith Racket (beijing) Mdt Infotech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Best Faith Racket (beijing) Mdt Infotech Ltd
Priority to CN201910544433.6A
Publication of CN110427963A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0623 Item investigation
    • G06Q30/0625 Directed, with specific intent or strategy
    • G06Q30/0629 Directed, with specific intent or strategy for generating comparisons

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Finance (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Accounting & Taxation (AREA)
  • Artificial Intelligence (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a vehicle source information processing method, system and equipment. The method includes: classifying vehicle source information according to the similarity among multiple pieces of vehicle source information and the user-related information and vehicle information contained in the vehicle source information, to obtain effective vehicle source information and junk information; and obtaining, according to the user information in the effective vehicle source information, the time information of the user's publication of vehicle source information, and setting a grade for the user according to that time information. Based on the characteristics of online vehicle-sale information, and combined with the user-related information of the vehicle source information, the invention classifies data through similarity calculation and can efficiently and accurately extract effective, high-quality vehicle source information.

Description

A kind of vehicle source information processing method, system and equipment
Technical field
The invention belongs to the field of Internet technology, and in particular relates to a vehicle source information processing method, system and equipment.
Background art
With the development and improvement of Internet technology, the vehicle trading market, especially the used-car market, has shifted on a large scale to an Internet-based trading model. Seller users publish vehicle source information on Internet platforms, and buyer users browse the vehicle source information on those platforms to choose a vehicle to purchase. Existing used-car trading platforms are numerous, and their vehicle source information is massive in volume, disorderly, and uneven in quality. Matching buyers and sellers on the basis of such massive raw data makes it very inconvenient to find a suitable trading partner and conclude a transaction.
Conventional methods rely on manual identification of massive Internet data, which is costly and inefficient; they cannot perform comprehensive comparison across multiple data sources; and their identification of data characteristics is incomplete. The online vehicle trading field lacks a scheme in which machines automatically classify massive transaction data. How to classify vehicle source information efficiently and accurately and obtain effective, high-quality vehicle source information is a technical problem to be solved urgently.
Summary of the invention
In view of the above problems, the present invention provides a vehicle source information processing method, comprising:
classifying vehicle source information according to the similarity among multiple pieces of vehicle source information and the user-related information and vehicle information in the vehicle source information, to obtain effective vehicle source information and junk information;
obtaining, according to the user information in the effective vehicle source information, the time information of the user's publication of vehicle source information, and setting a grade for the user according to the time information.
Further, classifying the vehicle source information according to the similarity among multiple pieces of vehicle source information and the user-related information and vehicle information in the vehicle source information, to obtain effective vehicle source information and junk information, comprises:
dividing vehicle source information that contains no vehicle information or no user identification information into first junk information, and dividing the remaining information into first effective information;
dividing the vehicle source information in the first effective information into categories: vehicle source information that contains car trader information is divided into first effective car trader information, and vehicle source information that does not contain car trader information is divided into first effective personal user information, the car trader information belonging to the user-related information;
performing similarity calculation on the first effective car trader information, and classifying according to the similarity calculation result;
performing similarity calculation on the first effective personal user information, and classifying according to the similarity calculation result.
Further, performing similarity calculation on the first effective personal user information and classifying according to the similarity calculation result comprises:
dividing vehicle source information that contains the same user information and similar vehicle information, and whose vehicle source information similarity reaches a first threshold, into second effective personal user information, the user information belonging to the user-related information.
Further, performing similarity calculation on the first effective car trader information and classifying according to the similarity calculation result comprises:
dividing vehicle source information that contains similar user information, and whose vehicle source information similarity reaches a second threshold, into second effective car trader information, the user information belonging to the user-related information.
Further, obtaining, according to the user information in the effective vehicle source information, the time information of the user's publication of vehicle source information, and setting a grade for the user according to the time information, comprises:
obtaining, on a specified platform and at a specified period, the vehicle source information publication records of the users in the effective vehicle source information, the publication records including the publication time;
counting, from the publication records obtained over multiple acquisitions, the number of times the user published vehicle source information and the time intervals between publications, and setting the user grade according to the number of publications and the time intervals.
Further, setting the user grade according to the number of publications and the time intervals comprises setting a grade for the personal users in the effective data:
the user grade is set to be positively correlated with the number of times the same user publishes vehicle source information for the same vehicle;
the user grade is set to be negatively correlated with the time interval at which the same user publishes vehicle source information for the same vehicle.
Further, setting the user grade according to the number of publications and the time intervals comprises setting a grade for the car traders in the effective data:
the user grade is set to be positively correlated with the number of times the user publishes vehicle source information;
the user grade is set to be negatively correlated with the time interval at which the user publishes vehicle source information.
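The grade rules above state only the direction of each correlation, not a formula. As a minimal sketch, assuming a score of the form count / (1 + mean interval in days), the following shows one function that rises with the number of publications and falls with the publication interval. The function name `user_grade_score`, the formula, and the `default_interval_days` fallback for users with a single publication are all illustrative assumptions, not taken from the patent.

```python
from datetime import datetime, timedelta

def user_grade_score(publish_times, default_interval_days=30.0):
    """Hypothetical score: higher with more publications, lower with longer gaps."""
    times = sorted(publish_times)
    count = len(times)
    if count == 0:
        return 0.0
    if count == 1:
        # Assumption: with a single publication no interval is observed yet,
        # so a configurable default interval is used instead.
        mean_interval_days = default_interval_days
    else:
        gaps = [(b - a).total_seconds() / 86400 for a, b in zip(times, times[1:])]
        mean_interval_days = sum(gaps) / len(gaps)
    return count / (1.0 + mean_interval_days)

# Usage: three publications one day apart versus two publications ten days apart.
start = datetime(2019, 6, 1)
frequent = [start + timedelta(days=d) for d in (0, 1, 2)]
sparse = [start, start + timedelta(days=10)]
frequent_score = user_grade_score(frequent)
sparse_score = user_grade_score(sparse)
```

For a personal user the list would hold the publication times of the same vehicle by the same user; for a car trader it would hold all of that user's publication times.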
Further, the method further comprises:
obtaining the vehicle source information from multiple platforms, and aggregating it into vehicle source information records of a unified format;
wherein obtaining the vehicle source information from multiple platforms comprises:
recursively accessing the access links of a specified platform;
searching the web page text under the links;
obtaining the vehicle source information within a specified time range from the web page text.
Further, the method further comprises:
filtering newly obtained vehicle source information according to the junk information.
The present invention also provides a vehicle source information processing system, comprising:
a vehicle source information classification module, configured to classify vehicle source information according to the similarity among multiple pieces of vehicle source information and the user-related information and vehicle information in the vehicle source information, to obtain effective vehicle source information and junk information;
a grade setting module, configured to obtain, according to the user information in the effective vehicle source information, the time information of the user's publication of vehicle source information, and to set a grade for the user according to the time information.
Further,
the vehicle source information classification module includes a first junk information screening unit, a first effective information classification unit, a similarity calculation unit, a first effective car trader information classification unit, and a first effective personal user information classification unit;
the first junk information screening unit is configured to divide vehicle source information that contains no vehicle information or no user identification information into first junk information, and to divide the remaining information into first effective information;
the first effective information classification unit is configured to divide the vehicle source information in the first effective information into categories: vehicle source information that contains car trader information is divided into first effective car trader information, and vehicle source information that does not contain car trader information is divided into first effective personal user information, the car trader information belonging to the user-related information;
the similarity calculation unit is configured to perform similarity calculation on the first effective car trader information, and the first effective car trader information classification unit is configured to classify according to the similarity calculation result;
the similarity calculation unit is configured to perform similarity calculation on the first effective personal user information, and the first effective personal user information classification unit is configured to classify according to the similarity calculation result.
Further,
performing similarity calculation on the first effective personal user information and classifying according to the similarity calculation result comprises: dividing vehicle source information that contains the same user information and similar vehicle information, and whose vehicle source information similarity reaches a first threshold, into second effective personal user information, the user information belonging to the user-related information;
and/or
performing similarity calculation on the first effective car trader information and classifying according to the similarity calculation result comprises: dividing vehicle source information that contains similar user information, and whose vehicle source information similarity reaches a second threshold, into second effective car trader information, the user information belonging to the user-related information.
Further,
the grade setting module includes a publication information acquisition unit and a user grade setting unit:
the publication information acquisition unit is configured to obtain, on a specified platform and at a specified period, the vehicle source information publication records of the users in the effective vehicle source information, the publication records including the publication time;
the user grade setting unit is configured to count, from the publication records obtained over multiple acquisitions, the number of times the user published vehicle source information and the time intervals between publications, and to set the user grade according to the number of publications and the time intervals.
Further, setting the user grade according to the number of publications and the time intervals comprises setting a grade for the personal users in the effective data:
the user grade is set to be positively correlated with the number of times the same user publishes vehicle source information for the same vehicle; and/or
the user grade is set to be negatively correlated with the time interval at which the same user publishes vehicle source information for the same vehicle; and/or
the user grade is set to be positively correlated with the number of times the user publishes vehicle source information; and/or
the user grade is set to be negatively correlated with the time interval at which the user publishes vehicle source information.
The present invention also provides an electronic equipment, including at least one processor and a memory communicatively connected to the at least one processor, characterized in that
the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to execute the method described above.
The vehicle source information processing method, system and equipment of the present invention have the following advantages:
Based on the characteristics of online vehicle-sale information and combined with the user-related information of the vehicle source information, data are classified through similarity calculation, with high efficiency and high accuracy;
By counting the number of publications and the publication time intervals of the users in the effective data, users are graded, which provides a basis for the business platform to set up targeted transaction promotion;
The data sources are broad: vehicle source information from multiple platforms is aggregated in a unified way and cross-compared, which improves the reliability and reference value of the data analysis.
Other features and advantages of the present invention will be set forth in the following description, and in part will become apparent from the description or be understood by practicing the invention. The objectives and other advantages of the invention can be realized and obtained by the structures pointed out in the description, the claims, and the drawings.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 shows a flow chart of the steps of obtaining, processing, and applying vehicle source information according to an embodiment of the present invention;
Fig. 2 shows a flow chart of the main steps of the vehicle source information processing method according to an embodiment of the present invention;
Fig. 3 shows a flow chart of vehicle source information classification according to an embodiment of the present invention;
Fig. 4 shows a structural schematic diagram of the vehicle source information processing system according to an embodiment of the present invention;
Fig. 5 shows a detailed structural schematic diagram of the vehicle source information processing system according to an embodiment of the present invention;
Fig. 6 shows a structural schematic diagram of the vehicle source information processing equipment according to an embodiment of the present invention.
Specific embodiment
In order to make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
The present invention provides a vehicle source information processing method. Taking used-car trading as an example, the embodiment illustrates the whole process of acquiring, processing, and applying used-car vehicle source information, that is, vehicle-sale information. The method is not limited to used-car trading; it can also be used, for example, for new-car trading. As shown in Fig. 1, the vehicle source information acquisition, processing, and application flow of the embodiment of the present invention comprises the following steps:
Step 1: obtaining vehicle source information from multiple platforms;
Step 2: summarizing vehicle source information;
Step 3: classifying to vehicle source information;
Step 4: setting vehicle source user grade;
Step 5: setting up transaction matching and promotion for the vehicle source information.
The detailed process of the above steps is described below.
Step 1: obtaining vehicle source information from multiple platforms. A search algorithm uses the high performance of a computer to purposefully enumerate some or all of the possible cases in a problem's solution space in order to find a solution to the problem. A web search starts from the home page of a website, recursively visits all accessible links of the site, collects the text information of each web page, and extracts vehicle-sale information from that text. In this embodiment, vehicle source information is obtained by searching the websites of multiple specified platforms. Specifically:
Recursively access the access links of the specified platform. In this embodiment the specified platforms are multiple used-car trading platforms, i.e. multiple websites. Recursively accessing the links of a specified platform means accessing all links on a web page and performing the same operation on every page that each link opens: all links on that page are accessed in turn, and so on, so that every linked page of the entire website can be reached.
Search the web page text under the links. For each link reached by the recursive access above, a text search is performed on the page it opens. A web page is a composite container holding pictures, video, text, and other information; the text information in the page is what is acquired.
Obtain the vehicle source information within a specified time range from the web page text. The vehicle source information in the web page text is extracted by predetermined keywords such as "vehicle, car, car owner, user name". When identifying web page links and vehicle source information, the text search can be carried out by configuring a regular expression corresponding to the specified platform.
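The link traversal and regular-expression extraction described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it walks an in-memory stand-in for a website (`FAKE_SITE`) instead of issuing HTTP requests, uses an explicit stack as the iterative equivalent of recursive access, and omits the time-range filter. The listing text layout, `LISTING_PATTERN`, and the helper names are hypothetical; a real platform would need its own expression.

```python
import re
from html.parser import HTMLParser

class LinkAndTextParser(HTMLParser):
    """Collects link targets and visible text fragments from one page."""
    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())

# Hypothetical per-platform expression (assumed listing layout "car: ...; tel: ...").
LISTING_PATTERN = re.compile(
    r"car:\s*(?P<car_name>[^;]+);\s*tel:\s*(?P<user_tel>\d{11})"
)

def crawl(fetch, start_url):
    """Visit every page reachable from start_url and return extracted listings."""
    seen, stack, listings = set(), [start_url], []
    while stack:
        url = stack.pop()
        if url in seen:
            continue
        seen.add(url)
        html = fetch(url)
        if html is None:  # unreachable or unknown page
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        stack.extend(parser.links)  # follow every link found on the page
        for text in parser.text_parts:
            for match in LISTING_PATTERN.finditer(text):
                listings.append({
                    "car_name": match["car_name"].strip(),
                    "user_tel": match["user_tel"],
                    "from_site": url,
                })
    return listings

# Usage with an in-memory site; a real fetch function would download the page.
FAKE_SITE = {
    "/": '<a href="/p1">one</a><a href="/p2">two</a>',
    "/p1": '<p>car: Sedan A 2015; tel: 13800000001</p><a href="/">home</a>',
    "/p2": '<p>no listing here</p><a href="/p1">one</a>',
}
listings = crawl(FAKE_SITE.get, "/")
```

The `seen` set keeps the traversal from looping when pages link back to each other, which is the usual case on a real site.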
Through targeted text search of the platforms, massive vehicle source information can be acquired automatically, and vehicle source information from multiple platforms can be obtained, which improves the richness and reliability of the data.
Step 2: summarizing vehicle source information. The vehicle source information obtained by searching multiple platform websites is varied and scattered; aggregating it into vehicle source information records of a unified format facilitates storage and subsequent analysis and classification. In this embodiment, a unified format record means that each vehicle source information record contains user information, vehicle information, source information, and additional information fields. The user information generally includes user identification information such as the user's mobile phone number; the additional information includes the business name, business license number, and so on of car trader users. In the embodiment of the present invention, this information containing user identity and identification belongs to the user-related information.
Illustratively, vehicle source information records are saved in the json data format. The overall data structure has four parts: userinfo (user information), carinfo (vehicle information), from (source information, i.e. the platform this information comes from), and otherinfo (additional information).
In an example record, "user_tel": 18510508304 is the user's mobile phone number, which serves as the user identification information in the embodiment of the present invention. "from_site": "www.xin.com" is the source information, indicating that the data come from the website "www.xin.com". In the additional information, "Dealer_name": "xin" indicates the car trader name and "Dealer_no": 111111111111 indicates the business license number. If a car trader name or business license number is obtained through the web page search, the vehicle source information may be car trader information.
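A record with the four-part structure described above might look like the following sketch. Only the values for `user_tel`, `from_site`, `Dealer_name`, and the business license number are quoted in the text; the `car_name` value is a placeholder assumption, the placement of each field under its part is inferred from the part names, and the license key is spelled `Dealer_no` to match the classification step.

```python
import json

# Illustrative record; the "car_name" value is a placeholder, not from the source.
record_json = """
{
  "userinfo": {"user_tel": 18510508304},
  "carinfo": {"car_name": "example vehicle"},
  "from": {"from_site": "www.xin.com"},
  "otherinfo": {"Dealer_name": "xin", "Dealer_no": 111111111111}
}
"""

record = json.loads(record_json)

# A car trader name or license number suggests the record is car trader information.
other = record["otherinfo"]
may_be_car_trader = bool(other.get("Dealer_name") or other.get("Dealer_no"))
```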
The data records in the above json format can be stored in a file or a database; in other embodiments, the information fields can also be stored directly in database fields or in a file.
In steps 1 and 2 above, information from multiple platforms is integrated into records of a unified format, which serve as the data source for the classification and grading (grade setting) operations. In other embodiments, records from a single platform or in a single format can be classified directly, in which case the acquisition and aggregation of steps 1 and 2 are not needed. This embodiment preferably analyzes and classifies data aggregated from multiple platforms: the data sources are broad, cross-comparison is performed during classification, and the results have high reference value.
The main steps of vehicle source information classification are shown in Fig. 2 and include:
classifying vehicle source information according to the similarity among multiple pieces of vehicle source information and the user-related information and vehicle information in the vehicle source information, to obtain effective vehicle source information and junk information;
obtaining, according to the user information in the effective vehicle source information, the time information of the user's publication of vehicle source information, and setting a grade for the user according to the time information.
The classification of vehicle source information and the grade setting are described in detail below.
Step 3: classifying the vehicle source information.
In the classification step, the records to be classified are taken from the saved records and analyzed. In this embodiment, classification is set to run once a day, using as classification objects the data whose saved entry time falls within 3 days (the day of calculation and the preceding days). Fig. 3 shows the classification flow of the embodiment of the present invention; the classification steps are as follows:
(1) Vehicle source information that contains no vehicle information or no user identification information is divided into first junk information, and the remaining information is divided into first effective information. In this embodiment, the user identification information is the user's telephone number, and the category division can be implemented by setting a category identification field. Filtering out part of the invalid junk vehicle source information by key fields improves the overall quality of the vehicle source information and helps improve the efficiency and accuracy of the subsequent similarity-based classification.
(2) The first effective information from step (1) is classified further: vehicle source information that contains car trader information (for example, a Dealer_name value or a Dealer_no value in the additional information) is divided into first effective car trader information, and vehicle source information without car trader information is divided into first effective personal user information. Dividing vehicle source information into car trader information and personal user information by the car trader fields is simple and efficient; it facilitates personalized service for different groups of platform users, and it also allows differentiated, targeted fine-grained classification of the vehicle source information of different users in subsequent steps.
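Steps (1) and (2) amount to two field checks per record. The sketch below assumes the nested record layout with `userinfo`, `carinfo`, and `otherinfo` parts; the function name `first_pass` and the sample records are illustrative assumptions.

```python
def first_pass(records):
    """Split records into first junk, first effective car trader, and
    first effective personal user information by key-field checks."""
    junk, car_trader, personal = [], [], []
    for rec in records:
        has_vehicle = bool(rec.get("carinfo"))
        has_user_id = bool(rec.get("userinfo", {}).get("user_tel"))
        if not (has_vehicle and has_user_id):
            junk.append(rec)  # step (1): missing vehicle or user identification
            continue
        other = rec.get("otherinfo", {})
        if other.get("Dealer_name") or other.get("Dealer_no"):
            car_trader.append(rec)  # step (2): car trader fields present
        else:
            personal.append(rec)
    return junk, car_trader, personal

# Usage with three hypothetical records.
sample = [
    {"userinfo": {}, "carinfo": {"car_name": "Sedan A"}},
    {"userinfo": {"user_tel": "13800000001"}, "carinfo": {"car_name": "Sedan B"},
     "otherinfo": {"Dealer_name": "xin"}},
    {"userinfo": {"user_tel": "13800000002"}, "carinfo": {"car_name": "Sedan C"}},
]
junk, car_trader, personal = first_pass(sample)
```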
(3) The first effective personal user information is classified further: similarity calculation is performed on the data to be classified, and the data are classified according to the similarity calculation result, the user information, and the vehicle information. Specifically:
The records in the data to be classified are compared pairwise. The string similarity of corresponding fields between every two pieces of vehicle source information is calculated. Similarity refers to the degree of difference between two character strings; in this embodiment, the Levenshtein algorithm is used for string similarity calculation. For any two vehicle source information records being compared, string similarity is computed field by field (for example user_tel, car_name) based on the Levenshtein algorithm. The similarity of identical strings is set to 100%, and the similarity of entirely different strings is 0%. The similarity of each field between the two records, such as the similarity of the vehicle name, is calculated first; the similarity of a combination of fields is then obtained by summation or weighted summation, for example the similarity of the vehicle information from the combination of the vehicle name and vehicle age fields; finally, the similarity of the entire vehicle source information record is obtained from all fields. Both the vehicle information similarity and the whole-record similarity can be used in the subsequent classification.
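The field-to-record similarity calculation described above can be sketched as follows. The text names the Levenshtein algorithm but does not say how the edit distance is turned into a percentage; dividing by the longer string's length is an assumption that yields 100% for identical strings and 0% for strings with nothing in common. The field name `car_age`, the flattened record layout, and the weights are also illustrative.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def field_similarity(a, b):
    """Similarity in [0, 1]; normalising by the longer length is an assumption."""
    a, b = str(a), str(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest

def combined_similarity(rec_a, rec_b, weights):
    """Weighted average of per-field similarities over the given fields."""
    total = sum(weights.values())
    return sum(w * field_similarity(rec_a.get(f, ""), rec_b.get(f, ""))
               for f, w in weights.items()) / total

# Usage: vehicle information similarity from vehicle name and vehicle age.
a = {"car_name": "Sedan A", "car_age": "3"}
b = {"car_name": "Sedan B", "car_age": "3"}
vehicle_sim = combined_similarity(a, b, {"car_name": 2.0, "car_age": 1.0})
```

The same `combined_similarity` call with all fields in `weights` gives the whole-record similarity, so the per-field results can be reused at both levels.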
According to the similarity calculation result, vehicle source information records that have the same mobile phone number (the user information in this embodiment), similar vehicle information, and a vehicle source information similarity reaching the first threshold are divided into second effective personal user information; other vehicle source information is divided into second junk information. In this embodiment, a similarity threshold for "similar vehicle information" can be set as needed, for example the vehicle information is considered similar when its similarity reaches 80%. The first threshold can also be set as needed; in this embodiment it is 80%.
(4) The first effective car trader information is classified further: the records in the data to be classified are compared pairwise, and string similarity is calculated with the Levenshtein algorithm in the same way as in step (3). No particular order is required between steps (3) and (4).
According to the similarity calculation result, data whose similarity reaches the second threshold and whose mobile phone numbers are similar are divided into second effective car trader information; other information is divided into second junk information. In this embodiment, the second threshold is 80%; mobile phone numbers are considered similar when their similarity reaches a certain threshold, such as 90%. In practice, the similarity thresholds can be adjusted as needed. Restricting car trader information to similar mobile phone numbers filters out some malicious order-brushing (fake listing) car trader information: such information is generally published as many vehicle source information records using different automatically generated virtual mobile phone numbers, whereas a normal car trader may have several work extension numbers with similar digits, so the similar-phone-number condition retains this portion of effective car trader information.
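The threshold rules of steps (3) and (4) can be written as two small decision functions over precomputed similarity scores. The threshold values are those given for this embodiment; the function names, the category labels, and the choice to decide per compared pair are illustrative assumptions, since the text does not specify how pair results are aggregated per record.

```python
FIRST_THRESHOLD = 0.80    # record similarity, personal users (this embodiment)
SECOND_THRESHOLD = 0.80   # record similarity, car traders (this embodiment)
VEHICLE_SIMILAR = 0.80    # vehicle information counts as similar from 80%
PHONE_SIMILAR = 0.90      # phone numbers count as similar from 90%

def classify_personal_pair(phone_a, phone_b, vehicle_sim, record_sim):
    """Step (3): same phone number, similar vehicle, record similarity >= first threshold."""
    if (phone_a == phone_b
            and vehicle_sim >= VEHICLE_SIMILAR
            and record_sim >= FIRST_THRESHOLD):
        return "second_effective_personal"
    return "second_junk"

def classify_car_trader_pair(phone_sim, record_sim):
    """Step (4): similar phone numbers and record similarity >= second threshold."""
    if phone_sim >= PHONE_SIMILAR and record_sim >= SECOND_THRESHOLD:
        return "second_effective_car_trader"
    return "second_junk"

# Usage with hypothetical similarity scores.
personal_result = classify_personal_pair("13800000001", "13800000001", 0.85, 0.82)
trader_result = classify_car_trader_pair(phone_sim=0.91, record_sim=0.88)
brushing_result = classify_car_trader_pair(phone_sim=0.40, record_sim=0.95)
```

The last call illustrates the order-brushing case: near-identical listings published under unrelated phone numbers fail the phone-similarity condition and fall into the second junk information.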
Step (3), (4), which combine, sells vehicle information characteristics, and for different user type, targetedly further extracting can The high effective vehicle source information of reliability, being capable of effective filtering fallacious, low value vehicle source information.
Classifying by similarity comparison makes the automated classification finer-grained and more accurate. Because similarity is calculated first for fields and then for records, the calculation results are highly reusable: the fields, field combinations, and similarity thresholds that take part in classification can be chosen flexibly as needed, so the specific classification strategy can be adjusted and optimized as platform data changes and classification experience accumulates.
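As an illustration of how field-level results might be reused under different strategies, the sketch below combines precomputed field similarities into a record similarity. The weighted mean, the field names, and the weights are assumptions made for this example; the embodiment does not specify how field results are combined.

```python
def record_similarity(field_scores, weights):
    """Weighted mean of the field similarities named in `weights`."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one field must have a positive weight")
    return sum(field_scores[name] * w for name, w in weights.items()) / total


# Field similarities for one pair of records, computed once and reused.
field_scores = {"brand": 1.0, "model": 0.9, "price": 0.5, "description": 0.7}

strategy_a = {"brand": 1, "model": 1}              # compare vehicle identity only
strategy_b = {"brand": 1, "model": 1, "price": 2}  # give price extra weight

score_a = record_similarity(field_scores, strategy_a)
score_b = record_similarity(field_scores, strategy_b)
threshold = 0.8
print(score_a >= threshold, score_b >= threshold)
```

Changing the strategy only changes the weights and the threshold; the field similarities themselves do not need to be recomputed.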
Finally, the second effective personal user information and the second effective car trader information are taken as the effective vehicle source information. The effective vehicle source information data can be merged by a program (interface) into the business center to which effective result sets are distributed, where it serves as preliminary basic data for the business system.
Before distribution to the business center, newly obtained vehicle source information is filtered against the junk information. Each record to be distributed is compared with the records marked as junk information, in particular the second junk information, to check whether it already exists in the junk information. Vehicle source information already marked as junk information is excluded, and records marked as junk information are not used as basic business data, which further reduces junk information and improves the accuracy of effective information acquisition.
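The pre-distribution check can be sketched as a lookup against the set of junk records. The record key used here (phone number plus whitespace- and case-normalized vehicle text) and the exact-match comparison are assumptions for illustration; the embodiment only states that a record is checked for presence in the junk information.

```python
def record_key(record):
    """Assumed identity of a record: phone number plus normalized vehicle text."""
    vehicle = " ".join(record["vehicle"].lower().split())
    return (record["phone"], vehicle)


def filter_against_junk(new_records, junk_records):
    """Drop new records whose key already appears among the junk records."""
    junk_keys = {record_key(r) for r in junk_records}
    return [r for r in new_records if record_key(r) not in junk_keys]


# Hypothetical data for illustration only.
second_junk = [{"phone": "17000000000", "vehicle": "BMW  320i 2015"}]
incoming = [
    {"phone": "17000000000", "vehicle": "bmw 320i 2015"},  # already marked as junk
    {"phone": "13800000001", "vehicle": "Audi A4L 2016"},
]
to_distribute = filter_against_junk(incoming, second_junk)
print(to_distribute)
```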
Step 4: set the vehicle source user grade. The user grade is set according to the number of times vehicle source information is published and the interval between publications.
(1) Using the user information (i.e. the valid user information) in the effective vehicle source information from Step 3 (the second effective personal user information and the second effective car trader information), scan the specified platforms at a specified period to obtain the records of vehicle source information published by those valid users. In this embodiment, all public information on the specified platforms is scanned once every half hour, and the web search method described above is used to obtain the vehicle source information newly published within a certain time, for example the vehicle source information published in the last half hour, together with the publication time of each record. Searching on the basis of the effective vehicle source information is well targeted and avoids acquiring and processing invalid information. Acquiring effective vehicle source information in near real time at a short period also ensures that the data obtained is more timely and more useful.
(2) In the acquired publication history, set the user grade according to the number of publications by the user and the interval between them. The ways of setting the grade for second effective personal users and for second effective car traders are described separately below:
Vehicle source information published by second effective personal users: the user grade is set to be positively correlated with the number of times the same user publishes vehicle source information for the same vehicle, and negatively correlated with the time interval at which the same user publishes vehicle source information for the same vehicle. Positive correlation means that the user grade rises as the number increases; negative correlation means that it falls as the time interval increases.
The number of times and the time interval at which a user publishes the same vehicle information are looked up and calculated. The more times the same user publishes vehicle source information for the same vehicle and the shorter the interval, the higher the user grade; conversely, the fewer the publications of the same vehicle and the longer the interval, the lower the user grade. In addition, if a user repeatedly publishes information about different vehicles, the user grade is reduced. In practice, a score can be calculated from the counted number and frequency of publications and stored in a user grade field.
Vehicle source information published by second effective car traders: the number of times and the interval at which the car trader publishes vehicle source information are looked up and calculated; for data published by car traders, whether the vehicle information is the same is not considered. A car trader with more publications and shorter intervals is given a higher user grade.
Judging the user grade of vehicle source information from the two indicators of publication count and frequency (time interval) together is highly reliable and makes it easy to extract the vehicle source information of high-quality users.
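One possible way to turn the two indicators into a score is sketched below. The formula `count / (1 + mean gap in days)`, the penalty for posting different vehicles, and all names are assumptions chosen only so that the score rises with the publication count and falls with the interval; the embodiment does not prescribe a formula.

```python
from collections import defaultdict
from datetime import datetime


def frequency_score(times):
    """Rises with the number of posts and falls as the mean gap (in days) grows."""
    times = sorted(times)
    if len(times) < 2:
        return 0.0  # a single post gives no interval to measure
    mean_gap_days = (times[-1] - times[0]).total_seconds() / 86400 / (len(times) - 1)
    return len(times) / (1.0 + mean_gap_days)


def personal_user_score(posts, penalty=1.0):
    """posts: list of (vehicle_id, datetime). Scores the most-posted vehicle and
    subtracts a penalty for every additional, different vehicle."""
    if not posts:
        return 0.0
    by_vehicle = defaultdict(list)
    for vehicle_id, posted_at in posts:
        by_vehicle[vehicle_id].append(posted_at)
    main_times = max(by_vehicle.values(), key=len)
    return frequency_score(main_times) - penalty * (len(by_vehicle) - 1)


def car_trader_score(posts):
    """Car traders: all posts count, whether or not the vehicle is the same."""
    return frequency_score([posted_at for _, posted_at in posts])


def day(d):
    return datetime(2019, 6, d)


# Hypothetical publication histories.
keen_seller = [("car-1", day(1)), ("car-1", day(2)), ("car-1", day(3)), ("car-1", day(4))]
slow_seller = [("car-1", day(1)), ("car-1", day(10))]
mixed_seller = keen_seller + [("car-2", day(5))]
trader = [("car-1", day(1)), ("car-2", day(2)), ("car-3", day(3))]

print(personal_user_score(keen_seller), personal_user_score(slow_seller))
print(personal_user_score(mixed_seller), car_trader_score(trader))
```

The resulting number could be stored directly in a user grade field or mapped to discrete grades; either choice is outside what the embodiment specifies.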
Step 5: transaction matchmaking and promotion settings for vehicle source information. Through the above operations, the vehicle source information has been classified and the user grades have been set. Based on the classified data, transaction matchmaking and promotion are configured on the vehicle service transaction platform. In this embodiment, transaction matchmaking means bringing the parties to a transaction together: the higher the user grade, the more reliable the published vehicle sale information is considered to be and the stronger the intention to sell. Providing vehicle source information and users that show a strong intention to sell with a better transaction platform, preferential display, and the like promotes the conclusion of transactions.
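Preferential display could, for example, be realized by ordering listings by the grade of the publishing user. The sketch below assumes a numeric grade per user and treats users without a grade as grade 0; both the data layout and the default are hypothetical.

```python
def order_for_display(listings, user_grades):
    """Sort listings so that higher-grade users come first; ties keep input order."""
    return sorted(listings, key=lambda item: user_grades.get(item["user"], 0), reverse=True)


# Hypothetical grades and listings.
user_grades = {"alice": 3, "bob": 1, "dealer-x": 5}
listings = [
    {"id": 101, "user": "bob"},
    {"id": 102, "user": "dealer-x"},
    {"id": 103, "user": "alice"},
    {"id": 104, "user": "unknown"},
]
ordered_ids = [item["id"] for item in order_for_display(listings, user_grades)]
print(ordered_ids)
```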
The above operations increase the success rate of vehicle transactions and achieve a three-way win: sellers can sell their vehicles quickly at a reasonable price, buyers can more easily purchase cost-effective vehicles, and the transaction platform gains reputation and order conversions. Moreover, the vehicle sources of high-grade car traders have great transaction potential, so such car traders reach cooperation with the platform more easily, which expands the platform's coverage of high-quality car traders.
Based on the same inventive concept, an embodiment of the present invention further provides a vehicle source information processing system. As shown in Fig. 4, the system includes a vehicle source information classification module and a grade setting module.
The vehicle source information classification module is configured to classify vehicle source information according to the similarity of multiple pieces of vehicle source information and the user-related information and vehicle information in the vehicle source information, obtaining effective vehicle source information and junk information;
The grade setting module is configured to obtain, according to the user information in the effective vehicle source information, time information on the user's publication of vehicle source information, and to set the user's grade according to the time information.
In this embodiment, the system obtains the multiple pieces of vehicle source information by searching the specified platforms. During the search, the access links of a specified platform are visited recursively and vehicle source information is obtained by text search, as described in the method embodiment above; the information acquired from multiple platforms is consolidated into vehicle source information records in a specified format.
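The recursive traversal of access links can be illustrated with an in-memory stand-in for a platform. This sketch makes no network requests: `fake_site`, the `fetch` callback, and the page layout (`links`, `listings`) are hypothetical, and a real crawler would download pages and extract listings by text search instead.

```python
def collect_listings(url, fetch, visited=None):
    """Depth-first, recursive traversal of links starting from `url`."""
    if visited is None:
        visited = set()
    if url in visited:  # guard against link cycles
        return []
    visited.add(url)
    page = fetch(url)
    # Tag each listing with the page it came from, as part of a unified record.
    found = [dict(item, source=url) for item in page.get("listings", [])]
    for link in page.get("links", []):
        found.extend(collect_listings(link, fetch, visited))
    return found


# In-memory stand-in for a platform; all pages and titles are made up.
fake_site = {
    "/": {"links": ["/cars", "/about"], "listings": []},
    "/cars": {"links": ["/cars/page2", "/"], "listings": [{"title": "Audi A4L 2016"}]},
    "/cars/page2": {"links": ["/cars"], "listings": [{"title": "BMW 320i 2015"}]},
    "/about": {"links": [], "listings": []},
}
records = collect_listings("/", fake_site.__getitem__)
titles = [r["title"] for r in records]
print(titles)
```

The `visited` set matters because listing pages commonly link back to their parents; without it the recursion would not terminate.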
The composition and functions of the system are further described below with reference to Fig. 5. As shown in Fig. 5:
The vehicle source information classification module includes a first junk information screening unit, a first effective information classification unit, a similarity calculation unit, a first effective car trader information classification unit, and a first effective personal user information classification unit;
The first junk information screening unit is configured to classify vehicle source information that has no vehicle information or no user identification information as first junk information, and to classify the remaining information as first effective information. In this embodiment, the user information generally includes user identification information such as the user's mobile phone number, and the additional information includes the car trader's company name, business license number, and the like; all such information that carries a user's identity or identification is user-related information in the embodiments of the present invention.
The first effective information classification unit is configured to divide the vehicle source information in the first effective information into categories: vehicle source information containing car trader information is classified as first effective car trader information, and vehicle source information not containing car trader information is classified as first effective personal user information; the car trader information is part of the user-related information;
The similarity calculation unit is configured to perform similarity calculation on the first effective car trader information, and the first effective car trader information classification unit is configured to classify according to the similarity calculation results. As in the embodiment above, the similarity calculation may apply the Levenshtein algorithm to each field of a record and combine the field similarity results into a record similarity result.
The similarity calculation unit is also configured to perform similarity calculation on the first effective personal user information, and the first effective personal user information classification unit is configured to classify according to the similarity calculation results.
Performing similarity calculation on the first effective personal user information and classifying according to the similarity calculation results includes: classifying vehicle source information that contains the same user information and similar vehicle information and whose vehicle source information similarity reaches the first threshold (such as 80%) as second effective personal user information, and classifying other vehicle source information as second junk information;
Performing similarity calculation on the first effective car trader information and classifying according to the similarity calculation results includes: classifying vehicle source information that contains similar user information (for example, with a similarity reaching 90%) and whose vehicle source information similarity reaches the second threshold (such as 80%) as second effective car trader information, and classifying other vehicle source information as second junk information.
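The first-stage screening and category division performed by the first junk information screening unit and the first effective information classification unit can be sketched as a single pass over the records. The field names (`vehicle`, `phone`, `company`, `license_no`) are hypothetical stand-ins for the vehicle information, the user identification information, and the car trader information.

```python
def first_stage_classify(records):
    """Split records into first junk, first effective car trader, and
    first effective personal user information."""
    junk, traders, personal = [], [], []
    for record in records:
        if not record.get("vehicle") or not record.get("phone"):
            junk.append(record)        # no vehicle info or no user identification
        elif record.get("company") or record.get("license_no"):
            traders.append(record)     # carries car trader information
        else:
            personal.append(record)
    return junk, traders, personal


# Hypothetical records for illustration.
sample = [
    {"vehicle": "Audi A4L 2016", "phone": "13800000001"},
    {"vehicle": "", "phone": "13800000002"},
    {"vehicle": "Toyota Camry 2017", "phone": "01088880001", "company": "Hongda Auto"},
    {"vehicle": "BMW 320i 2015"},
]
junk, traders, personal = first_stage_classify(sample)
print(len(junk), len(traders), len(personal))
```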
Classifying vehicle source information step by step improves classification efficiency. Classifying vehicle source information by user type and applying the corresponding follow-up classification to each type makes the classification more intelligent and the automatic classification results more accurate, more reliable, and closer to reality.
Classifying by similarity comparison makes the automated classification finer-grained and more accurate. Because similarity is compared first at field level and then at record level, the similarity calculation results are highly reusable: fields, field combinations, and their similarity thresholds can be chosen flexibly as needed, so the specific classification strategy can be adjusted and optimized as platform data changes and classification experience accumulates. The system distributes the effective vehicle source information to the business center for use as preliminary basic data. In this embodiment, preferably the second effective personal user information and the second effective car trader information are distributed to the business center; before distribution, the system also filters newly obtained vehicle source information against the second junk information and excludes vehicle source information already marked as junk information.
The grade setting module includes a publication information acquisition unit and a user grade setting unit:
The publication information acquisition unit is configured to obtain, on the specified platforms at a specified period, the vehicle source information publication records of the users in the effective vehicle source information, the publication records including publication times;
The user grade setting unit is configured to count, from the publication records obtained over multiple acquisitions, the number of times and the time interval at which a user publishes vehicle source information, and to set the user grade according to the number of publications and the time interval.
Setting the user grade according to the number of publications and the time interval includes setting the grade for personal users in the valid data:
the user grade is set to be positively correlated with the number of times the same user publishes vehicle source information for the same vehicle;
the user grade is set to be negatively correlated with the time interval at which the same user publishes vehicle source information for the same vehicle;
and setting the grade for car traders in the valid data:
the user grade is set to be positively correlated with the number of times the user publishes vehicle source information;
the user grade is set to be negatively correlated with the time interval at which the user publishes vehicle source information.
Periodic search and acquisition based on the effective vehicle source information is well targeted and ensures that the data obtained is more timely and more useful.
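A periodic acquisition pass can be sketched as a filter that keeps only posts by known valid users published within the last scan window. The half-hour window follows the method embodiment; the record layout and function name are assumptions, and scheduling the scan itself is left out.

```python
from datetime import datetime, timedelta


def select_new_posts(posts, valid_phones, now, window=timedelta(minutes=30)):
    """Keep posts by known valid users published within the last scan window."""
    earliest = now - window
    return [p for p in posts
            if p["phone"] in valid_phones and earliest < p["posted_at"] <= now]


# Hypothetical scan time and posts.
now = datetime(2019, 6, 21, 12, 0)
valid_phones = {"13800000001"}
posts = [
    {"phone": "13800000001", "posted_at": datetime(2019, 6, 21, 11, 45)},  # kept
    {"phone": "13800000001", "posted_at": datetime(2019, 6, 21, 11, 10)},  # too old
    {"phone": "17000000000", "posted_at": datetime(2019, 6, 21, 11, 50)},  # not a valid user
]
new_posts = select_new_posts(posts, valid_phones, now)
print(new_posts)
```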
Judging the user grade of vehicle source information from the two indicators of publication count and frequency (time interval) together is highly reliable and makes it easy to extract the vehicle source information of high-quality users.
Based on the same inventive concept, an embodiment of the present invention further provides an electronic device whose structure is shown in Fig. 6. The electronic device includes:
At least one processor (one processor is taken as an example in Fig. 6, but the device is not limited to one processor) and a memory; the device may further include a communication interface and a bus. The processor, the communication interface, and the memory can communicate with one another through a data connection. The communication interface can be used for information transmission. The processor can call the logic instructions in the memory to execute the vehicle source information processing method of the above embodiments.
In addition, when the logic instructions in the memory are implemented in the form of software functional units and sold or used as an independent product, they can be stored in a computer-readable storage medium.
As a computer-readable storage medium, the memory can be used to store software programs and computer-executable programs, such as the program instructions/modules corresponding to the method in the embodiments of the present invention. By running the software programs, instructions, and modules stored in the memory, the processor executes functional applications and data processing, that is, implements the vehicle source information processing method of the above method embodiments.
The memory may include a program storage area and a data storage area, where the program storage area can store an operating system and the application program required for at least one function, and the data storage area can store data created through use of the terminal device. In addition, the memory may include high-speed random access memory and may also include non-volatile memory.
The vehicle source information processing method, system, and device provided in the embodiments of the present invention can classify vehicle source information and grade users accurately and efficiently, providing a data basis for the business system to carry out transaction promotion. They can also consolidate data from multiple platforms to achieve cross-validation, further improving the authenticity and reliability of the classification results.
Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (15)

1. A vehicle source information processing method, characterized by comprising:
classifying vehicle source information according to the similarity of multiple pieces of vehicle source information and the user-related information and vehicle information in the vehicle source information, to obtain effective vehicle source information and junk information;
obtaining, according to the user information in the effective vehicle source information, time information on the user's publication of vehicle source information, and setting the user's grade according to the time information.
2. The method according to claim 1, characterized in that classifying vehicle source information according to the similarity of multiple pieces of vehicle source information and the user-related information and vehicle information in the vehicle source information, to obtain effective vehicle source information and junk information, comprises:
classifying vehicle source information that has no vehicle information or no user identification information as first junk information, and classifying the remaining information as first effective information;
dividing the vehicle source information in the first effective information into categories: classifying vehicle source information containing car trader information as first effective car trader information, and classifying vehicle source information not containing car trader information as first effective personal user information, the car trader information being part of the user-related information;
performing similarity calculation on the first effective car trader information, and classifying according to the similarity calculation results;
performing similarity calculation on the first effective personal user information, and classifying according to the similarity calculation results.
3. The method according to claim 2, characterized in that performing similarity calculation on the first effective personal user information and classifying according to the similarity calculation results comprises:
classifying vehicle source information that contains the same user information and similar vehicle information and whose vehicle source information similarity reaches a first threshold as second effective personal user information, the user information being part of the user-related information.
4. The method according to claim 2, characterized in that performing similarity calculation on the first effective car trader information and classifying according to the similarity calculation results comprises:
classifying vehicle source information that contains similar user information and whose vehicle source information similarity reaches a second threshold as second effective car trader information, the user information being part of the user-related information.
5. The method according to claim 1 or 4, characterized in that obtaining, according to the user information in the effective vehicle source information, time information on the user's publication of vehicle source information, and setting the user's grade according to the time information, comprises:
obtaining, on the specified platforms at a specified period, the vehicle source information publication records of the users in the effective vehicle source information, the publication records including publication times;
counting, from the publication records obtained over multiple acquisitions, the number of times and the time interval at which a user publishes vehicle source information, and setting the user grade according to the number of publications and the time interval.
6. The method according to claim 5, characterized in that setting the user grade according to the number of publications and the time interval comprises setting the grade for personal users in the valid data:
setting the user grade to be positively correlated with the number of times the same user publishes vehicle source information for the same vehicle;
setting the user grade to be negatively correlated with the time interval at which the same user publishes vehicle source information for the same vehicle.
7. The method according to claim 5, characterized in that setting the user grade according to the number of publications and the time interval comprises setting the grade for car traders in the valid data:
setting the user grade to be positively correlated with the number of times the user publishes vehicle source information;
setting the user grade to be negatively correlated with the time interval at which the user publishes vehicle source information.
8. The method according to claim 1, characterized in that the method further comprises:
obtaining the vehicle source information from multiple platforms and consolidating it into vehicle source information records in a unified format;
wherein obtaining the vehicle source information from multiple platforms comprises:
recursively visiting the access links of a specified platform;
searching the web page text under the links;
obtaining the vehicle source information within a specified time range from the web page text.
9. The method according to any one of claims 1-4, characterized by further comprising:
filtering newly obtained vehicle source information according to the junk information.
10. A vehicle source information processing system, characterized by comprising:
a vehicle source information classification module, configured to classify vehicle source information according to the similarity of multiple pieces of vehicle source information and the user-related information and vehicle information in the vehicle source information, to obtain effective vehicle source information and junk information;
a grade setting module, configured to obtain, according to the user information in the effective vehicle source information, time information on the user's publication of vehicle source information, and to set the user's grade according to the time information.
11. The system according to claim 10, characterized in that
the vehicle source information classification module includes a first junk information screening unit, a first effective information classification unit, a similarity calculation unit, a first effective car trader information classification unit, and a first effective personal user information classification unit;
the first junk information screening unit is configured to classify vehicle source information that has no vehicle information or no user identification information as first junk information, and to classify the remaining information as first effective information;
the first effective information classification unit is configured to divide the vehicle source information in the first effective information into categories: vehicle source information containing car trader information is classified as first effective car trader information, and vehicle source information not containing car trader information is classified as first effective personal user information, the car trader information being part of the user-related information;
the similarity calculation unit is configured to perform similarity calculation on the first effective car trader information, and the first effective car trader information classification unit is configured to classify according to the similarity calculation results;
the similarity calculation unit is configured to perform similarity calculation on the first effective personal user information, and the first effective personal user information classification unit is configured to classify according to the similarity calculation results.
12. The system according to claim 11, characterized in that
performing similarity calculation on the first effective personal user information and classifying according to the similarity calculation results comprises: classifying vehicle source information that contains the same user information and similar vehicle information and whose vehicle source information similarity reaches a first threshold as second effective personal user information, the user information being part of the user-related information;
and/or
performing similarity calculation on the first effective car trader information and classifying according to the similarity calculation results comprises: classifying vehicle source information that contains similar user information and whose vehicle source information similarity reaches a second threshold as second effective car trader information, the user information being part of the user-related information.
13. The system according to any one of claims 10-12, characterized in that the grade setting module includes a publication information acquisition unit and a user grade setting unit:
the publication information acquisition unit is configured to obtain, on the specified platforms at a specified period, the vehicle source information publication records of the users in the effective vehicle source information, the publication records including publication times;
the user grade setting unit is configured to count, from the publication records obtained over multiple acquisitions, the number of times and the time interval at which a user publishes vehicle source information, and to set the user grade according to the number of publications and the time interval.
14. The system according to claim 13, characterized in that setting the user grade according to the number of publications and the time interval comprises setting the grade for personal users in the valid data, including:
setting the user grade to be positively correlated with the number of times the same user publishes vehicle source information for the same vehicle; and/or
setting the user grade to be negatively correlated with the time interval at which the same user publishes vehicle source information for the same vehicle; and/or
setting the user grade to be positively correlated with the number of times the user publishes vehicle source information; and/or
setting the user grade to be negatively correlated with the time interval at which the user publishes vehicle source information.
15. An electronic device, comprising at least one processor and a memory communicatively connected to the at least one processor, characterized in that
the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to perform the method according to any one of claims 1-9.
CN201910544433.6A 2019-06-21 2019-06-21 A kind of vehicle source information processing method, system and equipment Pending CN110427963A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910544433.6A CN110427963A (en) 2019-06-21 2019-06-21 A kind of vehicle source information processing method, system and equipment

Publications (1)

Publication Number Publication Date
CN110427963A true CN110427963A (en) 2019-11-08

Family

ID=68409367

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910544433.6A Pending CN110427963A (en) 2019-06-21 2019-06-21 A kind of vehicle source information processing method, system and equipment

Country Status (1)

Country Link
CN (1) CN110427963A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112686728A (en) * 2020-12-30 2021-04-20 上海瑞家信息技术有限公司 House resource information display method and device, electronic equipment and computer readable medium
CN113822544A (en) * 2021-08-31 2021-12-21 五八有限公司 Data processing method and system, electronic device and readable medium
CN114063871A (en) * 2021-10-22 2022-02-18 北京五八信息技术有限公司 Data processing method and device, electronic equipment and readable medium
CN115482014A (en) * 2022-09-15 2022-12-16 广东数鼎科技有限公司 Method and device for identifying false vehicle source of second-hand vehicle
CN115496440A (en) * 2022-09-15 2022-12-20 广东数鼎科技有限公司 Method and device for determining second-hand car inventory

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104133818A (en) * 2013-05-04 2014-11-05 白银博德信通科技有限公司 Automobile historical data analysis method and automobile historical data analysis system based on Internet of vehicles
CN104408125A (en) * 2014-11-26 2015-03-11 车智互联(北京)科技有限公司 Method and system for screening vehicles based on vehicle source brands, vehicle series and vehicle models
CN106022787A (en) * 2016-04-25 2016-10-12 王琳 People-vehicle multifactorial assessment method and system based on big data
CN106096044A (en) * 2016-06-28 2016-11-09 江苏车置宝信息科技股份有限公司 The Internet used car industry junk data recognition methods
CN106355420A (en) * 2016-08-30 2017-01-25 江苏车置宝信息科技股份有限公司 Customer data quality identification and automatic order distribution system
CN106485566A (en) * 2016-09-12 2017-03-08 北京易车互联信息技术有限公司 A kind of information recommendation method and device
WO2017073879A1 (en) * 2015-10-26 2017-05-04 비씨카드(주) Method and server for providing records of credit card sales of affiliated store

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112686728A (en) * 2020-12-30 2021-04-20 上海瑞家信息技术有限公司 House resource information display method and device, electronic equipment and computer readable medium
CN112686728B (en) * 2020-12-30 2023-10-24 上海瑞家信息技术有限公司 House source information display method, device, electronic equipment and computer readable medium
CN113822544A (en) * 2021-08-31 2021-12-21 五八有限公司 Data processing method and system, electronic device and readable medium
CN113822544B (en) * 2021-08-31 2023-09-01 北京爱上车科技有限公司 Data processing method, system, electronic device and readable medium
CN114063871A (en) * 2021-10-22 2022-02-18 北京五八信息技术有限公司 Data processing method and device, electronic equipment and readable medium
CN115482014A (en) * 2022-09-15 2022-12-16 广东数鼎科技有限公司 Method and device for identifying false vehicle source of second-hand vehicle
CN115496440A (en) * 2022-09-15 2022-12-20 广东数鼎科技有限公司 Method and device for determining second-hand car inventory

Similar Documents

Publication Publication Date Title
CN110427963A (en) A kind of vehicle source information processing method, system and equipment
JP6379093B2 (en) Product identifier labeling and product navigation
CN108256052A (en) Method for identifying potential customers in the automobile industry based on tri-training
US20110145234A1 (en) Search method and system
CN105095187A (en) Search intention identification method and device
CN109816482B (en) Knowledge graph construction method, device and equipment of e-commerce platform and storage medium
CN107818105A (en) Application program recommendation method and server
KR101835333B1 (en) Method for providing face recognition service in order to find out aging point
CN109670852A (en) User classification method, device, terminal and storage medium
CN107016387A (en) Method and device for recognizing labels
CN109597858B (en) Merchant classification method and device and merchant recommendation method and device
CN106326441A (en) Information recommendation method and device
CN105469262B (en) Commodity information acquisition and integration method and device
CN104766062A (en) Face recognition system and registration and recognition method based on a lightweight intelligent terminal
CN111915400B (en) Personalized clothing recommendation method and device based on deep learning
CN109816134A (en) Shipping address prediction method, device and storage medium
CN109388743A (en) Method and apparatus for determining a language model
CN111882403A (en) Financial service platform intelligent recommendation method based on user data
CN110390025A (en) Cover image determination method, apparatus, device and computer-readable storage medium
CN107609902A (en) Method and device for displaying targeted advertisements
CN107861970A (en) Commodity picture search method and device
CN107656918B (en) Method and device for obtaining target users
CN107833088A (en) Content providing method, device and smart device
CN106326318A (en) Search method and device
CN107861971A (en) Product search method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191108
