NL2035432A

NL2035432A - Computer-implemented method for identifying origin of goods by fusing truck trajectory and poi data

Info

Publication number: NL2035432A
Application number: NL2035432A
Authority: NL
Inventors: Liu Fangming; Yang Yanbo; Kuang Haibo; Sun Siyuan; Wang Zongyao; Jia Peng; Zhang Yuansheng
Original assignee: Univ Dalian Maritime
Priority date: 2022-07-21
Filing date: 2023-07-20
Publication date: 2024-01-29
Also published as: CN116029624A; CN116029624B

Abstract

Disclosed is a computer-implemented method for identifying the origin of goods by fusing truck trajectory and POI data, including: acquiring truck travel trajectory data; performing data preprocessing according to the truck travel trajectory data to obtain a feature dataset, the feature dataset including the number of vehicles stopped over at a certain place, the average speed of vehicles at a certain place, and the average residence time of vehicles at a certain place; obtaining a vehicle passing point feature set by data analysis according to the feature dataset; screening the vehicle passing point feature set to obtain a cargo owner common feature set; and performing statistical analysis on the cargo owner common feature set to obtain required cargo owner geographical location information.

Description

P1847 /NL

COMPUTER-IMPLEMENTED METHOD FOR IDENTIFYING ORIGIN OF GOODS BY FUSING TRUCK TRAJECTORY AND POI DATA

TECHNICAL FIELD

The present invention relates to the field of transportation big data and intelligent transportation, and more particularly to a computer-implemented method for identifying the origin of goods by fusing truck trajectory and POI data.

BACKGROUND ART

With the development of technology, onboard devices of trucks provide massive truck trajectory data resources based on big data services. The trajectory data records the longitude and latitude information, time information, travel speed, and other information about trucks. However, due to the low degree of informatization in the logistics industry at present, these data have not been fully analyzed and utilized. There is little research on, and few applications of, truck trajectory data in China, and the common research direction is to predict and estimate the travel time and the spatial distribution of passenger demand using vehicle trajectory data. Trajectory data of public transportation is usually combined with swipe card data to identify travel features of passengers.

The identification of boarding stations is mature at present; however, research on identifying travel endpoints of vehicles using vehicle trajectory data is still in its infancy. In one approach, trajectory data of freight vehicles is analyzed to determine the travel paths of freight vehicles. The trajectory points are snapped to the adjacent road section with topological relations using GIS technology, the linear trajectory of vehicles along the road is generated using the shortest path algorithm, and finally, the complete freight vehicle travel trajectory data is obtained. However, this method is suitable only for traffic surveys with a small scope and a small amount of data, and is not suitable for current big data analysis. In addition, the freight travel endpoints obtained by this method have low accuracy.

Hold points of freight trucks are determined using the freight truck trajectory data; by combining the travel rules of freight trucks and locations of major logistics nodes, the truck travel endpoint data of the main logistics node areas of the city is obtained; and structural analysis is performed on the data to obtain the urban logistics space and time characteristics. However, the method is applicable to a small area and requires a known location of the main origin and destination of the freight trucks.

When the area through which the freight trucks pass increases, it is difficult to label all possible locations in the area, so it is difficult to apply the method to a wide range of freight truck travel endpoint identification.

In the related art, a method for extracting OD information using freight truck trajectory data is also included, where the parking points in the freight truck travel trajectory are identified based on the determination of the parking speed thresholds of freight trucks, and finally, the freight OD points are extracted using a map matching method and clustered to analyze the development status of urban agglomeration. This method has certain limitations: it fails to obtain the logistics-related information about the hold points, such as the number of arriving vehicles and residence time, so that only macroscopic analysis can be carried out on the results, and it is impossible to make a specific analysis of the logistics situation of the cargo owner at the place.

In some inventions and studies in China, although the identification of vehicle hold points has been realized, the location information about real cargo owners cannot be identified. There are many types of hold points, such as gas stations, toll stations, and service areas, which will interfere with the real cargo owner information and make the entire data analysis process more complex.

Research on truck trajectory data started earlier in other countries, but there is no mature related research method at present. The known information about rest stations and gas stations is used to identify behaviors such as rest and refueling in the vehicle trajectory data. However, with the expansion of the research area, the integrated data of the parking points corresponding to the parking events caused by other reasons becomes increasingly difficult to distinguish. A combination of three variables (the truck residence time, the distance to the road network, and the amount of course change) is currently used to identify the travel endpoints of freight trucks. In the trajectory data, a small course change indicates that the vehicle has taken a second stop, since this change is to facilitate the truck travel along the route to the destination. However, course change variables may be affected by road topology structures (e.g., curves), leading to false classifications, so this method has a problem of low accuracy in the practical application of identifying the travel endpoints of freight trucks.

In summary, with the development of technology, onboard devices of trucks provide massive truck trajectory data resources based on big data services. Due to the low degree of informatization in the logistics industry at present, these data have not been fully analyzed and utilized. The non-cargo owner geographic location data cannot be removed from the complicated and huge geographic location data information, and the cargo owner geographic location data cannot be retained. There is a problem of low accuracy in the practical application of identifying the travel endpoints of freight trucks.

SUMMARY

The present invention provides a computer-implemented method for identifying the origin of goods by fusing truck trajectory and POI data to overcome the problems that, based on a large amount of truck trajectory data resources provided by big data services, the non-cargo owner geographic location data cannot be removed from the complicated and huge geographic location data information and the cargo owner geographic location data cannot be retained, and the accuracy is low in the practical application of identifying the travel endpoints of freight trucks.

In order to achieve the above object, the present invention adopts the following technical solutions.

A computer-implemented method for identifying the origin of goods by fusing truck trajectory and POI data includes:

step 1: acquiring truck travel trajectory data, and the truck travel trajectory data at least including a vehicle license plate number, vehicle mileage, a longitude and latitude of a truck in traveling, truck travel time, and truck travel speed;

step 2: performing data preprocessing according to the truck travel trajectory data to obtain a feature dataset, and the feature dataset including the number of vehicles stopped over at a certain place, average speed of vehicles at a certain place, and average residence time of vehicles at a certain place;

step 3: obtaining a vehicle passing point feature set by data analysis according to the feature dataset, and the vehicle passing point feature set including truck travel speed on an expressway section, truck travel speed on a town road section, truck travel speed on a rural road, a parking place where a truck is stopped to load and unload goods, a parking place where a truck driver eats and rests on the way, a truck refueling place, a parking place caused by traffic congestion or car accidents, a parking place caused by traffic charges on an urban road, and a parking place where a vehicle waits for a traffic light;

step 4: screening the vehicle passing point feature set to obtain a cargo owner common feature set, and the cargo owner common feature set including a location where truck detention time is greater than a preset time value, a location where the truck travel speed is less than a preset speed value, and a location where the number of passing trucks is greater than a preset number of vehicles; and

step 5: performing statistical analysis on the cargo owner common feature set to obtain required cargo owner geographical location information and further obtain query information for identifying the origin of goods.

Further, the truck travel trajectory data is obtained by an onboard positioning device for the truck and a GPS drive test.

Further, in step 2, the data preprocessing includes data structured processing, data screening processing, and data feature extraction, and specifically includes:

step 2.1: performing data structured processing on the truck travel trajectory data, i.e., sorting and classifying the truck travel trajectory data to obtain initial truck travel trajectory data;

step 2.2: performing data screening processing on the initial truck travel trajectory data, i.e., removing fragmentary data, repeating data, and noise based on the initial truck travel trajectory data to obtain optimized truck travel trajectory data; and

step 2.3: performing data feature extraction on the optimized truck travel trajectory data, i.e., performing feature extraction based on the optimized truck travel trajectory data to obtain a feature dataset, and the feature dataset including the number of vehicles stopped over at a certain place, the average speed of vehicles at a certain place, and the average residence time of vehicles at a certain place.

Further, in step 3, the obtaining a vehicle passing point feature set by data analysis according to the feature dataset specifically includes:

step 3.1: performing dimension reduction on the feature dataset by a dimension reduction method, i.e., reducing high-dimensional feature data into two-dimensional data using a principal component analysis method based on the feature dataset;

step 3.2: performing normalization on the two-dimensional data, i.e., performing normalization based on the two-dimensional data to obtain optimized trajectory data, and the optimized trajectory data integrating features of the feature dataset;

step 3.3: clustering the optimized trajectory data based on a Gaussian mixture clustering method, assuming that the optimized trajectory data to be clustered is a mixture with multi-Gaussian distribution and can be classified into k class clusters, and the class clusters being classified into necessary class clusters and supplementary class clusters, where the necessary class clusters include high-speed travel of a truck on an expressway section, normal travel of a truck on a town road section, low-speed travel of a truck on a rural road, a parking place where a truck is stopped to load and unload goods, a parking place where a truck driver eats and rests on the way, a truck refueling place, a parking place caused by traffic congestion, a parking place caused by traffic charges on an urban road, and a parking place where a vehicle waits for a traffic light; the supplementary class clusters can be supplemented according to actual needs of a user, each class cluster sample obeys a Gaussian mixture cluster probability distribution model, and the Gaussian mixture cluster probability distribution model can be expressed as:

$$p(y \mid \lambda) = \sum_{i=1}^{k} \beta_i \, p_i(y \mid \mu_i, C_i)$$

where $\beta_i$ is the probability generated for the $i$-th Gaussian probability distribution model, $\sum_{i=1}^{k} \beta_i = 1$, and $\beta_i \geq 0$; $k$ is to classify the data into $k$ classes; $y$ is an $S$-dimensional feature parameter sample with a length of $L$, defined as $y = [y_1, \ldots, y_L]$; $C_i$ is a covariance matrix; $\mu_i$ is a mean vector; and $p_i(y \mid \mu_i, C_i)$ is the $i$-th multidimensional Gaussian distribution probability density function:

$$p_i(y \mid \mu_i, C_i) = \frac{1}{(2\pi)^{S/2} \lvert C_i \rvert^{1/2}} \exp\left(-\frac{1}{2}(y - \mu_i)^{T} C_i^{-1} (y - \mu_i)\right)$$

according to the above formula, the multidimensional Gaussian distribution probability density function is completely determined by the covariance matrix and the mean vector, and the model parameter $\lambda$ of the Gaussian mixture model can be expressed as:

$$\lambda = \{\beta_i, \mu_i, C_i \mid i = 1, 2, \ldots, k\}$$

; and

step 3.4: performing visualization on the clustered class clusters based on the Gaussian mixture model parameter $\lambda$, i.e., performing visualization on the clustered class clusters to obtain the vehicle passing point feature set, and the vehicle passing point feature set including truck travel speed on an expressway section, truck travel speed on a town road section, truck travel speed on a rural road, a parking place where a truck is stopped to load and unload goods, a parking place where a truck driver eats and rests on the way, a truck refueling place, a parking place caused by traffic congestion or car accidents, a parking place caused by traffic charges on an urban road, a parking place where a vehicle waits for a traffic light, and a supplementary category.

Furthermore, in step 4, the screening the vehicle passing point feature set to obtain a cargo owner common feature set specifically includes:

step 4.1: performing visualization on feature points of the number of vehicles stopped over at a certain place, the average speed of vehicles at a certain place, and the average residence time of vehicles at a certain place based on the vehicle passing point feature set to obtain common point visual diagrams of necessary class clusters and supplementary class clusters, and the common point comprising truck detention time at a point, truck speed at a point, and the number of trucks at a point; and

step 4.2: screening the common point visual diagrams to obtain a point visual diagram of truck transportation destinations based on principles conforming to the cargo owner common features; the principles conforming to the cargo owner common features comprising a location where the truck detention time is greater than the preset time value, a location where the truck travel speed is less than the preset speed value, and a location where the number of passing trucks is greater than the preset number of vehicles; and obtaining the cargo owner common feature set containing geographical location data based on the point visual diagram of truck transportation destinations.

Further, in step 5, the performing statistical analysis on the cargo owner common feature set to obtain required cargo owner geographical location information specifically includes:

step 5.1: performing reverse coding on the cargo owner common feature set based on reverse geocoding to obtain POI information about all geographical locations, and the POI information being national map information obtained by performing keyword query through online information crawling and Amap API calling;

step 5.2: parsing the national map information based on XML parsing to obtain and store POI information containing urban land attributes and road facility information;

step 5.3: eliminating keywords according to statements in the urban land attributes and the road facility information in POI information to obtain the required cargo owner geographical location information, the eliminated keywords including scenic spots, residential parks, and road sections, and the required cargo owner geographical location information including large logistics parks, industrial parks, building material markets, airports, stations, and ports; and

step 5.4: performing visualization on the required cargo owner geographical location information on a map based on ArcGIS software to obtain point information, and the point information being the query information for identifying the origin of goods.

Beneficial effects of the present invention are as follows.

The present invention discloses a method for identifying the origin of goods by fusing truck trajectory and POI data based on big data analysis, including: acquiring truck travel trajectory data; performing data preprocessing according to the truck travel trajectory data to obtain a feature dataset, and the feature dataset including the number of vehicles stopped over at a certain place, the average speed of vehicles at a certain place, and the average residence time of vehicles at a certain place; obtaining a vehicle passing point feature set by data analysis according to the feature dataset; screening the vehicle passing point feature set to obtain a cargo owner common feature set; and performing statistical analysis on the cargo owner common feature set to obtain the required cargo owner geographical location information. This solves the problems that, based on a large amount of truck trajectory data resources provided by big data services, the non-cargo owner geographic location data cannot be removed from the complicated and huge geographic location data information and the cargo owner geographic location data cannot be retained, and that the accuracy is low in the practical application of identifying the travel endpoints of freight trucks, thereby helping enterprises to better carry out follow-up services and management.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to describe embodiments of the present invention or technical solutions in the related art more clearly, the drawings required to be used in embodiments or the related art are briefly described below. The drawings in the following description show only some embodiments of the present invention. Those skilled in the art may obtain other drawings according to these drawings without involving any inventive effort.

FIG. 1 is a flowchart of a method for identifying the origin of goods by fusing truck trajectory and POI data according to the present invention.

FIG. 2 is a diagram of truck travel trajectory data of a method for identifying the origin of goods by fusing truck trajectory and POI data according to the present invention.

FIG. 3 is a visualized point diagram of a vehicle passing point feature set of a method for identifying the origin of goods by fusing truck trajectory and POI data according to the present invention.

FIG. 4 is a comparison diagram between a common point visual diagram and a point visual diagram of truck transportation destinations of a method for identifying the origin of goods by fusing truck trajectory and POI data according to the present invention.

FIG. 5 is a diagram of POI information of a method for identifying the origin of goods by fusing truck trajectory and POI data according to the present invention.

FIG. 6 is a diagram visualized by ArcGIS software of a method for identifying the origin of goods by fusing truck trajectory and POI data according to the present invention.

DETAILED DESCRIPTION OF THE EMBODIMENTS

In order to make the object, technical solutions, and advantages of embodiments of the present invention clearer, the technical solutions of embodiments of the present invention will now be described clearly and completely below with reference to the drawings in embodiments of the present invention. Embodiments described herein are only part of, but not all, embodiments of the present invention. Based on embodiments in the present invention, all the other embodiments obtained by those skilled in the art without making any inventive effort fall within the scope of protection of the present invention.

A computer-implemented method for identifying the origin of goods by fusing truck trajectory and POI data is provided in this embodiment, as shown in FIG. 1, including the following steps.

In step 1, truck travel trajectory data is acquired, and the truck travel trajectory data at least includes a vehicle license plate number, vehicle mileage, a longitude and latitude of a truck in traveling, truck travel time, and truck travel speed. As shown in FIG. 2, the truck travel trajectory data is acquired by an onboard positioning device for the truck and a GPS drive test, and accurately measured travel trajectory data can be acquired in real time.
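By way of a non-limiting illustration, the record acquired in step 1 might be represented as follows. This is a minimal sketch: the field names, the CSV layout, the units, and the sample values are all assumptions made for the example and are not specified by the method.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TrajectoryPoint:
    """One trajectory sample; field names are illustrative assumptions."""
    plate: str           # vehicle license plate number
    mileage_km: float    # vehicle mileage
    lon: float           # longitude of the truck in traveling
    lat: float           # latitude of the truck in traveling
    timestamp: datetime  # truck travel time
    speed_kmh: float     # truck travel speed


def parse_points(csv_text):
    """Parse CSV text with a header row into a list of TrajectoryPoint."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        TrajectoryPoint(
            plate=row["plate"],
            mileage_km=float(row["mileage_km"]),
            lon=float(row["lon"]),
            lat=float(row["lat"]),
            timestamp=datetime.fromisoformat(row["timestamp"]),
            speed_kmh=float(row["speed_kmh"]),
        )
        for row in reader
    ]


# Hypothetical sample data, not taken from the source.
SAMPLE = """plate,mileage_km,lon,lat,timestamp,speed_kmh
A12345,1020.5,121.60,38.91,2022-07-01T08:00:00,62.0
A12345,1021.1,121.61,38.92,2022-07-01T08:00:30,0.0
"""

points = parse_points(SAMPLE)
```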

In step 2, data preprocessing is performed according to the truck travel trajectory data to obtain a feature dataset, and the feature dataset includes the number of vehicles stopped over at a certain place, average speed of vehicles at a certain place, and average residence time of vehicles at a certain place.

In step 3, a vehicle passing point feature set is obtained by data analysis according to the feature dataset, and the vehicle passing point feature set includes truck travel speed on an expressway section, truck travel speed on a town road section, truck travel speed on a rural road, a parking place where a truck is stopped to load and unload goods, a parking place where a truck driver eats and rests on the way, a truck refueling place, a parking place caused by traffic congestion or car accidents, a parking place caused by traffic charges on an urban road, and a parking place where a vehicle waits for a traffic light.

In step 4, the vehicle passing point feature set is screened to obtain a cargo owner common feature set, and the cargo owner common feature set includes a location where truck detention time is greater than a preset time value, a location where the truck travel speed is less than a preset speed value, and a location where the number of passing trucks is greater than a preset number of vehicles.

In step 5, statistical analysis is performed on the cargo owner common feature set to obtain required cargo owner geographical location information and further obtain query information for identifying the origin of goods.

Through the above steps, in the face of massive and complex truck trajectory data, and based on big data and truck trajectory data, the limitations of existing technical methods are overcome and the gaps in existing technical methods are filled. The travel rules of trucks are discovered from the data, and truck parking points are identified based on these rules. Non-cargo owner parking points are excluded from numerous parking points, and the customer groups for freight transportation are ultimately identified to find key cargo owners, providing relatively reliable data information for subsequent analysis. Moreover, the method of the present invention does not require a training dataset and is not limited by data type or quantity, making it suitable for analyzing massive and complex truck trajectory data.

In a particular embodiment, in step 2, the data preprocessing includes data structured processing, data screening processing, and data feature extraction, and specifically includes the following steps.

In step 2.1, data structured processing is performed on the truck travel trajectory data, i.e., the truck travel trajectory data is sorted and classified to obtain initial truck travel trajectory data.

In step 2.2, data screening processing is performed on the initial truck travel trajectory data, i.e., fragmentary data, repeating data, and noise are removed based on the initial truck travel trajectory data to obtain optimized truck travel trajectory data.
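As a hedged illustration of steps 2.1 and 2.2, the sketch below sorts samples by vehicle and time, then drops repeated records, implausible values, and vehicles with too few remaining points. The `Sample` layout, the speed ceiling `max_speed`, and the fragment threshold `min_points` are assumptions chosen for the example; the method itself does not fix these criteria.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    plate: str
    t: int        # seconds since an arbitrary origin (illustrative unit)
    lon: float
    lat: float
    speed: float  # km/h


def screen(samples, max_speed=120.0, min_points=3):
    """Sort by (plate, time), then drop duplicates, noise, and fragments."""
    ordered = sorted(samples, key=lambda s: (s.plate, s.t))
    seen = set()
    kept = []
    for s in ordered:
        key = (s.plate, s.t)
        if key in seen:
            continue  # repeating data: same vehicle, same timestamp
        # Noise: coordinates or speeds outside a plausible range.
        if not (-180.0 <= s.lon <= 180.0 and -90.0 <= s.lat <= 90.0):
            continue
        if not (0.0 <= s.speed <= max_speed):
            continue
        seen.add(key)
        kept.append(s)
    # Fragmentary data: vehicles left with fewer than min_points samples.
    counts = {}
    for s in kept:
        counts[s.plate] = counts.get(s.plate, 0) + 1
    return [s for s in kept if counts[s.plate] >= min_points]


# Hypothetical raw samples.
raw = [
    Sample("A1", 30, 121.60, 38.90, 40.0),
    Sample("A1", 0, 121.59, 38.90, 42.0),
    Sample("A1", 30, 121.60, 38.90, 40.0),   # duplicate record
    Sample("A1", 60, 121.61, 38.90, 300.0),  # implausible speed
    Sample("A1", 90, 121.62, 38.90, 38.0),
    Sample("B2", 0, 121.70, 38.95, 10.0),    # single-point fragment
]
clean = screen(raw)
```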

In step 2.3, data feature extraction is performed on the optimized truck travel trajectory data, i.e., feature extraction is performed based on the optimized truck travel trajectory data to obtain a feature dataset, and the feature dataset includes the number of vehicles stopped over at a certain place, average speed of vehicles at a certain place, and average residence time of vehicles at a certain place.

The number of vehicles stopped over at a certain place is $X_j$:

$$X_j = \sum_{t} x_{tj}$$

In the formula, $x_{tj}$ represents a vehicle occurring at the time $t$ at the place $j$.

The average speed of vehicles at a certain point is $\bar{V}_j$:

$$\bar{V}_j = \frac{1}{n} \sum_{t=1}^{n} v_{tj}$$

In the formula, $v_{tj}$ represents a vehicle speed value acquired at the time $t$ at the place $j$, the average being taken over the $n$ speed values acquired at the place.

The average residence time of vehicles at a certain point is $\bar{T}_j$:

$$\bar{T}_j = \frac{\sum_{i=1}^{X_j} m_{ij}}{X_j}$$

In the formula, $m_{ij}$ represents the frequency of occurrence of the $i$-th vehicle at the place $j$.
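The three features can be illustrated with a short sketch. It is one possible reading only: a "place" is taken to be a precomputed identifier (for example a grid cell), the vehicle count is taken as the number of distinct vehicles seen at the place, and the occurrence frequency of each vehicle is converted to time using a fixed sampling interval `interval_s`. All of these are assumptions for the example.

```python
from collections import defaultdict


def place_features(records, interval_s=30.0):
    """records: iterable of (plate, place_id, speed_kmh) samples.

    Returns {place_id: (vehicle_count, mean_speed_kmh, mean_residence_s)}.
    """
    speeds = defaultdict(list)                    # place -> speed samples
    hits = defaultdict(lambda: defaultdict(int))  # place -> plate -> occurrences
    for plate, place, speed in records:
        speeds[place].append(speed)
        hits[place][plate] += 1

    features = {}
    for place, per_vehicle in hits.items():
        count = len(per_vehicle)                              # vehicles at the place
        mean_speed = sum(speeds[place]) / len(speeds[place])  # average speed
        # Average occurrences per vehicle, scaled by the sampling interval.
        mean_stay = sum(per_vehicle.values()) * interval_s / count
        features[place] = (count, mean_speed, mean_stay)
    return features


# Hypothetical samples: two trucks at a depot and on a highway cell.
records = [
    ("A1", "depot", 0.0), ("A1", "depot", 0.0), ("A1", "depot", 2.0),
    ("B2", "depot", 1.0),
    ("A1", "highway", 80.0), ("B2", "highway", 90.0),
]
feats = place_features(records)
```

With these inputs the depot shows a low mean speed and a longer mean stay than the highway cell, which is the contrast the later screening relies on.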

Since the truck trajectory data contains a lot of dimensions and attributes, it is extremely difficult to cluster data objects in a high-dimensional space. In this method, the dimension reduction and normalization are performed on the trajectory data, i.e., firstly dimension reduction is performed on the data using the dimension reduction technology, and normalization is performed to unify the data dimension, thereby greatly reducing the calculation amount and improving the efficiency of data processing and operation. By combining the big freight data and common features of the origin of goods, three typical features of the feature dataset are extracted, and the truck trajectory data is quantified according to the three features to facilitate the subsequent clustering, thereby improving the identification accuracy and efficiency, and filling the gap that there is currently no relevant information about cargo owners obtained based on the truck trajectory data.

In a specific embodiment, in step 3, that a vehicle passing point feature set is obtained by data analysis according to the feature dataset specifically includes the following steps.

In step 3.1, dimension reduction is performed on the feature dataset by a dimension reduction method, i.e., high-dimensional feature data is reduced into two-dimensional data which facilitates data analysis using a principal component analysis method based on the feature dataset.
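A minimal, dependency-free sketch of principal component analysis is given below for illustration. It estimates the two leading eigenvectors of the sample covariance matrix by power iteration with deflation, which is one simple way to compute them and is not stated by the method; the function names, the iteration count, and the sample rows are assumptions. Power iteration can converge slowly when eigenvalues are close, so a numerical library would normally be preferred in practice.

```python
import math


def _power_iteration(m, iters=200):
    """Approximate the dominant eigenpair of a symmetric matrix m."""
    d = len(m)
    v = [float(c + 1) for c in range(d)]  # asymmetric start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(m[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-15:
            break  # matrix is (numerically) zero in this direction
        v = [x / norm for x in w]
    eigval = sum(v[a] * m[a][b] * v[b] for a in range(d) for b in range(d))
    return eigval, v


def pca_2d(rows):
    """Project rows (lists of floats) onto their top two principal axes."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[c] for r in rows) / n for c in range(d)]
    centered = [[r[c] - means[c] for c in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(x[a] * x[b] for x in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    axes = []
    for _ in range(2):
        eigval, v = _power_iteration(cov)
        axes.append(v)
        # Deflation: subtract the component that was just found.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return [[sum(x[c] * ax[c] for c in range(d)) for ax in axes]
            for x in centered]


# Hypothetical three-feature rows; the third feature is constant.
rows = [[0.0, 0.0, 5.0], [2.0, 0.0, 5.0], [4.0, 0.0, 5.0],
        [0.0, 1.0, 5.0], [2.0, 1.0, 5.0], [4.0, 1.0, 5.0]]
projected = pca_2d(rows)
```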

In step 3.2, normalization is performed on the two-dimensional data, i.e., normalization is performed based on the two-dimensional data to obtain optimized trajectory data, and the optimized trajectory data integrates features of the feature dataset.
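The normalization scheme is not specified; as one common choice, the sketch below applies min-max scaling so that each of the two columns falls in the range from 0 to 1. The function name and the sample points are assumptions for illustration.

```python
def min_max_normalize(points):
    """Scale each column of a list of equal-length rows into [0, 1]."""
    cols = list(zip(*points))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        # A constant column has zero span and is mapped to 0.0.
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in points
    ]


# Hypothetical two-dimensional points after dimension reduction.
scaled = min_max_normalize([[-2.0, -0.5], [0.0, 0.5], [2.0, 0.0]])
```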

In step 3.3, the optimized trajectory data is clustered based on a Gaussian mixture clustering method. It is assumed that the optimized trajectory data to be clustered is a mixture with multi-Gaussian distribution and can be classified into k class clusters, and the class clusters are classified into necessary class clusters and supplementary class clusters. The necessary class clusters include high-speed travel of a truck on an expressway section, normal travel of a truck on a town road section, low-speed travel of a truck on a rural road, a parking place where a truck is stopped to load and unload goods, a parking place where a truck driver eats and rests on the way, a truck refueling place, a parking place caused by traffic congestion, a parking place caused by traffic charges on an urban road, and a parking place where a vehicle waits for a traffic light. The supplementary class clusters can be supplemented according to actual needs of a user, each class cluster sample obeys a Gaussian mixture cluster probability distribution model, and the Gaussian mixture cluster probability distribution model can be expressed as:

$$p(y \mid \lambda) = \sum_{i=1}^{k} \beta_i \, p_i(y \mid \mu_i, C_i)$$

where $\beta_i$ is the probability generated for the $i$-th Gaussian probability distribution model, $\sum_{i=1}^{k} \beta_i = 1$, and $\beta_i \geq 0$; $k$ is to classify the data into $k$ classes; $y$ is an $S$-dimensional feature parameter sample with a length of $L$, defined as $y = [y_1, \ldots, y_L]$; $C_i$ is a covariance matrix; $\mu_i$ is a mean vector; and $p_i(y \mid \mu_i, C_i)$ is the $i$-th multidimensional Gaussian distribution probability density function:

$$p_i(y \mid \mu_i, C_i) = \frac{1}{(2\pi)^{S/2} \lvert C_i \rvert^{1/2}} \exp\left(-\frac{1}{2}(y - \mu_i)^{T} C_i^{-1} (y - \mu_i)\right)$$

According to the above formula, the multidimensional Gaussian distribution probability density function is completely determined by the covariance matrix and the mean vector, and the model parameter $\lambda$ of the Gaussian mixture model can be expressed as:

$$\lambda = \{\beta_i, \mu_i, C_i \mid i = 1, 2, \ldots, k\}$$
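For illustration, the sketch below evaluates a two-dimensional Gaussian mixture density and assigns a point to the component with the highest posterior probability, given a parameter set of weights, mean vectors, and covariance matrices. It does not estimate the parameters from data; the estimation procedure is not described here and is outside this sketch. The two-component parameter values are hypothetical.

```python
import math


def gaussian_pdf(y, mu, cov):
    """Density of a 2-D Gaussian with mean mu and 2x2 covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = y[0] - mu[0], y[1] - mu[1]
    # Quadratic form (y - mu)^T C^{-1} (y - mu) using the explicit 2x2 inverse.
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    # For two dimensions, (2*pi)^(S/2) * |C|^(1/2) = 2*pi*sqrt(det).
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def mixture_density(y, params):
    """params: list of (weight, mean, covariance); returns the mixture density."""
    return sum(beta * gaussian_pdf(y, mu, cov) for beta, mu, cov in params)


def responsibilities(y, params):
    """Posterior probability that y was generated by each component."""
    weighted = [beta * gaussian_pdf(y, mu, cov) for beta, mu, cov in params]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical two-component model on normalized data (weights sum to 1).
params = [
    (0.6, (0.8, 0.2), ((0.01, 0.0), (0.0, 0.01))),
    (0.4, (0.1, 0.9), ((0.01, 0.0), (0.0, 0.01))),
]
resp = responsibilities((0.15, 0.85), params)
label = max(range(len(resp)), key=resp.__getitem__)  # index of the class cluster
```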

In step 3.4, as shown in FIG. 3, visualization is performed on the clustered class clusters based on the Gaussian mixture model parameter $\lambda$, i.e., visualization is performed on the clustered class clusters to obtain the vehicle passing point feature set, and the vehicle passing point feature set includes truck travel speed on an expressway section, truck travel speed on a town road section, truck travel speed on a rural road, a parking place where a truck is stopped to load and unload goods, a parking place where a truck driver eats and rests on the way, a truck refueling place, a parking place caused by traffic congestion or car accidents, a parking place caused by traffic charges on an urban road, a parking place where a vehicle waits for a traffic light, and a supplementary category.

In a particular embodiment, in step 4, that the vehicle passing point feature set is screened to obtain a cargo owner common feature set specifically includes the following steps.

In step 4.1, visualization is performed on feature points of the number of vehicles stopped over at a certain place, the average speed of vehicles at a certain place, and the average residence time of vehicles at a certain place based on the vehicle passing point feature set to obtain common point visual diagrams of necessary class clusters and supplementary class clusters, and the common point includes truck detention time at a point, truck speed at a point, and the number of trucks at a point.

In step 4.2, the common point visual diagrams are screened to obtain a point visual diagram of truck transportation destinations based on principles conforming to the cargo owner common features, as shown in FIG. 4. The principles conforming to the cargo owner common features include a location where the truck detention time is greater than the preset time value, a location where the truck travel speed is less than the preset speed value, and a location where the number of passing trucks is greater than the preset number of vehicles. The cargo owner common feature set containing geographical location data is obtained based on the point visual diagram of truck transportation destinations.
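The three screening principles can be sketched as a simple threshold filter. The preset values are not given, so the thresholds passed below (1800 seconds, 5 km/h, 10 trucks) and the place names are purely hypothetical.

```python
def screen_cargo_owner_candidates(features, min_stay_s, max_speed_kmh, min_trucks):
    """features: {place_id: (truck_count, mean_speed_kmh, mean_stay_s)}.

    Keeps places whose stay exceeds the time preset, whose speed is below the
    speed preset, and whose truck count exceeds the vehicle-number preset.
    """
    return {
        place: f
        for place, f in features.items()
        if f[2] > min_stay_s and f[1] < max_speed_kmh and f[0] > min_trucks
    }


# Hypothetical per-place features.
features = {
    "logistics_park": (40, 1.5, 5400.0),
    "traffic_light": (300, 3.0, 45.0),
    "expressway": (500, 85.0, 30.0),
    "roadside_stop": (2, 0.5, 7200.0),
}
kept = screen_cargo_owner_candidates(
    features, min_stay_s=1800.0, max_speed_kmh=5.0, min_trucks=10
)
```

Only the place that satisfies all three conditions at once survives; a traffic light fails on stay time, an expressway on speed, and an isolated roadside stop on truck count.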

In a specific embodiment, in step 5, that statistical analysis is performed on the cargo owner common feature set to obtain required cargo owner geographical location information specifically includes the following steps.

In step 5.1, reverse coding is performed on the cargo owner common feature set based on reverse geocoding to obtain POI information about all geographical locations. As shown in FIG. 5, the POI information is national map information obtained by performing keyword query through online information crawling and Amap API calling.

In step 5.2, the national map information is parsed based on XML parsing to obtain and store the POI information containing urban land attributes and road facility information.
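As an illustration of the XML parsing in step 5.2, the sketch below extracts POI names and types with Python's standard `xml.etree.ElementTree` module. The payload is a hypothetical, simplified example written for this sketch; the element names (`poi`, `name`, `type`) are assumptions and may differ from what a real map service returns.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified payload; real responses may be structured differently.
SAMPLE_XML = """<response>
  <status>1</status>
  <regeocode>
    <formatted_address>Example Logistics Park, Dalian</formatted_address>
    <pois>
      <poi><name>Example Logistics Park</name><type>Logistics Park</type></poi>
      <poi><name>Example Toll Gate</name><type>Toll Station</type></poi>
    </pois>
  </regeocode>
</response>"""


def parse_pois(xml_text):
    """Return a list of {'name': ..., 'type': ...} dicts from the payload."""
    root = ET.fromstring(xml_text)
    pois = []
    for poi in root.iter("poi"):
        pois.append({
            "name": poi.findtext("name", default=""),
            "type": poi.findtext("type", default=""),
        })
    return pois


pois = parse_pois(SAMPLE_XML)
```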

In step 5.3, keywords are eliminated according to statements in the urban land attributes and the road facility information in the POI information to obtain the required cargo owner geographical location information. The eliminated keywords include scenic spots, residential parks, and road sections. The required cargo owner geographical location information includes large logistics parks, industrial parks, building material markets, airports, stations, and ports.
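The keyword elimination of step 5.3 can be sketched as a substring filter over each POI's name and type. The English keyword strings and the sample POIs below are assumptions for illustration; real POI data from a Chinese map service would need keywords in the language of that data.

```python
# Illustrative English keywords for the eliminated categories.
EXCLUDED = ("scenic spot", "residential", "road section")


def eliminate_non_cargo_owner(pois, excluded=EXCLUDED):
    """Drop POIs whose name or type mentions any excluded keyword."""
    kept = []
    for poi in pois:
        text = f"{poi['name']} {poi['type']}".lower()
        if not any(word in text for word in excluded):
            kept.append(poi)
    return kept


# Hypothetical POI records.
pois = [
    {"name": "Example Logistics Park", "type": "Logistics Park"},
    {"name": "Example Hill", "type": "Scenic Spot"},
    {"name": "Example Gardens", "type": "Residential Park"},
    {"name": "Example Port", "type": "Port"},
]
cargo_owner_pois = eliminate_non_cargo_owner(pois)
```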

In step 5.4, as shown in FIG. 6, visualization is performed on the required cargo owner geographical location information on a map based on ArcGIS software to obtain point information, and the point information is query information for identifying the origin of goods.

After the cargo owner common feature set is obtained by screening the vehicle passing point feature set, the reserved class clusters with the cargo owner common features all contain a large amount of geographical location data, and these geographical locations contain cargo owner geographical information and non-cargo owner geographical information; therefore, reverse geocoding is required to be performed on the data in the reserved class clusters to obtain POI information about all the geographical locations. The POI information requires the acquisition of the national map information obtained by performing keyword query through online information crawling and Amap API calling. In the present invention, information such as road networks, toll stations, service areas, and highway entrances and exits across the country is crawled to extract the POI information. In addition, the location attribute of each location is obtained through XML analysis, the finally obtained POI information includes urban land attributes and road facility information, and the obtained POI information is stored. Finally, keywords are required to be eliminated according to statements in the urban land attributes and the road facility information in the POI information; non-loading points are eliminated, such as scenic spots, residential parks, and road sections; all the non-cargo owner geographical locations are eliminated to obtain the final cargo owner information results; and visualization is performed on the results in the map using the ArcGIS software to facilitate the identification and query of the origin of goods.

Finally, it should be noted that the above embodiments are merely illustrative of the technical solutions of the present invention, and are not limiting thereto. Although the present invention has been described in detail with reference to the above embodiments, those skilled in the art will appreciate that the technical solutions disclosed in the above embodiments can still be modified or some or all of the technical features can be replaced by equivalents. Such modifications and substitutions do not depart from the scope of embodiments of the present invention in any way.

Claims

1. A computer-implemented method for identifying the origin of goods by fusing truck trajectory and POI data, comprising:

step 1: acquiring truck trajectory data, wherein the truck trajectory data includes at least a vehicle registration number, vehicle mileage, the longitude and latitude of a truck while in motion, the truck's driving time, and the truck's traveling speed;

step 2: performing data preprocessing based on the truck trajectory data to obtain a feature dataset, wherein the feature dataset includes the number of vehicles stopped at a certain place, the average speed of vehicles at a certain place, and the average residence time of vehicles at a certain place;

step 3: obtaining a vehicle passing point feature set by data analysis according to the feature dataset, wherein the vehicle passing point feature set includes the driving speed of trucks on a highway section, the driving speed of trucks on an urban road section, the driving speed of trucks on a rural road section, a stopping point where a truck is stopped to load and unload goods, a stopping point where a truck driver eats and rests along the way, a refueling place for trucks, a stopping point caused by traffic congestion or car accidents, a stopping point caused by traffic charges on an urban road, and a stopping point where a vehicle waits at a traffic light;

step 4: screening the vehicle passing point feature set to obtain a cargo owner common feature set, wherein the cargo owner common feature set includes a location where the truck detention time is greater than a preset time value, a location where the driving speed of the truck is lower than a preset speed value, and a location where the number of passing trucks is greater than a preset number of vehicles; and

step 5: performing statistical analysis on the cargo owner common feature set to obtain the required cargo owner geographical location information and further obtain query information for identifying the origin of goods.

2. The computer-implemented method for identifying the origin of goods by fusing truck trajectory and POI data according to claim 1, wherein the truck trajectory data is obtained by a positioning device on board the truck and a GPS drive test.

3. The computer-implemented method for identifying the origin of goods by fusing truck trajectory and POI data according to claim 1, wherein in step 2 the data preprocessing includes structured data processing, data filtering, and data feature extraction, and specifically includes:

step 2.1: performing structured data processing on the truck trajectory data, i.e., sorting and classifying the truck trajectory data to obtain initial truck trajectory data;

step 2.2: performing data filtering on the initial truck trajectory data, i.e., removing fragmentary data, repetitive data, and noise from the initial truck trajectory data to obtain optimized truck trajectory data; and

step 2.3: performing data feature extraction on the optimized truck trajectory data to obtain the feature dataset, wherein the feature dataset includes the number of vehicles stopped at a certain place, the average speed of vehicles at a certain place, and the average residence time of vehicles at a certain place.

4. The computer-implemented method for identifying the origin of goods by fusing truck trajectory and POI data according to claim 1, wherein in step 3, obtaining a vehicle passing point feature set by data analysis according to the feature dataset specifically includes:

step 3.1: performing dimension reduction on the feature dataset using a dimension reduction method, i.e. reducing high-dimensional feature data to two-dimensional data using a principal component analysis method based on the feature dataset;

step 3.2: performing normalization on the two-dimensional data, that is, performing normalization on the two-dimensional data to obtain optimized trajectory data, and the optimized trajectory data retains the characteristics of the feature dataset;

step 3.3: clustering the optimized trajectory data based on a Gaussian mixture clustering method, assuming that the optimized trajectory data to be clustered follows a mixture of multiple Gaussian distributions and can be classified into k class clusters, wherein the class clusters are divided into necessary class clusters and supplementary class clusters; the necessary class clusters include the high-speed driving of a truck on a highway section, the normal driving of a truck on an urban road section, the low-speed driving of a truck on a rural road section, a stopping point where a truck is stopped to load and unload goods, a stopping point where a truck driver eats and rests along the way, a refueling place for trucks, a stopping point caused by traffic congestion and the like, a stopping point caused by traffic charges on an urban road, and a stopping point where a vehicle waits at a traffic light; the supplementary class clusters can be added according to a user's actual needs; each class cluster sample follows a Gaussian mixture cluster probability distribution model, and the Gaussian mixture cluster probability distribution model can be expressed as:

$$p(y \mid \lambda) = \sum_{i=1}^{k} \alpha_i \, p(y \mid \mu_i, C_i),$$

where $\alpha_i$ is the generation probability of the $i$-th Gaussian probability distribution model, with $\sum_{i=1}^{k} \alpha_i = 1$ and $\alpha_i \geq 0$; $k$ is the number of classes into which the data is classified; $y$ is an $S$-dimensional feature parameter sample of length $L$, defined as $y = [y_1, y_2, \ldots, y_L]$; $C_i$ is a covariance matrix; $\mu_i$ is a mean vector; and $p(y \mid \mu_i, C_i)$ is the multidimensional Gaussian distribution probability density function:

$$p(y \mid \mu_i, C_i) = \frac{1}{(2\pi)^{S/2} \, |C_i|^{1/2}} \exp\!\left(-\frac{1}{2} (y - \mu_i)^{T} C_i^{-1} (y - \mu_i)\right);$$

according to the above formula, the multidimensional Gaussian distribution probability density function is completely determined by the covariance matrix and the mean vector, and the model parameter $\lambda$ of the Gaussian mixture model can be expressed as:

$$\lambda = \{\alpha_i, \mu_i, C_i\}, \quad i = 1, 2, \ldots, k;$$

and

step 3.4: performing visualization on the clustered class clusters based on the Gaussian mixture model parameter $\lambda$ to obtain the vehicle passing point feature set, wherein the vehicle passing point feature set includes the driving speed of trucks on a highway section, the driving speed of trucks on an urban road section, the driving speed of trucks on a rural road section, a stopping point where a truck is stopped to load and unload goods, a stopping point where a truck driver eats and rests along the way, a stopping point where a truck refuels, a stopping point caused by traffic congestion or car accidents, a stopping point caused by traffic charges on an urban road, a stopping point where a vehicle waits at a traffic light, and a supplementary category.

5. The computer-implemented method for identifying the origin of goods by fusing truck trajectory and POI data according to claim 1, wherein in step 4, screening the vehicle passing point feature set to obtain a cargo owner common feature set specifically includes:

step 4.1: performing visualization on the feature points of the number of vehicles stopped at a certain place, the average speed of vehicles at a certain place, and the average residence time of vehicles at a certain place based on the vehicle passing point feature set to obtain common point visual diagrams of the necessary class clusters and supplementary class clusters, wherein the common points include the detention time of trucks at a point, the speed of trucks at a point, and the number of trucks at a point; and

step 4.2: screening the common point visual diagrams based on principles conforming to the cargo owner common features to obtain a point visual diagram of truck transportation destinations, wherein the principles conforming to the cargo owner common features include a location where the truck detention time is greater than the preset time value, a location where the truck travel speed is lower than the preset speed value, and a location where the number of passing trucks is greater than the preset number of vehicles; and obtaining the cargo owner common feature set containing geographical location data based on the point visual diagram of truck transportation destinations.

6. The computer-implemented method for identifying the origin of goods by fusing truck trajectory and POI data according to claim 1, wherein in step 5, performing statistical analysis on the cargo owner common feature set to obtain the required cargo owner geographical location information specifically includes:

step 5.1: performing reverse geocoding on the cargo owner common feature set to obtain POI information about all geographical locations, wherein the POI information is national map information obtained by performing keyword queries through online information crawling and Amap API calls;

step 5.2: parsing the national map information based on XML parsing to obtain and store the POI information containing urban land attributes and road facility information;

step 5.3: eliminating keywords according to statements in the urban land attributes and the road facility information in the POI information to obtain the required cargo owner geographical location information, wherein the eliminated keywords include scenic spots, residential parks, and road sections, and the required cargo owner geographical location information includes large logistics parks, industrial parks, building material markets, airports, stations, and ports; and

step 5.4: performing visualization on the required cargo owner geographical location information on a map based on ArcGIS software to obtain point information, wherein the point information is the query information for identifying the origin of goods.