CN111405585A

CN111405585A - Neighbor relation prediction method based on convolutional neural network

Info

Publication number: CN111405585A
Application number: CN202010194728.8A
Authority: CN
Inventors: 骆曦; 李克; 刘子巍
Original assignee: Beijing Union University
Current assignee: Beijing Union University
Priority date: 2020-03-19
Filing date: 2020-03-19
Publication date: 2020-07-10
Anticipated expiration: 2040-03-19
Also published as: CN111405585B

Abstract

The invention discloses a neighbor relation prediction method based on a convolutional neural network (CNN), belonging to the technical field of base station information management. The method comprises data preprocessing, CNN network construction, CNN model training, neighbor relation prediction, and post-processing. The wireless network knowledge graph used by the invention is non-Euclidean domain data; the graph structure information, namely the attributes and relations of all entities, is converted into features that a CNN can process, and convolution kernels of several heights extract the implicit relation features between multiple sampling point pairs and the cells. The method can supplement or even replace existing manual drive tests and ANR technology, guiding an operator's network operation and maintenance department to configure and manage base station neighbor relations in the wireless network more efficiently, promptly, and conveniently, and supporting a higher inter-cell handover success rate, better service continuity, and a good service experience.

Description

Neighbor relation prediction method based on convolutional neural network

Technical Field

The invention relates to a method for predicting a neighbor relation based on a convolutional neural network, and belongs to the technical field of base station information management.

Background

In the mobile communication field, the wireless network knowledge graph is attracting growing attention as a new base station information management method; it exploits the strength of knowledge reasoning in information mining to raise the intelligence level of mobile network operation and maintenance and the efficiency of operation and maintenance work.

Neighbor relation configuration mainly serves to ensure communication quality and the performance of the whole network: only after the base station (eNB in a 4G LTE system, gNB in a 5G system) has configured the neighbor relation of each cell can a user moving from one cell into another complete the handover.

If the number of the neighboring cells included in the neighboring cell relation list is too large, the complexity and redundancy of system networking can be increased, and the accuracy of the measurement report is reduced; if the number of the neighboring cells included in the neighboring cell relation list is too small or the configuration is wrong, coverage missing or blind areas exist among the cells, the switching success rate is low, and a large number of call drops are caused. In the network operation process, as the network scale, user distribution and interference environment change continuously, the neighbor relation needs to be checked and optimized periodically to ensure the network performance.

The prior art methods mainly have two types:

(1) manual drive test: traditional network planning and optimization is a complex task, and requires a professional to carry equipment regularly for drive test analysis. However, the drive test needs to consume a large amount of time, manpower and material resources, and has the problems of long period, high cost and the like. From the operator's perspective, lower networking and operating costs are required in order to provide lower-priced network services to the users to gain the market.

(2) ANR (automatic neighbor relation): ANR automatically configures and optimizes neighbor relations; it is an important component of self-organizing network (SON) technology and is highly automated. However, live networks generally do not rely on this function completely: in a new network it is usually enabled for a period of time and then turned off, after which optimization continues manually. The main reasons include: (a) ANR does not consider the actual coverage capability of the base stations and does not fully use the coverage quality information of each base station, which may lead to too many configured neighboring cells and many unnecessary neighbor relations (in practice redundant or ultra-far neighbor relations); (b) it can impose a heavy burden of terminal measurement and air interface signaling; (c) the policy parameters are not configured uniformly across the whole network, and improper configuration causes problems such as neighboring cells being repeatedly added and deleted and an unstable number of neighboring cells. Some improvements have also been proposed; for example, when the mobile terminal detects that the RSRP of a cell outside the list is higher than the RSRP of the current serving cell and the difference exceeds a given value, the mobile terminal reports the measurement result to the serving eNB. The ANR function includes three modules: a neighbor cell detection module, a neighbor cell deletion module, and a neighbor relation list management module. The main working steps comprise:

(a) the eNB issues an ANR-related measurement configuration to the UE; this configuration may comprise intra-RAT intra-frequency measurement, inter-frequency measurement, or inter-RAT measurement. After receiving the measurement configuration, the UE performs PCI (physical cell identity) measurement and reports the measured PCI information of the neighboring cells to the eNB in the measurement report format;

(b) after receiving the PCI information of a neighboring cell, the eNB selects a specific UE and issues a measurement configuration for reporting the CGI (cell global identity); after receiving this configuration, the UE reads the broadcast information of the neighboring cell and obtains information such as its CGI;

(c) after receiving the information such as the CGI of the neighboring cell reported by the UE, the eNB reports it to the O&M (operation and maintenance) system, and the O&M decides whether to add the neighboring cell.

Note: the eNB is concerned not with which mobile terminals report measurement results but with which cells are reported. It counts the number of times each cell is reported and adds to the neighbor relation list only those cells whose report count exceeds a preset threshold.
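The counting rule in the note above can be sketched as follows. This is a minimal illustration only: the function name `select_neighbors`, the cell identifiers, and the threshold value are hypothetical and are not part of any ANR specification.

```python
from collections import Counter


def select_neighbors(reported_cells, threshold):
    """Return the cells whose report count exceeds the threshold.

    reported_cells: one cell identifier per received measurement report;
    the identity of the reporting terminal is deliberately ignored.
    """
    counts = Counter(reported_cells)
    # Only cells reported more than `threshold` times join the list.
    return sorted(cell for cell, n in counts.items() if n > threshold)


reports = ["cell_A", "cell_B", "cell_A", "cell_C", "cell_A", "cell_B"]
print(select_neighbors(reports, threshold=1))  # ['cell_A', 'cell_B']
```

Raising the threshold makes the list more conservative; with a threshold of 3 no cell in this example would qualify.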

The main drawbacks of the prior art solutions are:

(1) Manual drive test: it consumes a large amount of time, manpower, and material resources; moreover, a drive test cannot cover all coverage areas and service periods and yields no whole-network statistics, which limits its value as guidance for neighbor optimization.

(2) ANR (automatic neighbor relation): because the actual coverage capability of the base station is not considered, too many neighboring cells are generally configured, producing many redundant neighbor relations, or ultra-far neighbor relations arise from overshooting coverage; these bring a heavy burden of terminal measurement and air interface signaling, or cause frequent handovers that impair service continuity and even drop calls. In addition, the policy parameters are not configured uniformly across the whole network, and unreasonable configuration causes problems such as neighboring cells being repeatedly added and deleted and an unstable number of neighboring cells. For these reasons, 4G live networks currently do not rely on this function completely; in general, the ANR function of the base station is enabled only for a period at the initial stage of a newly built network and is then turned off, with optimization carried out by manual drive tests and similar means.

Therefore, a new technical means is required to be found for accurately finding and calibrating the neighboring cell relation between cell entities.

Disclosure of Invention

The problem to be solved by the invention is how to automatically extract the neighbor relations among cell entities from massive terminal-sensed coverage data, base station measurement report data, and preliminarily constructed wireless network knowledge graph data, so as to correct wrongly configured or redundant neighboring cells in the current neighbor relation list and supplement missed neighbor relations, thereby realizing more efficient and intelligent neighbor relation management.

In order to solve the technical problems, the technical scheme adopted by the invention is a method for predicting the neighbor relation based on a convolutional neural network, and the method comprises the following specific steps:

Step 1: Data pre-processing

(1) Construct an initial training sample set S': select a local area whose NCL has been manually marked and traverse the NCL of that area. Add every cell pair whose mark field is 1, i.e. a pair confirmed to have a neighbor relation, to the training set as a positive sample; add every cell pair that is not listed in the NCL and satisfies the following condition to the training set as a negative sample:

where $(x_1, y_1)$ and $(x_2, y_2)$ are the longitude and latitude of the base stations of the two cells, obtained from the attributes "station address longitude" and "station address latitude" of the two cell entities in the preliminarily constructed wireless network knowledge graph of the target area; $R_c = 6378137$ is a constant; and MaxDis is a preset maximum neighbor distance.

(2) Traverse the samples in the initial training sample set S', convert each cell pair into a link sequence matrix according to the following steps, and assign positive and negative class labels, with all positive samples labeled 1 and all negative samples labeled 0, to obtain the labeled sequence matrix training set S:

(a) For the cell pair u and v, traverse the sampling point entities in the wireless network knowledge graph of the area, extract all sampling points of cell u and of cell v according to the attribute "cell ID" field, and store them in the sets N(u) and N(v) respectively.

(b) Extract from the sets N(u) and N(v) all sampling point pairs $(e_u, e_v)$ that satisfy the condition $e_u \in N(u)$, $e_v \in N(v)$, and the "terminal ID" attribute values of the sampling points $e_u$ and $e_v$ are the same.

(c) Extract the reference signal received power of the sampling points from the MCS or MR data set and sort the sampling point pairs: first by terminal ID in ascending order, and second by the sum of the reference signal received powers of the two sampling points $e_u$ and $e_v$ in descending order, obtaining the following n ordered sampling point pairs: $(e_{u1}, e_{v1}), (e_{u2}, e_{v2}), \dots, (e_{un}, e_{vn})$.

(d) Construct a node-linked sequence matrix A[(3n+1) × d], where n is the maximum number of sampling point pairs that can be input and d is the maximum feature dimension. Each entity and its attribute information are extracted from the wireless network knowledge graph database and stored in the matrix. Because the number of sampling point pairs differs from one cell pair to another, the length of the link sequence matrix A would differ; the maximum inputtable node sequence length is therefore fixed at 3n+1 and any shortfall is padded with 0, so that the link sequence matrix A has the same length for every cell pair. In addition, the wireless network knowledge graph is a heterogeneous network, i.e. it contains nodes of different types with different numbers of attributes; a feature dictionary is generated by taking the union of the attributes of the two node types, cell and sampling point, d is the dimension of this feature dictionary, and feature items missing from a node are filled with 0.
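Steps (a) to (c) can be sketched as below. This is an illustration under stated assumptions, not the patented implementation: sampling points are modeled as plain dictionaries, the keys `cell_id`, `terminal_id`, and `rsrp` are hypothetical names for the attributes described above, and truncating the result to the first n pairs is an assumption about how the limit n is applied.

```python
def ordered_pairs(samples, u, v, n):
    """Pair the sampling points of cells u and v that share a terminal ID."""
    n_u = [s for s in samples if s["cell_id"] == u]  # set N(u), step (a)
    n_v = [s for s in samples if s["cell_id"] == v]  # set N(v), step (a)
    # Step (b): keep only pairs reported by the same terminal.
    pairs = [(eu, ev) for eu in n_u for ev in n_v
             if eu["terminal_id"] == ev["terminal_id"]]
    # Step (c): terminal ID ascending, then RSRP sum descending.
    pairs.sort(key=lambda p: (p[0]["terminal_id"],
                              -(p[0]["rsrp"] + p[1]["rsrp"])))
    return pairs[:n]  # assumption: keep at most n pairs


samples = [
    {"cell_id": "u", "terminal_id": 2, "rsrp": -90},
    {"cell_id": "v", "terminal_id": 2, "rsrp": -95},
    {"cell_id": "u", "terminal_id": 1, "rsrp": -100},
    {"cell_id": "v", "terminal_id": 1, "rsrp": -80},
    {"cell_id": "v", "terminal_id": 1, "rsrp": -110},
    {"cell_id": "w", "terminal_id": 1, "rsrp": -70},
]
for e_u, e_v in ordered_pairs(samples, "u", "v", n=3):
    print(e_u["terminal_id"], e_u["rsrp"] + e_v["rsrp"])
```

The sampling point of cell "w" is ignored because it belongs to neither cell of the pair.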

Step 2: constructing CNN networks

(1) Input: the node-linked sequence matrix A[(3n+1) × d].

(2) Convolution layer: the weight matrix of a convolution kernel is $w \in \mathbb{R}^{h \times d}$, where $\mathbb{R}$ denotes the set of real numbers and $h$ is the height of the convolution kernel; several convolution kernels of different heights are set by varying $h$, and the width of each kernel is the feature dimension $d$. The convolution kernel $w$ slides over the sequence matrix A from top to bottom with stride 3, and the sequence feature $c_i$ is obtained by convolving $w$ with the matrix region $x_{3i-2:3i+h-3}$:

$$c_i = f(w \cdot x_{3i-2:3i+h-3} + b)$$

where $f$ is a nonlinear activation function such as the hyperbolic tangent and $b \in \mathbb{R}$ is a bias term. Applying the convolution kernel $w$ to each region $\{x_{1:h}, x_{4:h+3}, x_{7:h+6}, \dots, x_{3n-h+2:3n+1}\}$ of the sequence matrix A yields a single-column sequence feature $c = [c_1, c_2, \dots, c_{(3n-h+4)/3}]$. The convolution kernel height $h$ is set to $4, 7, 10, \dots, 3m+1$, where $m$ is the preset number of convolution kernels; this yields $m$ sequence features, which represent the relation features between one or several sampling point pairs and the cells u and v.

(4) Pooling layer: average pooling is used to down-sample the sequence features; pooling each sequence feature yields a sequence feature $z_i$, so that $m$ sequence features $z = \{z_1, z_2, \dots, z_m\}$ are finally obtained.

(5) Fully connected layer: the input is an N × 1 sequence feature z and the output is a T × 1 vector y, where T is the number of classes; predicting whether a neighbor relation exists is treated as a binary classification problem, so T = 2:

where $W_{dense}$ is a 2 × N weight matrix; r introduces the Dropout operation and is a vector of 0s and 1s generated randomly with probability p, with p = 0.5; and $b_{dense}$ is a bias vector of N × 1.

(6) Softmax layer: the input is y and the output is the 2 × 1 vector $prob = [p_1, p_2] = \mathrm{softmax}(y)$, where $p_1$ and $p_2$ are the probability that the two cells have a neighbor relation and the probability that they do not, respectively.
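The strided convolution and the pooling step can be sketched in plain Python as follows. Several points are assumptions made only for illustration: the matrix values and kernel weights are arbitrary, tanh is chosen as the activation f, and average pooling is taken over the whole sequence feature (the text does not state the pooling window). The function names are hypothetical.

```python
import math


def conv_feature(A, w, b, stride=3):
    """Compute c_i = tanh(w . x + b) for every window of height h = len(w)."""
    h, d = len(w), len(w[0])
    feats = []
    for top in range(0, len(A) - h + 1, stride):  # windows start at rows 1, 4, 7, ...
        window = A[top:top + h]
        dot = sum(w[r][k] * window[r][k] for r in range(h) for k in range(d))
        feats.append(math.tanh(dot + b))
    return feats


def average_pool(feats):
    """Assumed global average pooling: one value z_i per sequence feature."""
    return sum(feats) / len(feats)


n, d = 2, 2
A = [[0.1 * (r + k) for k in range(d)] for r in range(3 * n + 1)]  # 7 x 2 matrix
kernels = [[[0.5] * d for _ in range(h)] for h in (4, 7)]         # m = 2 kernels
z = [average_pool(conv_feature(A, w, b=0.0)) for w in kernels]
print([len(conv_feature(A, w, b=0.0)) for w in kernels])          # [2, 1]
```

The window counts agree with the expression (3n-h+4)/3 from the text: with n = 2, a kernel of height 4 gives 2 windows and a kernel of height 7 gives 1.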

Step 3: CNN model training

The CNN model is trained using a labeled sequence matrix training set S.

(1) Initialize the convolution kernels and bias terms of the convolution layer and the weight matrix and bias vector of the fully connected layer;

(2) Define the hyper-parameters of the training process: the maximum number of iterations, the iteration stop threshold, and the learning rate; the loss function uses cross entropy:

$$\mathrm{Loss} = -\left(t \log(p_1) + (1 - t) \log(1 - p_1)\right)$$

where $t$ is the true value, i.e. the label of the sample, and $p_1$ is the probability that the neighbor relation exists, obtained in step 2;

(3) Input all sequence matrices of the labeled sequence matrix training set S into the model in turn, compute the loss value Loss from the true labels of the training samples and the model predictions, back-propagate layer by layer, and iteratively update the parameters of each layer by stochastic gradient descent;

(4) When the change in the loss value between iterations falls below the set threshold, or the maximum number of iterations is reached, the training of the CNN model is complete.
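The loss function and the stopping rule of step 3 can be sketched as follows. The function names, the natural logarithm, and the small clamp `eps` (added so that log(0) is never evaluated) are assumptions for illustration; the back-propagation and parameter update themselves are not shown.

```python
import math


def cross_entropy(t, p1, eps=1e-12):
    """Loss = -(t*log(p1) + (1-t)*log(1-p1)) for a label t in {0, 1}."""
    p1 = min(max(p1, eps), 1.0 - eps)  # assumption: clamp to avoid log(0)
    return -(t * math.log(p1) + (1 - t) * math.log(1 - p1))


def should_stop(prev_loss, loss, iteration, max_iter, tol):
    """Stop when the loss change is below tol or max_iter is reached."""
    return abs(prev_loss - loss) < tol or iteration >= max_iter


# A confident correct prediction costs less than an uncertain one.
print(cross_entropy(1, 0.9) < cross_entropy(1, 0.6))  # True
```

In a training loop, `should_stop` would be evaluated once per iteration after the loss over the training set S has been recomputed.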

Step 4: Neighbor relation prediction

Select a target area whose NCL has not been manually marked, convert each cell pair to be predicted into a sequence matrix according to steps (a) to (d) of step 1, input the sequence matrices into the CNN model trained in step 3, and decide from the output probability whether each cell pair has a neighbor relation; cell pairs with a neighbor relation are labeled Y and cell pairs without one are labeled N.

Step 5: Post-processing

Update the NCL of the target area according to the neighbor prediction results: for a cell pair labeled Y that appears in the NCL, set the "mark" field to 1, indicating that the neighbor relation is correctly configured; for a cell pair labeled Y that does not appear in the NCL, add it to the NCL and set the "mark" field to 1, indicating a missed neighbor; for a cell pair labeled N that appears in the NCL, delete it from the NCL, indicating a wrongly or redundantly configured neighbor; for a cell pair labeled N that does not appear in the NCL, perform no operation.
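The labeling of step 4 and the four update rules of step 5 can be sketched as below. The representation is an assumption: the NCL is modeled as a dictionary keyed by (head cell, tail cell) with the mark as value, and the decision rule "Y when p1 exceeds p2" is one plausible reading of "according to the output probability". The function names are hypothetical.

```python
def label_pair(p1, p2):
    """Assumed decision rule: Y when the neighbor probability dominates."""
    return "Y" if p1 > p2 else "N"


def update_ncl(ncl, predictions):
    """Apply the four post-processing rules and return a new NCL.

    ncl: {(head_cell, tail_cell): mark}, where mark is 1 or None.
    predictions: {(head_cell, tail_cell): "Y" or "N"}.
    """
    updated = dict(ncl)
    for pair, label in predictions.items():
        if label == "Y":
            updated[pair] = 1   # confirmed neighbor, or missed neighbor added
        elif pair in updated:
            del updated[pair]   # wrongly or redundantly configured neighbor
        # labeled N and not listed: no operation
    return updated


ncl = {("a", "b"): None, ("a", "c"): None}
predictions = {("a", "b"): "Y", ("a", "c"): "N", ("a", "d"): "Y", ("a", "e"): "N"}
print(update_ncl(ncl, predictions))  # {('a', 'b'): 1, ('a', 'd'): 1}
```

Returning a new dictionary rather than mutating the input keeps the original NCL available for comparison or rollback.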

Further, the input data of step 1, (2), (c) is MCS or MR coverage sampling data set: the MCS data is composed of data collected from a large number of user terminals, and the MR data is composed of measurement information collected from a base station device and reported by each terminal device under the base station.

Further, the preliminarily constructed wireless network knowledge graph of the target area in step 1 comprises at least base station entities, cell entities, terminal entities, sampling point entities and their attributes, as well as membership relations, residence relations, and association relations.

Further, the base station entity comprises at least the attribute feature fields base station ID, city, operator, network standard, large area ID, administrative region, station address longitude, station address latitude, cell number, and base station type; the cell entity comprises at least attribute feature fields such as cell ID, city, operator, network standard, large area ID, base station ID, administrative region, station address longitude, station address latitude, base station type, direction angle, inclination angle, physical cell ID, frequency point number, coverage rate, and coverage radius; the terminal entity comprises at least attribute feature fields such as terminal ID, terminal brand, terminal model, operator, and network standard; the sampling point entity comprises at least attribute feature fields such as sampling point ID, terminal ID, sampling date, sampling time, longitude, latitude, city, administrative region, operator, network standard, large area ID, base station ID, cell ID, physical cell ID, frequency point number, reference signal received power, reference signal received quality, signal-to-interference-plus-noise ratio, terminal brand, and terminal model. The knowledge graph data is stored in a graph database.

Further, the current actual neighbor cell list (NCL) of each base station in the target area comprises the fields NCL ID, head cell ID, tail cell ID, and mark, where the mark field records the correct neighbor relations among the cells of a certain local area after manual marking: 1 indicates that a correct neighbor relation exists, and empty indicates that it is uncertain whether a correct neighbor relation exists.

Compared with the prior art, the implementation of the method can effectively guarantee the effectiveness, the integrity and the timeliness of the adjacent region relation information, and is an important step for automatically constructing the wireless network knowledge graph. The method can be used as a supplement or even a substitute for the existing manual drive test and ANR technology, guides the network operation and maintenance department of an operator to more efficiently, timely and conveniently configure and manage the neighbor relation of the base station in the wireless network, and provides powerful support for improving the switching success rate of a user between cells in the network, improving service continuity and guaranteeing good service experience.

Compared with ANR: a different data source is used, the eNB needs no ANR function upgrade, and no air interface measurement signaling burden is added. The best neighboring cells are found from the coverage distribution characteristics and timestamp information under actual coverage, rather than by pairing cells merely because a certain cell was measured, as ANR does; this addresses the effectiveness problem of ANR.

Predicting the neighbor relation based on the CNN: this mainly comprises a data preprocessing method and a design method for the convolution layer. Deep learning methods such as CNN handle the structural characteristics of Euclidean domain data such as images and videos well, but cannot effectively extract the structural characteristics of nodes when processing non-Euclidean domain data.

Drawings

Fig. 1 is a complete algorithm flow chart of the present invention.

Fig. 2 shows the node-linked sequence matrix A[(3n+1) × d].

Fig. 3 shows a structure of the CNN network.

Fig. 4 shows the convolution operation (stride 3, convolution kernel height h = 4).

Detailed Description

The present invention will be described in detail below with reference to the accompanying drawings and examples.

The invention provides a neighbor relation prediction method based on a convolutional neural network.

The invention uses a convolutional neural network to analyze and reason over the various entities, attributes, and entity relations (not including the neighbor relations among cell entities) in a preliminarily constructed wireless network knowledge graph, predicts the neighbor relations among cell entities, and calibrates the existing NCL maintained in the system accordingly (adding new neighbor relations and deleting redundant or wrongly configured ones).

The specific steps are described in detail as follows:

Input data:

(1) MCS or MR coverage sampling data set: the MCS data is composed of data collected from a large number of user terminals, and the MR data is composed of measurement information collected from a base station device and reported by each terminal device under the base station. The two types of data contain feature fields: the MR data includes a terminal ID (generally the IMSI), a sampling date, a sampling time, longitude, latitude, an operator, a network standard, a large area ID (TAC, i.e. the tracking area code, in 4G and 5G networks), a base station ID (eNBID in 4G networks and gNBID in 5G networks), a cell ID (CellID), a physical cell ID (PCI), a frequency point number (EARFCN), reference signal received power (RSRP in 4G networks and CSI-RSRP in 5G networks), reference signal received quality (RSRQ in 4G networks and CSI-RSRQ in 5G networks), a signal-to-interference-plus-noise ratio (SINR in 4G networks and CSI-SINR in 5G networks), and a neighbor cell information list (including the network standard, large area ID, base station ID, cell ID, physical cell ID, frequency point number, and reference signal received power of each neighboring cell measured by the terminal). Some fields may be missing.

(2) The preliminarily constructed wireless network knowledge graph of the target area: it comprises at least base station entities, cell entities, terminal entities, sampling point entities (a special entity) and their attributes, as well as membership relations (cell-base station), residence relations (terminal-cell), and association relations (sampling point-cell, sampling point-terminal). The base station entity comprises at least attribute feature fields such as base station ID, city, operator, network standard, large area ID, administrative region, station address longitude, station address latitude, cell number, and base station type; the cell entity comprises at least attribute feature fields such as cell ID, city, operator, network standard, large area ID, base station ID, administrative region, station address longitude, station address latitude, base station type, direction angle, inclination angle, physical cell ID, frequency point number, coverage rate, and coverage radius; the terminal entity comprises at least attribute feature fields such as terminal ID, terminal brand, terminal model, operator, and network standard; the sampling point entity comprises at least attribute feature fields such as sampling point ID, terminal ID, sampling date, sampling time, longitude, latitude, city, administrative region, operator, network standard, large area ID, base station ID, cell ID, physical cell ID, frequency point number, reference signal received power, reference signal received quality, signal-to-interference-plus-noise ratio, terminal brand, and terminal model. The knowledge graph data is stored in a graph database.

(3) The current actual neighbor cell list (NCL) of each base station in the target area: its fields comprise at least NCL ID, head cell ID, tail cell ID, and mark, where the mark field records the correct neighbor relations among the cells of a certain local area (a certain geographic area or a certain TAC) after manual marking; 1 indicates that a correct neighbor relation exists, and empty indicates that it is uncertain whether a correct neighbor relation exists. Neighbor relations that do not exist in the local area are not listed in the NCL.
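The NCL fields listed in (3) and the construction of the initial sample set S' can be pictured with the sketch below. It rests on several labeled assumptions: the distance condition used for negative samples is not reproduced in this text, so the sketch computes a haversine great-circle distance with the constant 6378137 read as a radius in metres and leaves the comparison against MaxDis to a caller-supplied test; cell pairs are treated as unordered; and all names (`NclRecord`, `site_distance`, `initial_samples`, `negative_ok`) are hypothetical.

```python
import math
from dataclasses import dataclass
from itertools import combinations
from typing import Optional

RC = 6378137.0  # constant from the text, read here as a radius in metres


@dataclass(frozen=True)
class NclRecord:
    ncl_id: int
    head_cell_id: str
    tail_cell_id: str
    mark: Optional[int]  # 1 = confirmed neighbor, None = not verified


def site_distance(p, q):
    """Haversine distance between two (longitude, latitude) sites in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * RC * math.asin(math.sqrt(a))


def initial_samples(records, sites, negative_ok):
    """Build S' as (cell_a, cell_b, label) triples.

    sites: {cell_id: (longitude, latitude)} of each cell's base station.
    negative_ok: caller-supplied test on the site distance; it stands in
    for the MaxDis condition, whose exact form is an assumption.
    """
    listed = {frozenset((r.head_cell_id, r.tail_cell_id)) for r in records}
    positives = [(r.head_cell_id, r.tail_cell_id, 1)
                 for r in records if r.mark == 1]
    negatives = [(a, b, 0) for a, b in combinations(sorted(sites), 2)
                 if frozenset((a, b)) not in listed
                 and negative_ok(site_distance(sites[a], sites[b]))]
    return positives + negatives
```

A pair that is listed in the NCL with an empty mark is used neither as a positive nor as a negative sample, since its status is uncertain.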

Step 1: data pre-processing

(b) Extract from the sets N(u) and N(v) all sampling point pairs $(e_u, e_v)$ that satisfy the condition $e_u \in N(u)$, $e_v \in N(v)$, and the "terminal ID" attribute values of the sampling points $e_u$ and $e_v$ are the same.

(c) Extract the reference signal received power of the sampling points from the MCS or MR data set and sort the sampling point pairs: first by terminal ID in ascending order, and second by the sum of the reference signal received powers (RSRP) of the two sampling points $e_u$ and $e_v$ in descending order, obtaining the following n ordered sampling point pairs: $(e_{u1}, e_{v1}), (e_{u2}, e_{v2}), \dots, (e_{un}, e_{vn})$.

(d) Construct a node-linked sequence matrix A[(3n+1) × d] as shown in Fig. 2, where n is the maximum number of sampling point pairs that can be input and d is the maximum feature dimension. Each entity and its attribute information are extracted from the wireless network knowledge graph database and stored in the matrix. Because the number of sampling point pairs differs from one cell pair to another, the length of the link sequence matrix A would differ; the maximum inputtable node sequence length is therefore fixed at 3n+1 and any shortfall is padded with 0, so that the link sequence matrix A has the same length for every cell pair. In addition, the wireless network knowledge graph is a heterogeneous network, i.e. it contains nodes of different types with different numbers of attributes; a feature dictionary is generated by taking the union of the attributes of the two node types, cell and sampling point, d is the dimension of this feature dictionary, and feature items missing from a node are filled with 0.
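The two zero-padding rules of step (d) can be sketched as follows. The sketch assumes that the node sequence has already been arranged in link order (the ordering shown in Fig. 2 is not reproduced here) and that attribute values are already numeric; the attribute names and the function name `build_matrix` are hypothetical.

```python
def build_matrix(nodes, n, feature_names):
    """Return a (3n+1) x d matrix as a list of rows.

    nodes: attribute dictionaries in link-sequence order.
    feature_names: the feature dictionary, i.e. the union of the cell and
    sampling point attribute names; d = len(feature_names).
    """
    d = len(feature_names)
    # Feature items missing from a node are filled with 0.
    rows = [[float(node.get(name, 0.0)) for name in feature_names]
            for node in nodes[:3 * n + 1]]
    # Sequences shorter than 3n+1 are padded with all-zero rows.
    rows += [[0.0] * d for _ in range(3 * n + 1 - len(rows))]
    return rows


cell_attrs = ["azimuth", "coverage_radius"]  # hypothetical cell attributes
sample_attrs = ["rsrp", "rsrq"]              # hypothetical sampling point attributes
feature_names = sorted(set(cell_attrs) | set(sample_attrs))

nodes = [{"azimuth": 120, "coverage_radius": 500}, {"rsrp": -90, "rsrq": -10}]
A = build_matrix(nodes, n=2, feature_names=feature_names)
print(len(A), len(A[0]))  # 7 4
```

Fixing the column order through a single sorted feature dictionary ensures that the same attribute always occupies the same column for every cell pair.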

Step 2: constructing CNN networks

A CNN network as shown in fig. 3 is constructed:

(1) Input: the node-linked sequence matrix A[(3n+1) × d], where n is the maximum number of sampling point pairs that can be input and d is the maximum feature dimension.

(2) Convolution layer: the weight matrix of a convolution kernel is $w \in \mathbb{R}^{h \times d}$, where $\mathbb{R}$ denotes the set of real numbers and $h$ is the height of the convolution kernel; several convolution kernels of different heights can be set by varying $h$, and the width of each kernel is the feature dimension $d$. The convolution kernel $w$ slides over the sequence matrix A from top to bottom with stride 3, and the sequence feature $c_i$ is obtained by convolving $w$ with the matrix region $x_{3i-2:3i+h-3}$:

$$c_i = f(w \cdot x_{3i-2:3i+h-3} + b)$$

where $f$ is a nonlinear activation function such as the hyperbolic tangent and $b \in \mathbb{R}$ is a bias term. Applying the convolution kernel $w$ to each region $\{x_{1:h}, x_{4:h+3}, x_{7:h+6}, \dots, x_{3n-h+2:3n+1}\}$ of the sequence matrix A yields a single-column sequence feature $c = [c_1, c_2, \dots, c_{(3n-h+4)/3}]$. Fig. 4 shows the convolution operation with stride 3 and convolution kernel height $h = 4$. The convolution kernel height $h$ can be set to $4, 7, 10, \dots, 3m+1$, where $m$ is the preset number of convolution kernels; this yields $m$ sequence features, which represent the relation features between one or several sampling point pairs and the cells u and v.

(5) Fully connected layer: the input is an N × 1 sequence feature z and the output is a T × 1 vector y (T is the number of classes; T = 2 because predicting whether a neighbor relation exists is treated as a binary classification problem):

where $W_{dense}$ is a 2 × N weight matrix; r introduces the Dropout operation and is a vector of 0s and 1s generated randomly with probability p, which may be 0.5; and $b_{dense}$ is a bias vector of N × 1.
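The fully connected layer with Dropout, followed by softmax, can be sketched as below. The formula for y is not reproduced in this text, so the sketch assumes the common form y = W(z ∘ r) + b, where r is an element-wise 0/1 mask whose entries are 1 with probability p. It also uses a bias of length T = 2 so that the shapes agree with the T × 1 output, and it applies no test-time rescaling. All of these, and the function name, are assumptions.

```python
import math
import random


def dense_softmax(z, W, b, p=0.5, training=False, rng=random):
    """Return softmax(W (z * r) + b); r is all ones outside training."""
    r = [1.0 if (not training or rng.random() < p) else 0.0 for _ in z]
    y = [sum(W[t][k] * z[k] * r[k] for k in range(len(z))) + b[t]
         for t in range(len(W))]
    peak = max(y)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in y]
    total = sum(exps)
    return [e / total for e in exps]


z = [2.0, 0.0]
W = [[1.0, 0.0], [0.0, 1.0]]  # T x N with T = 2, N = 2
p1, p2 = dense_softmax(z, W, b=[0.0, 0.0])
print(p1 > p2)  # True
```

With `training=True` the mask is sampled afresh on every call, so repeated calls on the same input can give different probabilities.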

Step 3: CNN model training

The CNN model is trained using a labeled sequence matrix training set S.

(2) Define the hyper-parameters of the training process: the maximum number of iterations, the iteration stop threshold, the learning rate, and so on; the loss function can use cross entropy (comparing the output neighbor relation probability with the true neighbor relation):

$$\mathrm{Loss} = -\left(t \log(p_1) + (1 - t) \log(1 - p_1)\right)$$

where $t$ is the true value, i.e. the label of the sample, and $p_1$ is the probability that the neighbor relation exists, obtained in step 2;

(3) Input all sequence matrices of the labeled sequence matrix training set S into the model in turn, compute the loss value Loss from the true labels of the training samples and the model predictions, back-propagate layer by layer, and iteratively update the parameters of each layer by stochastic gradient descent;

Step 4: Neighbor relation prediction

Step 5: Post-processing

Claims

1. A neighbor relation prediction method based on a convolutional neural network, characterized in that the method comprises the following specific steps:

Step 1: Data pre-processing

(1) Construct an initial training sample set S': select a local area whose NCL has been manually marked and traverse the NCL of that area; add every cell pair whose mark field is 1, i.e. a pair confirmed to have a neighbor relation, to the training set as a positive sample, and add every cell pair that is not listed in the NCL and satisfies the following condition to the training set as a negative sample:

where $(x_1, y_1)$ and $(x_2, y_2)$ are the longitude and latitude of the base stations of the two cells, obtained from the attributes "station address longitude" and "station address latitude" of the two cell entities in the preliminarily constructed wireless network knowledge graph of the target area; $R_c = 6378137$; and MaxDis is a preset maximum neighbor distance;

(a) for a cell pair u and v, traversing the sampling point entities in the wireless network knowledge graph of the area, extracting all sampling points of cell u and cell v according to the 'cell ID' attribute field, and storing them in the sets N(u) and N(v) respectively;

(b) extracting from the sets N(u) and N(v) all sampling point pairs (e_u, e_v) that satisfy the following condition: e_u ∈ N(u), e_v ∈ N(v), and the 'terminal ID' attribute values of sampling points e_u and e_v are the same;

(c) extracting the reference signal received power of the sampling points from the MCS or MR data set and sorting the sampling point pairs, first by terminal ID in ascending order and then by the sum of the reference signal received powers of the two sampling points e_u and e_v in descending order, to obtain the following n ordered sampling point pairs: (e_u1, e_v1), (e_u2, e_v2), ..., (e_un, e_vn);

(d) constructing a node-linked sequence matrix A of size (3n+1) × d, wherein n is the maximum number of sampling point pairs that can be input and d is the maximum feature dimension;
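Sub-steps (a) to (c) above can be sketched in Python as follows. The record layout (dictionaries with the keys `cell_id`, `terminal_id` and `rsrp`) and the function name `ordered_sample_pairs` are assumptions made for illustration. Construction of the matrix A in sub-step (d) is not shown, because its row layout is not specified in this passage.

```python
def ordered_sample_pairs(samples, u, v):
    """Return the sampling point pairs of cells u and v, ordered as in (c).

    samples: list of dicts with hypothetical keys
             'cell_id', 'terminal_id' and 'rsrp' (reference signal received power).
    """
    # (a) collect the sampling points of each cell into N(u) and N(v)
    n_u = [s for s in samples if s["cell_id"] == u]
    n_v = [s for s in samples if s["cell_id"] == v]

    # (b) keep only pairs reported by the same terminal
    pairs = [(eu, ev) for eu in n_u for ev in n_v
             if eu["terminal_id"] == ev["terminal_id"]]

    # (c) sort by terminal ID ascending, then by summed RSRP descending
    pairs.sort(key=lambda p: (p[0]["terminal_id"],
                              -(p[0]["rsrp"] + p[1]["rsrp"])))
    return pairs
```

Negating the RSRP sum inside the sort key yields the descending secondary order while the terminal ID remains ascending.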

step 2: constructing CNN networks

(1) input: the node-linked sequence matrix A of size (3n+1) × d;

(2) convolution layer: the weight matrix of a convolution kernel is w ∈ R^(h×d), wherein R represents the set of real numbers and h is the height of the convolution kernel; a plurality of convolution kernels with different heights are set by varying h, and the width of the convolution kernel equals the feature dimension d; the convolution kernel w slides over the sequence matrix A from top to bottom with a step length of 3, and the sequence feature c_i is obtained by performing a convolution operation between the convolution kernel w and the matrix region x_{3i-2:3i+h-3}:

c_i = f(w·x_{3i-2:3i+h-3} + b)

where f is a non-linear activation function such as the hyperbolic tangent, and b ∈ R is a bias term;

(4) pooling layer: average pooling is used to down-sample the sequence features; each sequence feature yields one pooled sequence feature z_i, and finally m sequence features z = {z₁, z₂, …, z_m} are obtained;

wherein W_dense is a weight matrix of 2 x N; r introduces the Dropout operation and is a vector of 0s and 1s randomly generated with probability p, where p is 0.5; b_dense is a bias vector of N x 1;

(6) softmax layer: the input is y and the output is a 2 x 1 vector prob = [p₁, p₂] = softmax(y), where p₁ and p₂ respectively represent the probability that the two cells have a neighbor relation and the probability that they do not;
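The convolution, pooling and softmax layers of step 2 can be illustrated with a minimal pure-Python sketch. The function names are hypothetical, tanh is taken as the activation f, and average pooling is assumed to reduce each kernel's feature sequence to a single value. Dropout and the fully connected layer are omitted, so the vector passed to `softmax` in the example is only a stand-in for y.

```python
import math

def conv_features(A, w, b, stride=3):
    """c_i = tanh(w · x_{3i-2:3i+h-3} + b) for every window of height h in A."""
    h, d = len(w), len(w[0])
    feats = []
    # 0-based top row 3(i-1) corresponds to the 1-based row 3i-2 in the text
    for top in range(0, len(A) - h + 1, stride):
        s = sum(w[r][c] * A[top + r][c] for r in range(h) for c in range(d))
        feats.append(math.tanh(s + b))
    return feats

def average_pool(feats):
    """Assumed global average pooling: one value z_i per convolution kernel."""
    return sum(feats) / len(feats)

def softmax(y):
    """Numerically stable softmax over a list of scores."""
    m = max(y)
    exps = [math.exp(v - m) for v in y]
    total = sum(exps)
    return [e / total for e in exps]

# Toy input: n = 2 sampling point pairs, d = 2 features, so A is 7 x 2.
A = [[0.1 * r, 1.0] for r in range(7)]
w = [[0.5, -0.5]] * 4                   # one kernel of height h = 4
c = conv_features(A, w, b=0.0)          # windows start at rows 0 and 3
z = average_pool(c)
p1, p2 = softmax([z, -z])               # [z, -z] is a placeholder for y
```

With a step length of 3, a kernel of height h = 4 fits twice into the 7-row matrix, so `c` holds two sequence features before pooling.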

step 3: CNN model training

Training a CNN model by using a sequence matrix training set S with labels;

(2) defining the hyper-parameters of the training process: maximum number of iterations, iteration stop threshold and learning rate, wherein the loss function uses cross entropy;

(3) sequentially inputting all sequence matrices in the labeled sequence matrix training set S into the model, calculating the loss value Loss from the true labels of the training samples and the model prediction results, back-propagating layer by layer, and iteratively updating the parameters of each layer using stochastic gradient descent;

(4) when the change in the loss value between iterations is smaller than the set threshold, or the maximum number of iterations is reached, ending the training of the CNN model;
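The stopping rule in (4) can be sketched as a small training loop. The names `train_until_converged` and `step_fn` are hypothetical; `step_fn` stands for one training iteration (forward pass, loss calculation, back-propagation and parameter update) and is assumed to return the current loss value.

```python
def train_until_converged(step_fn, max_iters, stop_threshold):
    """Run step_fn until the loss change falls below stop_threshold
    or max_iters iterations have been performed.

    Returns (iterations performed, last loss value).
    """
    prev_loss = None
    for it in range(1, max_iters + 1):
        loss = step_fn()
        # Stop when the loss changed by less than the threshold
        if prev_loss is not None and abs(prev_loss - loss) < stop_threshold:
            return it, loss
        prev_loss = loss
    return max_iters, prev_loss

# Toy usage: a precomputed loss sequence stands in for real training steps.
toy_losses = iter([1.0, 0.5, 0.3, 0.29995, 0.2])
result = train_until_converged(lambda: next(toy_losses), 10, 1e-3)
```

In the toy sequence the loss changes by only 0.00005 between the third and fourth iterations, so the loop stops at the fourth iteration.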

step 4: neighbor relation prediction

Selecting a target area whose NCL has not been manually labeled, converting each cell pair to be predicted into a sequence matrix according to (a) to (d) in step 1, inputting the sequence matrices into the CNN model trained in step 3, judging from the output probability whether each cell pair has a neighbor relation, and marking cell pairs with a neighbor relation as Y and cell pairs without a neighbor relation as N;

step 5: post-processing

Updating the NCL of the target area according to the neighbor relation prediction results.
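Steps 4 and 5 can be illustrated with the following sketch. The text does not fix a decision threshold or the exact update policy, so two assumptions are made here: a cell pair is marked Y when p₁ exceeds p₂, and the update adds Y pairs to the NCL and removes N pairs from it. The names `label_pair` and `update_ncl` and the representation of the NCL as a set of (head cell ID, tail cell ID) tuples are hypothetical.

```python
def label_pair(p1, p2):
    """Mark a cell pair Y if a neighbor relation is the more probable outcome."""
    return "Y" if p1 > p2 else "N"

def update_ncl(ncl, predictions):
    """Return an updated copy of the NCL.

    ncl        : set of (head cell ID, tail cell ID) tuples
    predictions: dict mapping a cell pair to its softmax output (p1, p2)
    """
    updated = set(ncl)
    for pair, (p1, p2) in predictions.items():
        if label_pair(p1, p2) == "Y":
            updated.add(pair)        # assumed policy: add predicted neighbors
        else:
            updated.discard(pair)    # assumed policy: drop predicted non-neighbors
    return updated
```

Because p₁ and p₂ sum to 1, comparing them is equivalent to thresholding p₁ at 0.5; an operator could substitute a stricter threshold.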

2. The neighbor relation prediction method based on the convolutional neural network as claimed in claim 1, wherein: the input data of (c) in (2) of step 1 is an MCS or MR coverage sampling data set; the MCS data consists of data collected from a large number of user terminals, and the MR data consists of measurement information that is reported by the terminal devices under a base station and collected from the base station equipment.

3. The neighbor relation prediction method based on the convolutional neural network as claimed in claim 1, wherein: the wireless network knowledge graph of the target area preliminarily constructed in step 1 comprises at least base station entities, cell entities, terminal entities and sampling point entities together with their attributes, as well as membership relations, residence relations and association relations.

4. The neighbor relation prediction method based on the convolutional neural network as claimed in claim 1, wherein: the base station entity comprises at least the following attribute feature fields: base station ID, city, operator, network standard, large region ID, administrative region, site longitude, site latitude, number of cells and base station type; the cell entity comprises at least the following attribute feature fields: cell ID, city, operator, network standard, large region ID, base station ID, administrative region, station address longitude, station address latitude, base station type, direction angle, inclination angle, physical cell ID, frequency point number, coverage rate and coverage radius; the terminal entity comprises at least the following attribute feature fields: terminal ID, terminal brand, terminal model, operator and network standard; the sampling point entity comprises at least the following attribute feature fields: sampling point ID, terminal ID, sampling date, sampling time, longitude, latitude, city, administrative region, operator, network standard, large region ID, base station ID, cell ID, physical cell ID, frequency point number, reference signal received power, reference signal received quality, signal-to-interference-plus-noise ratio, terminal brand and terminal model; the graph data is stored in a graph database.

5. The method as claimed in claim 1, wherein the fields of the current actual neighbor cell list (NCL) of each base station in the target area comprise NCL ID, head cell ID, tail cell ID and label, wherein the label field records the manually labeled neighbor relation between cells in the local area: 1 indicates that a correct neighbor relation exists, and empty indicates that it has not been determined whether a correct neighbor relation exists.