CN113743050B - Article layout evaluation method, apparatus, electronic device and storage medium - Google Patents

Info

Publication number
CN113743050B
Authority
CN
China
Prior art keywords
node
nodes
feature
vector
article
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111044587.2A
Other languages
Chinese (zh)
Other versions
CN113743050A (en
Inventor
夏烽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202111044587.2A priority Critical patent/CN113743050B/en
Publication of CN113743050A publication Critical patent/CN113743050A/en
Application granted granted Critical
Publication of CN113743050B publication Critical patent/CN113743050B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/103Formatting, i.e. changing of presentation of documents
    • G06F40/106Display of layout of documents; Previewing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/258Heading extraction; Automatic titling; Numbering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Biomedical Technology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments relate to the technical field of artificial intelligence and disclose an article layout evaluation method, an article layout evaluation apparatus, an electronic device and a storage medium. The method comprises: acquiring an article to be evaluated; performing node division on the article to be evaluated to obtain a plurality of nodes, wherein the plurality of nodes comprise title nodes and information nodes, and the information nodes comprise at least one of a picture node, a video node and a text node; extracting a node feature vector of each node in the plurality of nodes; extracting a dependency feature vector of each node in the plurality of nodes, wherein the dependency feature vector of each node indicates the dependency relationship between that node and the other nodes in the plurality of nodes; and determining a layout score of the article to be evaluated according to the node feature vector and the dependency feature vector of each node in the plurality of nodes. The article layout evaluation method can improve the accuracy of article evaluation.

Description

Article layout evaluation method, apparatus, electronic device and storage medium
Technical Field
The application relates to the technical field of artificial intelligence, in particular to an article layout evaluation method, an article layout evaluation device, electronic equipment and a storage medium.
Background
In the self-media era, self-media articles can be published at will without review, so the Internet is flooded with low-value spun (rewritten) articles and clickbait articles, which wastes readers' time.
In the related art, articles are generally scored only in terms of content depth and content quality, that is, in terms of semantic features and literary style, so the evaluation of the articles is inaccurate and the reading experience of readers is affected.
Disclosure of Invention
The embodiment of the disclosure mainly aims to provide an article layout evaluation method, an article layout evaluation device, electronic equipment and a storage medium, so as to improve the accuracy of article evaluation.
To achieve the above object, a first aspect of an embodiment of the present disclosure proposes an article layout evaluation method, including:
acquiring an article to be evaluated;
performing node division on the article to be evaluated to obtain a plurality of nodes, wherein the plurality of nodes comprise title nodes and information nodes, and the information nodes comprise at least one of a picture node, a video node and a text node;
extracting a node feature vector of each node in the plurality of nodes, wherein the node feature vector indicates the features carried by that node;
extracting a dependency feature vector of each node in the plurality of nodes, wherein the dependency feature vector of each node indicates the dependency relationship between that node and the other nodes in the plurality of nodes;
and determining the layout score of the article to be evaluated according to the node feature vector and the dependency feature vector of each node in the plurality of nodes.
In some embodiments, determining the layout score of the article to be evaluated from the node feature vector of each node of the plurality of nodes and the dependency feature vector of each node comprises:
concatenating (splicing) the node feature vector of each node with the dependency feature vector of that node to obtain a fused feature vector of each node;
and determining the layout score of the article to be evaluated according to the fused feature vector of each node in the plurality of nodes.
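The splicing step described above can be sketched as a plain concatenation. This is a minimal illustration only; the function names `fuse` and `fuse_all`, the vector sizes and the toy values are assumptions, not taken from the patent.

```python
from typing import List

Vector = List[float]


def fuse(node_vec: Vector, dep_vec: Vector) -> Vector:
    """Splice one node feature vector with its dependency feature vector."""
    return list(node_vec) + list(dep_vec)


def fuse_all(node_vecs: List[Vector], dep_vecs: List[Vector]) -> List[Vector]:
    """Produce one fused feature vector per node."""
    if len(node_vecs) != len(dep_vecs):
        raise ValueError("each node needs exactly one vector of each kind")
    return [fuse(n, d) for n, d in zip(node_vecs, dep_vecs)]


# Hypothetical toy values: two nodes, 2-dim node features, 3-dim dependency features.
node_vecs = [[0.1, 0.9], [0.4, 0.2]]
dep_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
fused = fuse_all(node_vecs, dep_vecs)
```

Each fused vector has length equal to the sum of the two input lengths, so every node must use the same sub-vector sizes if the fused vectors are later fed to a single model.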
In some embodiments, determining a layout score for the article to be evaluated based on the fused feature vector for each of the plurality of nodes comprises:
performing graph convolution and pooling on the fused feature vector of each node using a first neural network model to obtain a first-type feature vector;
performing loop iteration processing on the fused feature vector of each node using a second neural network model to obtain a second-type feature vector, wherein the first neural network model and the second neural network model are different types of neural network models;
concatenating (splicing) the first-type feature vector and the second-type feature vector to obtain a spliced feature vector;
and determining the layout score of the article to be evaluated according to the spliced feature vector.
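A rough sketch of the two-branch scoring described above follows. It assumes a single graph-convolution layer with row-normalised adjacency and mean pooling for the first branch, a plain tanh recurrent update as a stand-in for the second (loop-iteration) branch, and a sigmoid over a weighted sum as the final score. None of these specific choices, nor the names and toy weights, come from the patent.

```python
import math
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def graph_conv_mean_pool(adj: Matrix, feats: Matrix, weight: Matrix) -> Vector:
    """One graph-convolution layer followed by mean pooling over nodes."""
    n = len(adj)
    # Add self-loops, then row-normalise so each node averages over its neighbourhood.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    a_norm = [[v / sum(row) for v in row] for row in a_hat]
    hidden = matmul(matmul(a_norm, feats), weight)
    hidden = [[max(0.0, v) for v in row] for row in hidden]  # ReLU
    return [sum(col) / n for col in zip(*hidden)]


def recurrent_pass(feats: Matrix, w_in: Matrix, w_rec: Matrix) -> Vector:
    """Iterate over the nodes in order and return the final hidden state."""
    state = [0.0] * len(w_rec)
    for x in feats:
        pre = [
            sum(xi * w_in[i][k] for i, xi in enumerate(x))
            + sum(sj * w_rec[j][k] for j, sj in enumerate(state))
            for k in range(len(state))
        ]
        state = [math.tanh(p) for p in pre]
    return state


def layout_score(adj, feats, w_gcn, w_in, w_rec, w_out) -> float:
    """Splice both branch outputs and map them to a score in (0, 1)."""
    spliced = graph_conv_mean_pool(adj, feats, w_gcn) + recurrent_pass(feats, w_in, w_rec)
    logit = sum(s * w for s, w in zip(spliced, w_out))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical toy inputs: 3 nodes in a chain, 2-dim fused features, 2-dim hidden sizes.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
w_gcn = [[0.2, -0.1], [0.4, 0.3]]
w_in = [[0.1, 0.2], [0.3, -0.2]]
w_rec = [[0.0, 0.1], [0.1, 0.0]]
w_out = [0.5, -0.5, 0.25, 0.25]
score = layout_score(adj, feats, w_gcn, w_in, w_rec, w_out)
```

In a real system the weights would be learned from labelled articles; here they are fixed only so that the data flow can be followed end to end.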
In some embodiments, extracting a dependency feature vector for each of a plurality of nodes includes:
in the case that a current node in the plurality of nodes is a picture node, extracting a picture annotation feature sub-vector of the current node from the node feature vector of the current node;
determining the dependency relationship between the current node and other nodes in the plurality of nodes according to the picture annotation feature sub-vector of the current node and the node feature vectors of the other nodes in the plurality of nodes;
and determining the dependency feature vector of the current node according to the dependency relationship between the current node and other nodes in the plurality of nodes.
In some embodiments, determining the dependency relationship between the current node and the other nodes of the plurality of nodes according to the picture annotation feature sub-vector of the current node and the node feature vectors of the other nodes of the plurality of nodes comprises:
extracting semantic features from the picture annotation feature sub-vector to obtain a picture annotation field;
performing feature extraction on the picture annotation field and the node feature vectors of the other nodes in the plurality of nodes to obtain association values between the picture annotation feature sub-vector and the node feature vectors of the other nodes in the plurality of nodes;
if the association value between the picture annotation feature sub-vector of the current node and the node feature vector of another node in the plurality of nodes exceeds a preset association threshold, determining that the dependency relationship between the current node and that other node is a supplementary relationship;
if the association value between the picture annotation feature sub-vector of the current node and the node feature vector of another node in the plurality of nodes does not exceed the preset association threshold, determining that the dependency relationship between the current node and that other node is a linear relationship.
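The threshold rule above can be illustrated with a small sketch. The patent does not say how the association value is computed; cosine similarity is used here purely as an assumed stand-in, and the threshold of 0.5 and the function names are likewise hypothetical.

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity, used here as an assumed association value."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relation(annotation_vec: List[float], other_vec: List[float],
             threshold: float = 0.5) -> str:
    """Classify the dependency as 'supplementary' or 'linear' by thresholding."""
    if cosine(annotation_vec, other_vec) > threshold:
        return "supplementary"
    return "linear"


# Hypothetical toy vectors: the picture annotation matches the first text node only.
picture_annotation = [1.0, 0.0]
related_text = [1.0, 0.0]
unrelated_text = [0.0, 1.0]
labels = [relation(picture_annotation, v) for v in (related_text, unrelated_text)]
```

A value exactly equal to the threshold falls on the "does not exceed" side, matching the wording of the two branches.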
In some embodiments, the plurality of nodes includes a first node, the first node being any one of the plurality of nodes;
in the case that the first node is a text node, the node feature vector of the first node includes: a character count feature sub-vector and a character position feature sub-vector;
in the case that the first node is a picture node, the node feature vector of the first node includes: a picture annotation feature sub-vector, a picture position feature sub-vector, a picture pixel feature sub-vector, and a picture size feature sub-vector;
in the case that the first node is a video node, the node feature vector of the first node includes: a video length feature sub-vector and a video position feature sub-vector;
in the case that the first node is a title node, the node feature vector of the first node includes: a title position feature sub-vector.
In some embodiments, the title node comprises: an article title node, a large title node, and a small title node.
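The per-type composition of the node feature vector can be written down as a small lookup table. The sub-vector names follow the text above; the fixed concatenation order, the key spellings and the helper `build_node_vector` are assumptions made for illustration.

```python
from enum import Enum
from typing import Dict, List


class NodeType(Enum):
    TITLE = "title"
    TEXT = "text"
    PICTURE = "picture"
    VIDEO = "video"


# Sub-vectors that make up the node feature vector for each node type.
SUB_VECTORS = {
    NodeType.TEXT: ("character_count", "character_position"),
    NodeType.PICTURE: ("picture_annotation", "picture_position",
                       "picture_pixel", "picture_size"),
    NodeType.VIDEO: ("video_length", "video_position"),
    NodeType.TITLE: ("title_position",),
}


def build_node_vector(node_type: NodeType,
                      sub_vectors: Dict[str, List[float]]) -> List[float]:
    """Concatenate the sub-vectors of a node in the fixed order for its type."""
    vector: List[float] = []
    for name in SUB_VECTORS[node_type]:
        vector.extend(sub_vectors[name])  # KeyError if a required sub-vector is missing
    return vector


# Hypothetical values: a 12-second video placed 40% of the way through the article.
video_vec = build_node_vector(
    NodeType.VIDEO, {"video_length": [12.0], "video_position": [0.4]}
)
```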
To achieve the above object, a second aspect of the present disclosure proposes an article layout evaluation device, including:
the article acquisition module is used for acquiring articles to be evaluated;
the node dividing module is used for carrying out node division on the article to be evaluated so as to obtain a plurality of nodes, wherein the plurality of nodes comprise title nodes and information nodes, and the information nodes comprise: at least one of a picture node, a video node and a text node;
the node feature extraction module is used for extracting a node feature vector of each node in the plurality of nodes, wherein the node feature vector indicates the features carried by that node;
a dependency feature extraction module, configured to extract a dependency feature vector of each node in the plurality of nodes, where the dependency feature vector of each node is used to indicate a dependency relationship between each node and other nodes in the plurality of nodes;
and the scoring module is used for determining the layout score of the article to be evaluated according to the node feature vector and the dependency feature vector of each node in the plurality of nodes.
To achieve the above object, a third aspect of the present disclosure proposes an electronic device including:
at least one memory;
at least one processor;
at least one program;
the program is stored in the memory, and the processor executes at least one program to implement:
the method of the first aspect as described above.
To achieve the above object, a fourth aspect of the present disclosure proposes a storage medium that is a computer-readable storage medium storing computer-executable instructions for causing a computer to execute:
the method of the first aspect as described above.
According to the article layout evaluation method, apparatus, electronic device and storage medium provided by the present disclosure, when an article is evaluated, the article to be evaluated is divided into a plurality of nodes, wherein the plurality of nodes comprise a title node and at least one of a picture node, a video node and a text node; a node feature vector of each of the plurality of nodes is then extracted, and a dependency feature vector of each of the plurality of nodes is extracted, wherein the dependency feature vector indicates the dependency relationship between each node and the other nodes in the plurality of nodes; the layout score of the article to be evaluated is then determined according to the node feature vector and the dependency feature vector of each node. Because both the node feature vector and the dependency feature vector of each node are considered, the article is evaluated from two different angles, which improves the accuracy of article evaluation.
Drawings
Fig. 1 is a flowchart of an article layout evaluation method according to an embodiment of the present application.
Fig. 2 is a flowchart of step S500 in fig. 1.
Fig. 3 is a flowchart of step S520 in fig. 2.
Fig. 4 is a schematic diagram of an application scenario of the article layout evaluation method according to the embodiment of the present application.
Fig. 5 is a flowchart of step S300 in fig. 1.
Fig. 6 is a flowchart of step S320 in fig. 5.
Fig. 7 is a block diagram of an article layout evaluation device according to an embodiment of the present application.
Fig. 8 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the disclosure.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
It should be noted that although functional block division is performed in a device diagram and a logic sequence is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the block division in the device, or in the flowchart. The terms first, second and the like in the description and in the claims and in the above-described figures, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
First, several terms used in the present application are explained:
artificial intelligence (artificial intelligence, AI): is a new technical science for researching and developing theories, methods, technologies and application systems for simulating, extending and expanding the intelligence of people; artificial intelligence is a branch of computer science that attempts to understand the nature of intelligence and to produce a new intelligent machine that can react in a manner similar to human intelligence, research in this field including robotics, language recognition, image recognition, natural language processing, and expert systems. Artificial intelligence can simulate the information process of consciousness and thinking of people. Artificial intelligence is also a theory, method, technique, and application system that utilizes a digital computer or digital computer-controlled machine to simulate, extend, and expand human intelligence, sense the environment, acquire knowledge, and use knowledge to obtain optimal results.
Natural language processing (natural language processing, NLP): NLP is a branch of artificial intelligence and an interdisciplinary field between computer science and linguistics, often referred to as computational linguistics; it concerns the processing, understanding and application of human languages (e.g., Chinese, English, etc.). Natural language processing includes syntactic parsing, semantic analysis, discourse understanding, and the like. Natural language processing is commonly used in technical fields such as machine translation, handwritten and printed character recognition, speech recognition and text-to-speech conversion, information retrieval, information extraction and filtering, text classification and clustering, and public opinion analysis and opinion mining, and it involves data mining, machine learning, knowledge acquisition, knowledge engineering, artificial intelligence research, linguistic research related to language computation, and the like.
Medical cloud (Medical cloud): a medical and health service cloud platform built by combining medical technology with new technologies such as cloud computing, mobile technology, multimedia, 4G communication, big data and the Internet of Things, so that medical resources are shared and the reach of medical services is expanded. Because cloud computing technology is applied, the medical cloud improves the efficiency of medical institutions and makes it more convenient for residents to seek medical care. Services such as appointment registration, electronic medical records and medical insurance in existing hospitals are products of combining cloud computing with the medical field, and the medical cloud also has the advantages of data security, information sharing, dynamic expansion and overall layout.
Optical character recognition (optical character recognition, OCR): a process in which an electronic device (e.g., a scanner or a digital camera) examines characters printed on paper and then converts their shapes into computer text using a character recognition method; that is, printed material is scanned, and the resulting image file is analyzed and processed to obtain text and layout information.
Convolutional neural networks (Convolutional Neural Networks, CNN) are a type of feedforward neural network (Feedforward Neural Networks, FNN) that contains convolutional computations and has a deep structure, and are one of the representative algorithms of deep learning. The convolutional neural network is also called as a Shift-invariant artificial neural network (Shift-Invariant Artificial Neural Networks, SIANN) because of its ability to perform Shift-invariant classification (Shift-invariant classification). The artificial neuron of the convolutional neural network can respond to a part of surrounding units in the coverage area, and has excellent performance for large-scale image processing. Along with the development of deep learning theory and the improvement of numerical computing equipment, convolutional neural networks are rapidly developed and are applied to the fields of computer vision, natural language processing and the like. A Convolutional Neural Network (CNN) model generally comprises the following parts: input layer, hidden layer and output layer.
Input layer: the input of the whole network.
The hidden layers of the convolutional neural network comprise convolutional layers, pooling layers and fully-connected layers, wherein the convolutional layers and the pooling layers are specific to convolutional neural networks.
Convolution layer (convolutional layer): the function of the convolution layer is to perform feature extraction on the input data, and the convolution layer internally contains a plurality of convolution kernels, wherein each element forming the convolution kernels corresponds to a weight coefficient and a bias vector, and is similar to a neuron (neuron) of a feedforward neural network. The convolutional layer parameters include convolutional kernel size, step size, and padding.
Convolution kernel: for a part of the area in the input image, a weighted average process is performed, wherein the weight of the process is defined by a function, which is a convolution kernel. The convolution kernel is the core of the whole network, and the process of training the CNN is the process of continuously updating the parameters of the convolution kernel until the parameters are optimal.
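As a small worked example of the weighted processing a convolution kernel performs, the sketch below slides a kernel over a 2-D input without padding and with a stride of 1. It computes cross-correlation (no kernel flip), which is the convention in most deep learning libraries; the function name and toy values are illustrative.

```python
from typing import List

Grid = List[List[float]]


def conv2d_valid(image: Grid, kernel: Grid) -> Grid:
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy 3x3 input and a 2x2 kernel that adds each pixel to its lower-right neighbour.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
feature_map = conv2d_valid(image, kernel)  # [[6, 8], [12, 14]]
```

During training, the kernel entries are the parameters that get updated; the sliding-window arithmetic itself stays the same.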
Pooling layer: after feature extraction by the convolution layer, the output feature map is passed to the pooling layer for feature selection and information filtering. The pooling layer contains a predefined pooling function whose role is to replace the value at a single point in the feature map with a statistic of the feature map over its neighboring region. The pooling layer is introduced to reduce the dimension of and abstract the visual input in a manner that imitates the human visual system. It generally has the following effects:
First, feature invariance: with pooling, the model pays more attention to whether certain features exist than to their exact locations.
Second, feature dimension reduction: pooling is equivalent to a reduction of the spatial dimensions, allowing the model to extract features over a wider range. At the same time, the input size of the next layer is reduced, which in turn reduces the amount of computation and the number of parameters.
Third, overfitting is mitigated to some extent, and optimization becomes easier.
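The pooling function described above can be illustrated with non-overlapping max pooling, one common choice of statistic. The window size of 2 and the toy feature map are assumptions for the example.

```python
from typing import List

Grid = List[List[float]]


def max_pool2d(feature_map: Grid, size: int = 2) -> Grid:
    """Replace each non-overlapping size x size block with its maximum value."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Toy 4x4 feature map reduced to 2x2.
fmap = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
pooled = max_pool2d(fmap)  # [[4, 2], [2, 8]]
```

The output keeps the strongest response in each block but not its exact position inside the block, which is the feature invariance effect listed first above; the halved width and height correspond to the dimension reduction listed second.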
Fully-connected layer (fully-connected layer): the fully-connected layer is typically built as the last part of the hidden layers of the convolutional neural network and only passes signals to other fully-connected layers. The feature map loses its 3-dimensional structure in the fully-connected layer: it is flattened into a vector and passed to the next layer through an activation function.
Output layer: the layer upstream of the output layer of a convolutional neural network is typically a fully-connected layer, and the output layer outputs class labels using a logistic function or a normalized exponential function (softmax function).
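For reference, the normalized exponential (softmax) function mentioned above can be written in a few lines. Subtracting the maximum logit before exponentiating is a standard numerical-stability step that does not change the result; the toy logits are illustrative.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Map raw scores to probabilities that are positive and sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy logits for three classes; the predicted label is the index of the largest probability.
probs = softmax([1.0, 2.0, 3.0])
predicted_class = probs.index(max(probs))
```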
The GRU (Gated Recurrent Unit) neural network is one type of recurrent neural network (Recurrent Neural Network, RNN). A recurrent neural network is a neural network that can learn the spatiotemporal relationships among the elements of a sequence. RNNs and their variants are commonly used for sequence modeling in the field of natural language processing and can be readily extended to tasks such as text classification, sequence labeling and machine translation.
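A single GRU update step, in one commonly used formulation, is sketched below with biases omitted for brevity. The parameter names (`Wz`, `Uz`, and so on) and the all-zero toy parameters are assumptions for illustration; the patent does not give the GRU equations.

```python
import math
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def gru_step(x: Vector, h: Vector, p: Dict[str, Matrix]) -> Vector:
    """One GRU update: gates decide how much of the old state to keep."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]  # update gate
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]  # reset gate
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], reset_h))]  # candidate state
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


# Hypothetical all-zero parameters: both gates equal 0.5 and the candidate is 0,
# so the new state is exactly half of the old state.
zeros = [[0.0, 0.0], [0.0, 0.0]]
params = {name: zeros for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
new_state = gru_step([1.0, 1.0], [2.0, -4.0], params)  # [1.0, -2.0]
```

Applying `gru_step` to each element of a sequence in turn, carrying the state forward, is the loop iteration that lets the network relate earlier and later elements.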
The graph convolutional neural network (Graph Convolutional Network, GCN) is a method that enables deep learning on graph data. Conventional convolutional neural networks (CNNs) are one of the most successful applications of deep learning, but they are mainly limited to Euclidean data (data with a regular spatial structure); for example, an image is a regular square grid. In graph data, both the feature information and the structural information of the nodes need to be considered, and if they are extracted by manual rules, many implicit and complex patterns are lost. In order to learn the feature information and the structural information of a graph at the same time, the graph convolutional neural network was proposed to perform feature extraction on graphs.
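For orientation, one widely used layer-wise propagation rule for graph convolution is given below. It is a common formulation from the literature and is shown only as an assumption about what a GCN layer may look like; the patent does not specify the rule it uses. Here $A$ is the adjacency matrix of the graph, $I$ the identity matrix, $H^{(l)}$ the node features at layer $l$, $W^{(l)}$ a learnable weight matrix and $\sigma$ an activation function.

```latex
\tilde{A} = A + I, \qquad
\tilde{D}_{ii} = \sum_{j} \tilde{A}_{ij}, \qquad
H^{(l+1)} = \sigma\!\left( \tilde{D}^{-1/2}\, \tilde{A}\, \tilde{D}^{-1/2}\, H^{(l)}\, W^{(l)} \right)
```

The normalized adjacency term mixes each node's features with those of its neighbours (structural information), while $H^{(l)} W^{(l)}$ transforms the features themselves (feature information), so both are learned together.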
The embodiments of the application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operating/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.
Based on the above, the embodiment of the application provides an article layout evaluation method, an article layout evaluation device, electronic equipment and a storage medium, so as to improve the accuracy of article evaluation.
The embodiment of the application provides an article layout evaluation method, which relates to the technical field of artificial intelligence. The article layout evaluation method provided by the embodiment of the application can be applied to a terminal, to a server, or to software running in the terminal or the server. In some embodiments, the terminal may be a smart phone, a tablet, a notebook computer, a desktop computer, a smart watch, or the like; the server may be configured as an independent physical server, as a server cluster or distributed system formed by a plurality of physical servers, or as a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and big data and artificial intelligence platforms; the software may be an application or the like that implements the article layout evaluation method, but is not limited to the above forms.
The evaluation process of the article layout evaluation method of the present application is described in detail below with reference to fig. 1.
As shown in fig. 1, in a first aspect, some embodiments of the present application provide an article layout evaluation method, including but not limited to step S100, step S200, step S300, step S400, and step S500.
Step S100: acquiring an article to be evaluated;
Specifically, the article to be evaluated may be an article uploaded by a reader or an article reviewer, or an article from an official account (public account); the article may be a paper, a composition, a news media report, a medical popular-science article, a medicine-related paper, or the like, wherein the medical popular-science article or the medicine-related paper may be obtained from a medical cloud server.
Step S200: performing node division on the article to be evaluated to obtain a plurality of nodes, wherein the plurality of nodes comprise a title node and an information node, and the information node comprises at least one of a picture node, a video node and a text node;
For example, an article T1 of a medical popular-science type includes titles, text, pictures, videos and the like. Performing node division on article T1 yields a number of title nodes, text nodes, picture nodes and video nodes, wherein the numbers of picture nodes, video nodes and title nodes depend on the numbers of pictures, videos and titles in article T1, and the division of text nodes and title nodes depends on the specific writing order. For instance, article T1 includes 3 pictures, 1 video, 1 article title and 3 large titles, so node division yields 3 picture nodes, 1 video node and 4 title nodes. For text nodes, each paragraph may be taken as one text node, or other forms of division may be used, which is not specifically limited in the present application.
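The T1 example can be reproduced with a small sketch. It assumes the article has already been parsed into an ordered list of (kind, content) elements, which the patent does not describe; the `Node` class, the `divide_nodes` helper and the placeholder contents are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    index: int    # position of the node in the writing order of the article
    kind: str     # "title", "text", "picture" or "video"
    content: str


def divide_nodes(elements: List[Tuple[str, str]]) -> List[Node]:
    """Turn an ordered list of (kind, content) elements into nodes."""
    allowed = {"title", "text", "picture", "video"}
    nodes = []
    for i, (kind, content) in enumerate(elements):
        if kind not in allowed:
            raise ValueError(f"unknown element kind: {kind}")
        nodes.append(Node(i, kind, content))
    return nodes


# Hypothetical stand-in for article T1: 1 article title, 3 large titles,
# 3 pictures, 1 video, and one text node per paragraph.
article_t1 = [
    ("title", "article title"),
    ("title", "large title 1"), ("text", "paragraph 1"), ("picture", "picture 1"),
    ("title", "large title 2"), ("text", "paragraph 2"), ("picture", "picture 2"),
    ("video", "video 1"),
    ("title", "large title 3"), ("text", "paragraph 3"), ("picture", "picture 3"),
]
nodes = divide_nodes(article_t1)
counts = Counter(node.kind for node in nodes)
```

With this input, `counts` reports 3 picture nodes, 1 video node and 4 title nodes, matching the numbers in the example.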
Step S300: extracting a node characteristic vector of each node in the plurality of nodes, wherein the node characteristic vector is used for indicating the characteristics carried by each node;
Step S400: extracting a dependency feature vector of each node in the plurality of nodes, wherein the dependency feature vector of each node is used for indicating the dependency relationship between each node and other nodes in the plurality of nodes;
Specifically, in the above steps S300 and S400, the node feature vector and the dependency feature vector of each node are acquired. The node feature vector of a node reflects features carried by the node itself and can be obtained by extracting those features.
For the dependency feature vector of each node, the article to be evaluated is converted into a picture to obtain a picture to be evaluated, and feature extraction is then performed on the picture to be evaluated using a neural network to obtain the dependency feature vectors of the nodes. When a neural network is used to process the picture to be evaluated, a graph convolutional neural network (GCN) model may be used, wherein the GCN may be set to 2 layers or another number of layers, which is not limited in the present application.
After node division, feature extraction is performed on the nodes to obtain a node feature vector representing the features of each node, and dependency feature extraction is performed on each node to obtain a dependency feature vector of each node, wherein the dependency feature vector indicates the dependency relationship between that node and the other nodes in the plurality of nodes. For example, the layout of an article T2 is: article title, large title, small title (subtitle), first text description, first picture, second text description. After node division is performed on article T2, feature extraction yields node feature vectors and dependency feature vectors, each node having one node feature vector and one dependency feature vector. Taking the subtitle as an example, the subtitle is assigned to the title nodes during node division, and its node feature vector and dependency feature vector are obtained during feature extraction. The node feature vector characterizes the subtitle's own features, such as the position of the subtitle in article T2. The dependency feature vector represents its association with other nodes; for example, the subtitle contains the first text description, the first picture and the second text description, so the dependency feature vector of the subtitle node indicates the inclusion relationship between the subtitle and these three nodes.
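The inclusion relationship in the T2 example can be made concrete with a rule-based sketch. Note that this is not the extraction method described above (which uses a neural network on a rendered picture); it only illustrates what "the subtitle contains the three nodes" means as a data structure. The numeric title levels and the `containment` helper are assumptions.

```python
from typing import Dict, List, Tuple

CONTENT_LEVEL = 3  # assumed level for text, picture and video nodes


def containment(nodes: List[Tuple[str, int]]) -> Dict[int, List[int]]:
    """Map each node index to the indices of the nodes it directly contains.

    Each node is (name, level): 0 = article title, 1 = large title,
    2 = small title, CONTENT_LEVEL = information node.
    """
    children: Dict[int, List[int]] = {i: [] for i in range(len(nodes))}
    open_titles: List[int] = []  # stack of title nodes that can still contain others
    for i, (_, level) in enumerate(nodes):
        # A title closes every open title of the same or a deeper level.
        while open_titles and nodes[open_titles[-1]][1] >= level:
            open_titles.pop()
        if open_titles:
            children[open_titles[-1]].append(i)
        if level < CONTENT_LEVEL:
            open_titles.append(i)
    return children


# Layout of article T2 as listed in the text.
article_t2 = [
    ("article title", 0),
    ("large title", 1),
    ("small title", 2),
    ("first text description", CONTENT_LEVEL),
    ("first picture", CONTENT_LEVEL),
    ("second text description", CONTENT_LEVEL),
]
contains = containment(article_t2)
```

Here `contains[2]` lists nodes 3, 4 and 5, that is, the two text descriptions and the picture placed under the small title.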
Step S500: determining the layout score of the article to be evaluated according to the node feature vector of each node and the dependency feature vector of each node in the plurality of nodes.
A good article is reflected not only in its semantics and wording, but also in the connections between paragraphs, the relationships between sentences within a paragraph, illustrations placed at suitable positions, and so on. In the related art, sentences or paragraphs are scored only on the depth and quality of the article's content. Even when the semantic features of each sentence or paragraph are sufficient and the wording is rich and elegant, the connections between sentences or paragraphs can easily be mismatched, which affects the reader's reading experience. Therefore, an embodiment of the present application provides an article layout evaluation method. When an article is evaluated, the article to be evaluated is divided into a plurality of nodes, where the plurality of nodes include a title node and at least one of a picture node, a video node and a text node. A node feature vector and a dependency feature vector are then extracted for each of the plurality of nodes, where the dependency feature vector is used to indicate the dependency relationship between the node and the other nodes in the plurality of nodes. The layout score of the article to be evaluated is then determined according to the node feature vector and the dependency feature vector of each node. Because both the node feature vector and the dependency feature vector of each node are considered, the article is evaluated from two different angles, which improves the accuracy of article evaluation.
In some embodiments, the title node comprises: an article title node, a large title node and a subtitle node.
Specifically, each article generally has an article title, a plurality of large titles under the article title, and a plurality of subtitles under each large title. Dividing the article to be evaluated into nodes therefore yields an article title node, a plurality of large title nodes and a plurality of subtitle nodes. For example, in a paper, the title of the paper is the article title, and the introduction, abstract, appendix and so on are large titles; when the paper is divided into nodes, the title of the paper is assigned to the article title node, and the introduction, abstract, appendix and so on are assigned to large title nodes.
In some embodiments, the plurality of nodes includes a first node, the first node being any one of the plurality of nodes;
in the case that the first node is a text node, the node feature vector of the first node includes: character number feature sub-vectors and character position feature sub-vectors;
in the case that the first node is a picture node, the node feature vector of the first node includes: a picture annotation feature sub-vector, a picture position feature sub-vector, a picture pixel feature sub-vector and a picture size feature sub-vector;
In the case that the first node is a video node, the node feature vector of the first node includes: video length feature sub-vectors and video position feature sub-vectors;
in the case that the first node is a title node, the node feature vector of the first node includes: a title position feature sub-vector.
An article may be divided into a title node and at least one of a picture node, a video node and a text node. The title nodes comprise an article title node, a big title node and a subtitle node.
The node feature vectors of the respective nodes are briefly described below.
The node feature vector of the title node includes a title position feature sub-vector, which is used to indicate the specific position of the title in the article.
The node feature vector of the text node includes a character number feature sub-vector and a character position feature sub-vector. The character number feature sub-vector is used to indicate the number of characters of the text in the article; for example, if a text node has 500 characters, the character number feature sub-vector of that text node represents 500 characters. The character position feature sub-vector is used to indicate the specific position of the text in the article; for example, if a text node is located at the end of the article, the character position feature sub-vector of that text node indicates that the node is at the end of the article.
The node feature vector of the picture node includes a picture annotation feature sub-vector, a picture position feature sub-vector, a picture pixel feature sub-vector and a picture size feature sub-vector. The picture annotation feature sub-vector is used to indicate content information of the picture, such as "volt-ampere characteristic curve"; the picture position feature sub-vector is used to indicate the specific position of the picture in the article; the picture pixel feature sub-vector is used to indicate pixel information of the picture; and the picture size feature sub-vector is used to indicate size information of the picture.
The node feature vector of the video node includes a video position feature sub-vector and a video length feature sub-vector. The video position feature sub-vector is used to indicate the specific position of the video node in the article, and the video length feature sub-vector is used to indicate the content length of the video.
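By way of illustration only, the per-type sub-vectors described above could be packed into one fixed-length node feature vector as sketched below. The slot names, their order, the zero-filling of unused slots, and the reduction of the picture annotation sub-vector to a single flag are all assumptions made for this sketch and are not specified by the present application.

```python
# Hypothetical slot layout shared by all node types (an assumption of this sketch).
SLOTS = ("position", "char_count", "pixel_count", "width", "height",
         "video_seconds", "has_annotation")

# Which slots each node type is allowed to fill, following the lists in the text.
ALLOWED = {
    "title": {"position"},
    "text": {"position", "char_count"},
    "picture": {"position", "pixel_count", "width", "height", "has_annotation"},
    "video": {"position", "video_seconds"},
}


def node_feature_vector(kind, **features):
    """Build a fixed-length vector; slots the node type does not use stay 0.0."""
    if kind not in ALLOWED:
        raise ValueError(f"unknown node kind: {kind}")
    unexpected = set(features) - ALLOWED[kind]
    if unexpected:
        raise ValueError(f"{kind} node does not use: {sorted(unexpected)}")
    return [float(features.get(slot, 0.0)) for slot in SLOTS]


# A text node with 500 characters located near the end of the article
# (position expressed as a fraction of the article length, also an assumption).
text_vec = node_feature_vector("text", position=0.9, char_count=500)
```

Using one shared layout keeps every node feature vector at the same dimension n, which the later splicing step relies on.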
The dependency relationships between each node and the other nodes in the plurality of nodes include, but are not limited to: a parallel relationship, a parent-child inclusion relationship, a supplementary relationship, a linear relationship, and so on, where the linear relationship is the front, middle and rear connection between nodes. For example, the large title nodes include a large title node A and a large title node B; the large title node A includes a subtitle node a, and the subtitle node a includes a text node b, a text node c and a text node d. The article title node and the large title node A, and the article title node and the large title node B, are each in a parent-child inclusion relationship; the large title node A and the large title node B are in a parallel relationship; the large title node A and the subtitle node a are in a parent-child inclusion relationship; and the text node b, the text node c and the text node d are in a linear relationship.
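The example relationships above can be sketched as a small tree from which the relation triples are derived. This is a minimal illustration only: the `Node` class, the `relations` function and the rules used to assign each relation (sibling titles are parallel, adjacent sibling content nodes are linear) are assumptions of this sketch, and the supplementary relationship is not covered here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str  # "article_title", "large_title", "subtitle", "text", "picture", "video"
    children: list = field(default_factory=list)


TITLE_KINDS = ("large_title", "subtitle")
CONTENT_KINDS = ("text", "picture", "video")


def relations(root):
    """Return (node_a, node_b, relation) triples for the whole tree."""
    triples = []
    stack = [root]
    while stack:
        parent = stack.pop()
        kids = parent.children
        # A node and each node directly under it: parent-child inclusion.
        for child in kids:
            triples.append((parent.name, child.name, "parent-child"))
        # Sibling titles at the same level: parallel.
        titles = [c for c in kids if c.kind in TITLE_KINDS]
        for i in range(len(titles)):
            for j in range(i + 1, len(titles)):
                triples.append((titles[i].name, titles[j].name, "parallel"))
        # Adjacent sibling content nodes: linear (front-to-rear order).
        content = [c for c in kids if c.kind in CONTENT_KINDS]
        for front, rear in zip(content, content[1:]):
            triples.append((front.name, rear.name, "linear"))
        stack.extend(kids)
    return triples


sub_a = Node("a", "subtitle", [Node("b", "text"), Node("c", "text"), Node("d", "text")])
article = Node("article", "article_title",
               [Node("A", "large_title", [sub_a]), Node("B", "large_title")])
rels = relations(article)
```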
The process of determining the layout scores of the articles to be evaluated in S500 described above is described in detail below with reference to fig. 2.
As shown in fig. 2, step S500 includes step S510 and step S520. These two steps are described in detail below. It should be understood that S500 includes, but is not limited to, step S510 and step S520 in the present application.
Step S510: splicing the node feature vector of each node and the dependency feature vector of each node to obtain a fusion feature vector of each node;
step S520: determining the layout score of the article to be evaluated according to the fusion feature vector of each node in the plurality of nodes.
Specifically, the node feature vector and the dependency feature vector of each node are spliced to obtain the fusion feature vector of that node. The fusion feature vector therefore contains both the node's own features and the dependency relationships between the node and the other nodes. Determining the layout score of the article to be evaluated from the fusion feature vectors evaluates the article from two different angles and improves the accuracy of article evaluation.
As shown in fig. 3, step S520 includes step S521, step S522, step S523, and step S524. These four steps are described in detail below. It should be understood that S520 in the present application includes, but is not limited to, step S521, step S522, step S523, and step S524.
Step S521: carrying out graph convolution and pooling processing on the fusion feature vector of each node by using a first neural network model to obtain a first type feature vector;
step S522: performing loop iteration processing on the fusion feature vector of each node by using a second neural network model to obtain a second class feature vector, wherein the first neural network model and the second neural network model belong to different types of neural network models;
step S523: performing splicing processing on the first type of feature vectors and the second type of feature vectors to obtain spliced feature vectors;
step S524: determining the layout score of the article to be evaluated according to the spliced feature vector.
The fusion feature vector of each node is processed by two different neural network models. The first neural network model performs graph convolution and pooling on the fusion feature vectors to obtain a first-type feature vector, and the second neural network model performs loop iteration on the fusion feature vectors to obtain a second-type feature vector. Because different neural network models capture different characteristics, extracting features from the fusion feature vectors with two neural network models can improve the accuracy of article evaluation.
In some embodiments of the application, the first neural network model is a convolutional neural network (CNN) model and the second neural network model is a gated recurrent unit (GRU) model, which is a type of recurrent neural network.
Specifically, the node feature vector of each node may be represented as an n-dimensional feature vector. After the article to be evaluated is converted into a picture to be evaluated and graph convolution is performed on the picture to be evaluated, the dependency feature vector of each node may be represented as an m-dimensional feature vector. Splicing the node feature vector and the dependency feature vector of each node yields the fusion feature vector of that node, which may be represented as an (m+n)-dimensional feature vector. The fusion feature vectors are denoted A1, A2, A3, …, Ak, where Ai ∈ R^(m+n) is the fusion feature vector of the i-th node and k is the number of nodes; for example, A1 represents the fusion feature vector of the first node, A2 represents the fusion feature vector of the second node, and Ak represents the fusion feature vector of the k-th node. The fusion feature vectors of all nodes can be represented as a k × (m+n) feature matrix.
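As a minimal sketch of the splicing step, assuming the feature vectors are held as plain Python lists, the k × (m+n) feature matrix can be built by concatenating each node's two vectors. The function name `fuse` and the sample values are hypothetical.

```python
def fuse(node_vecs, dep_vecs):
    """Concatenate each node's n-dim node vector with its m-dim dependency vector."""
    if len(node_vecs) != len(dep_vecs):
        raise ValueError("each node needs exactly one vector of each kind")
    return [list(nv) + list(dv) for nv, dv in zip(node_vecs, dep_vecs)]


# k = 3 nodes, n = 2 node features, m = 3 dependency features (made-up numbers).
node_vecs = [[0.0, 1.0], [0.5, 0.2], [0.9, 0.4]]
dep_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
fused = fuse(node_vecs, dep_vecs)  # 3 rows, each of length m + n = 5
```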
Convolutional neural network (CNN) processing: 1-dimensional graph convolution is performed on the fusion feature vectors A1, A2, A3, …, Ak, where the number of convolution kernels is f. The convolution is followed by a max pooling layer, which converts the k × (m+n) feature matrix into a 1 × f feature matrix, giving the first-type feature vector. Here Ai represents the fusion feature vector of any node of the article. In other words, the node feature vectors and dependency feature vectors of all nodes of the article are spliced to obtain the fusion feature vector of each node, 1-dimensional graph convolution is performed on all the fusion feature vectors, and a max pooling layer is then applied to obtain the first-type feature vector. For example, an article T has 3 nodes: an article title node, a picture node and a text node. Feature extraction on article T yields the node feature vector and dependency feature vector of the article title node, of the picture node, and of the text node. The node feature vectors and the dependency feature vectors are then spliced to obtain the fusion feature vector of the article title node, the fusion feature vector of the picture node and the fusion feature vector of the text node. The fusion feature vectors of the 3 nodes are then convolved using the CNN to obtain the first-type feature vector of article T.
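The reduction from a k × (m+n) matrix to a 1 × f vector can be sketched as an ordinary 1-dimensional convolution over the node sequence followed by max pooling over positions. This is an illustration under stated assumptions: the kernel width, the absence of padding and activation, and the function name `conv1d_maxpool` are choices of this sketch, and a standard (non-graph) convolution is used here in place of whatever operator the application intends.

```python
def conv1d_maxpool(A, kernels, biases):
    """A: k rows of length d. kernels: f kernels, each w rows of length d.

    Each kernel slides along the k rows; max pooling keeps its largest response,
    so the result has exactly f entries regardless of k.
    """
    k = len(A)
    pooled = []
    for kernel, bias in zip(kernels, biases):
        w = len(kernel)
        if k < w:
            raise ValueError("sequence is shorter than the kernel width")
        responses = []
        for start in range(k - w + 1):
            total = bias
            for t in range(w):
                total += sum(x * wt for x, wt in zip(A[start + t], kernel[t]))
            responses.append(total)
        pooled.append(max(responses))
    return pooled


# k = 3 nodes, d = 2 features per node, f = 2 kernels of width 2 (made-up numbers).
A = [[1, 0], [0, 1], [1, 1]]
kernels = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]
first_type = conv1d_maxpool(A, kernels, biases=[0.0, 0.0])
```

Because the pooling takes a maximum over all positions, articles with different numbers of nodes still yield a vector of the same length f.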
Recurrent neural network (GRU) processing: the fusion feature vectors A1, A2, A3, …, Ak are input to a bidirectional GRU (Bi-GRU) layer, where k is the number of steps, and the last hidden layers of the bidirectional GRU are spliced together to obtain the second-type feature vector. Specifically, after the fusion feature vectors A1, A2, A3, …, Ak are input to the first-layer GRU, A1', A2', A3', …, Ak' are obtained; after they pass through the second-layer GRU, Ak'', …, A3'', A2'', A1'' are obtained. A1' and A1'' are spliced, A2' and A2'' are spliced, A3' and A3'' are spliced, …, and Ak' and Ak'' are spliced, giving the second-type feature vector. For example, an article T has 3 nodes: an article title node, a picture node and a text node. Feature extraction on article T yields the node feature vector and dependency feature vector of the article title node, of the picture node, and of the text node. The node feature vectors and the dependency feature vectors are then spliced to obtain the fusion feature vector of the article title node, the fusion feature vector of the picture node and the fusion feature vector of the text node.
The fusion feature vectors of the article title node, the picture node and the text node are input into the two GRU layers to obtain a first-layer output and a second-layer output for each of the three nodes. The first-layer output and the second-layer output of the article title node are then spliced to obtain the second-type feature vector of the article title node; similarly, the second-type feature vector of the picture node and the second-type feature vector of the text node are obtained.
The layout score of the article to be evaluated is then determined according to the first-type feature vector and the second-type feature vector.
A recurrent neural network is a neural network that can learn the spatiotemporal relationship of sequence elements. The hidden layer of a recurrent neural network is computed not only from the content of the input layer but also from the result computed by the hidden layer at the previous step. For example, referring to Table 1, Table 1 is a schematic table of results obtained by feature extraction with a recurrent neural network.
Table 1: schematic table of results obtained by feature extraction with a recurrent neural network
Table 1 contains a sentence A and a sentence B. Sentence A is: "The teacher criticized ____." Sentence B is: "I was late for school yesterday, and the teacher criticized ____." If only sentence A is analyzed, it is impossible to determine whom the teacher criticized; performing recurrent neural network feature extraction on sentence B, however, makes it possible to obtain: "The teacher criticized me."
A bidirectional recurrent neural network considers not only the result computed for the preceding elements but also takes the result computed for the following elements as a reference. That is, in the present application, the second-type feature vector is obtained by splicing the output of the first layer with the output of the second layer: A1' and A1'' are spliced, A2' and A2'' are spliced, A3' and A3'' are spliced, …, and Ak' and Ak'' are spliced, giving the second-type feature vector.
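The bidirectional processing can be sketched with a small GRU cell written in plain Python. This is an illustration only: the class `GRUCell`, the random initialization, the hidden size and the update convention h = (1 − z)·h_prev + z·candidate are assumptions of this sketch (some libraries swap the roles of z and 1 − z), and no training is shown.

```python
import math
import random


def _matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class GRUCell:
    def __init__(self, input_dim, hidden_dim, rng):
        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
        self.hidden_dim = hidden_dim
        # Input weights W, recurrent weights U and biases b for the update gate "z",
        # the reset gate "r" and the candidate state "h".
        self.W = {g: mat(hidden_dim, input_dim) for g in "zrh"}
        self.U = {g: mat(hidden_dim, hidden_dim) for g in "zrh"}
        self.b = {g: [0.0] * hidden_dim for g in "zrh"}

    def step(self, x, h):
        def pre(gate, state):
            wx = _matvec(self.W[gate], x)
            uh = _matvec(self.U[gate], state)
            return [a + c + d for a, c, d in zip(wx, uh, self.b[gate])]
        z = [_sigmoid(v) for v in pre("z", h)]
        r = [_sigmoid(v) for v in pre("r", h)]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(v) for v in pre("h", reset_h)]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def run(cell, seq):
    """Feed the sequence through the cell and return the hidden state at each step."""
    h = [0.0] * cell.hidden_dim
    outputs = []
    for x in seq:
        h = cell.step(x, h)
        outputs.append(h)
    return outputs


def bi_gru(seq, forward_cell, backward_cell):
    """Splice Ai' (forward pass) with Ai'' (backward pass) for every node i."""
    fwd = run(forward_cell, seq)
    bwd = run(backward_cell, seq[::-1])[::-1]  # re-align backward outputs with node order
    return [f + b for f, b in zip(fwd, bwd)]


rng = random.Random(0)
fused = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]  # k = 3, m + n = 3
second_type = bi_gru(fused, GRUCell(3, 2, rng), GRUCell(3, 2, rng))
```

Each row of `second_type` has twice the hidden size, since it holds the forward and backward outputs for the same node side by side.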
According to the application, the GRU neural network is adopted, so that the sequence relation between the fusion feature vectors can be extracted, and the accuracy of article evaluation can be improved.
In step S523 and step S524, the first-type feature vector and the second-type feature vector are spliced to obtain a spliced feature vector, and the spliced feature vector is then fed into a fully-connected layer to output a layout score, where the layout score is a score category, and the score categories include: excellent layout, good layout, medium layout and poor layout.
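A minimal sketch of this final stage is given below, assuming both feature vectors have already been flattened into one-dimensional lists. The softmax normalization, the function name `classify` and the toy weights are assumptions of this sketch; the application only states that a fully-connected layer outputs the score category.

```python
import math

CATEGORIES = ["excellent layout", "good layout", "medium layout", "poor layout"]


def classify(first_vec, second_vec, weights, bias):
    """Splice the two vectors, apply one fully-connected layer, pick a category."""
    x = list(first_vec) + list(second_vec)
    logits = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]
    # Softmax (an assumption) turns the four logits into probabilities.
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return CATEGORIES[best], probs


# Toy weights: one row per category, one column per spliced feature.
weights = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
label, probs = classify([1.0, 0.0], [0.0, 2.0], weights, bias=[0.0] * 4)
```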
The detailed evaluation process of the article layout evaluation method in the present application will be described in detail with reference to fig. 4.
Fig. 4 is a schematic diagram of an application scenario of the article layout evaluation method according to the embodiment of the present application.
After an article to be evaluated is obtained, the article to be evaluated is converted into a picture to be evaluated, and the article to be evaluated is divided into a plurality of nodes, where the plurality of nodes include a title node and at least one of a picture node, a text node and a video node. Each node carries features of its own, and the node feature vector of a node can be obtained by performing feature extraction on the node.
In this embodiment, node division yields an article title node, a large title node, a subtitle node, a text node (the node corresponding to text 1 in fig. 4), a picture node, and a text node (the node corresponding to text 2 in fig. 4). The node feature vector is described by taking the text node corresponding to text 2 as an example: text 2 is at the end of the article, so the node feature vector of text 2 indicates the position information of text 2 at the end of the article and the character count information of text 2.
For the dependency feature vector of each node, feature extraction is performed on the picture to be evaluated using a neural network to obtain the dependency feature vector of the node. When a neural network is used to process the picture to be evaluated, a graph convolutional network (GCN) may be used, where the GCN may be set to 2 layers (or to another number of layers, which is not limited in the present application).
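For illustration, a 2-layer GCN forward pass is sketched below under the assumption that the dependency relationships between nodes are encoded as an adjacency matrix; the text itself only states that a GCN is applied to the picture to be evaluated, so this encoding, the symmetric normalization with self-loops, the ReLU activation and all function names are assumptions of this sketch.

```python
import math


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def normalize_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def gcn_two_layer(adj, X, W0, W1):
    """Layer 1: ReLU(A_norm X W0). Layer 2: A_norm H1 W1 (one m-dim row per node)."""
    a_norm = normalize_adjacency(adj)
    h1 = [[max(0.0, v) for v in row] for row in matmul(matmul(a_norm, X), W0)]
    return matmul(matmul(a_norm, h1), W1)


# Three nodes in a chain (0-1, 1-2); one-hot input features; made-up weights.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
W0 = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6]]
W1 = [[0.7, 0.1], [-0.3, 0.9]]
dep_vecs = gcn_two_layer(adj, X, W0, W1)  # 3 rows, m = 2 columns
```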
In this embodiment, the dependency feature vector is described by taking the text node corresponding to text 1 in fig. 4 as an example. As shown in fig. 4, text 1, picture a and text 2 are in a front, middle and rear linear relationship; the subtitle and text 1 are in a parent-child inclusion relationship; the large title and text 1 are in a parent-child inclusion relationship; and the article title and text 1 are in a parent-child inclusion relationship. Dependency feature extraction on text 1 yields the dependency feature vector corresponding to text 1, which indicates: the linear relationship between text 1, picture a and text 2; the parent-child inclusion relationship between the subtitle and text 1; the parent-child inclusion relationship between the large title and text 1; and the parent-child inclusion relationship between the article title and text 1.
The dependency feature vector and the node feature vector of each node are spliced to obtain the fusion feature vector of each node, and feature extraction is performed on the fusion feature vectors of all nodes through the convolutional neural network (CNN) model to obtain the first-type feature vector. The extraction process of the CNN model is as follows:
1-dimensional graph convolution is performed on the fusion feature vectors A1, A2, A3, …, Ak, where the number of convolution kernels is f. The convolution is followed by a max pooling layer, which converts the k × (m+n) feature matrix into a 1 × f feature matrix, giving the first-type feature vector. In this embodiment, A1 denotes the article title node, A2 denotes the large title node, A3 denotes the subtitle node, A4 denotes the first text node, A5 denotes the picture node, and A6 denotes the second text node.
Feature extraction is performed on the fusion feature vectors of all nodes through the GRU recurrent neural network model to obtain the second-type feature vector. The extraction process of the GRU model is as follows:
the fusion feature vectors A1, A2, A3, …, Ak are input to the bidirectional GRU (Bi-GRU) layer, where k is the number of steps, and the last hidden layers of the bidirectional GRU are spliced together to obtain the second-type feature vector. Specifically, after the fusion feature vectors A1, A2, A3, …, Ak are input to the first-layer GRU, A1', A2', A3', …, Ak' are obtained; after they pass through the second-layer GRU, Ak'', …, A3'', A2'', A1'' are obtained. A1' and A1'' are spliced, A2' and A2'' are spliced, A3' and A3'' are spliced, …, and Ak' and Ak'' are spliced, giving the second-type feature vector.
Then, the first type of feature vectors and the second type of feature vectors are subjected to splicing processing to obtain spliced feature vectors, and the spliced feature vectors pass through a full-connection layer to output layout scoring categories of articles to be evaluated, wherein the layout scoring categories comprise: excellent layout, good layout, medium layout and poor layout.
The process of extracting the dependency feature vector of each of the plurality of nodes in the present application is described in detail below with reference to fig. 5.
As shown in fig. 5, the step of "extracting the dependency feature vector of each of the plurality of nodes" includes step S310, step S320, and step S330. Step S310, step S320 and step S330 are described in detail below. It should be understood that the step of extracting the dependency feature vector of each of the plurality of nodes includes, but is not limited to, step S310, step S320, and step S330.
Step S310: in the case that a current node in the plurality of nodes is a picture node, extracting the picture annotation feature sub-vector of the current node from the node feature vector of the current node;
step S320: determining the dependency relationship between the current node and the other nodes in the plurality of nodes according to the picture annotation feature sub-vector of the current node and the node feature vectors of the other nodes in the plurality of nodes;
Step S330: determining the dependency feature vector of the current node according to the dependency relationship between the current node and the other nodes in the plurality of nodes.
Specifically, in the case that the current node in the plurality of nodes is a picture node, the relationship between the picture node and the other nodes needs to be judged specifically. The picture annotation feature sub-vector is extracted from the node feature vector of the picture node; the dependency relationship between the picture node and the other nodes is then determined according to the picture annotation feature sub-vector of the picture node and the node feature vectors of the other nodes; and the dependency feature vector of the current node is then determined according to that dependency relationship. The picture annotation feature sub-vector is used to indicate content information related to the picture. When the picture node has no annotation information, the picture annotation feature sub-vector is 0, and the dependency relationship between the picture node and the other nodes is by default a linear relationship.
The process of determining the dependency relationship between the current node and the plurality of nodes in the present application will be described in detail with reference to fig. 6.
As shown in fig. 6, step S320 includes step S321, step S322, step S323, and step S324. The following describes step S321 to step S324 in detail. It should be understood that step S320 includes, but is not limited to, step S321 to step S324.
Step S321: performing semantic feature extraction on the picture annotation feature sub-vector to obtain a picture annotation field;
step S322: performing feature extraction on the picture annotation field and the node feature vectors of the other nodes in the plurality of nodes to obtain association values between the picture annotation feature sub-vector and the node feature vectors of the other nodes in the plurality of nodes;
step S323: if the association value between the picture annotation feature sub-vector of the current node and the node feature vector of another node in the plurality of nodes exceeds a preset association threshold, determining that the dependency relationship between the current node and that node is a supplementary relationship;
step S324: if the association value between the picture annotation feature sub-vector of the current node and the node feature vector of another node in the plurality of nodes does not exceed the preset association threshold, determining that the dependency relationship between the current node and that node is a linear relationship.
Specifically, in the case where the current node among the plurality of nodes is a picture node, the association value between the picture node and the other nodes needs to be calculated. The picture annotation field may be obtained by performing semantic feature extraction on the picture annotation feature sub-vector with a convolutional neural network (CNN), or by recognizing the picture node with optical character recognition (OCR); the present application is not particularly limited in this regard. The association value is then calculated from the picture annotation feature sub-vector of the picture node and the node feature vector of a second node, where the second node is any node other than the current node in the plurality of nodes. When the association value exceeds the preset association threshold, the current node and the second node are judged to be strongly associated and are in a supplementary relationship; otherwise, the current node and the second node are by default in a linear relationship.
For example, a subtitle node f1, a subtitle node f2 and a picture node g exist. Semantic feature extraction or OCR recognition is performed on the picture node g to obtain its annotation information, "relationship diagram of A and B". This annotation information is then used as a feature point to calculate association values with the other nodes in the article, giving association values that represent the association between the picture node g and the other nodes. The subtitle node f1 is found to contain the content "relationship of A and B", and the subtitle node f2 the content "relationship of A and C"; in this case, the picture node g and the subtitle node f1 are judged to be in a supplementary relationship. In the present application, the association value may be obtained by feature extraction through a neural network, or in other ways, which is not particularly limited in the present application. For example, in an embodiment, a CNN model performs a convolution operation on the annotation information of the picture node and the node feature vectors of the other nodes to obtain a plurality of association values; the association values greater than the preset association threshold are then selected, and the relationship between the corresponding nodes and the picture node is determined.
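The thresholding in steps S323 and S324 can be sketched as follows. Since the application leaves the association measure open, a simple word-overlap (Jaccard) score stands in for the neural network here; that measure, the threshold value 0.7 and the function names are all assumptions of this sketch.

```python
def tokens(text):
    return set(text.lower().split())


def association(annotation, content):
    """Stand-in association value: word overlap between annotation and node content."""
    a, b = tokens(annotation), tokens(content)
    if not a or not b:
        return 0.0  # no annotation (or no content) means no measurable association
    return len(a & b) / len(a | b)


def picture_relations(annotation, other_nodes, threshold):
    """Supplementary if the association value exceeds the threshold, else linear."""
    return {
        name: "supplementary" if association(annotation, content) > threshold else "linear"
        for name, content in other_nodes.items()
    }


others = {"f1": "relationship of A and B", "f2": "relationship of A and C"}
result = picture_relations("relationship diagram of A and B", others, threshold=0.7)
no_annotation = picture_relations("", others, threshold=0.7)
```

With an empty annotation every association value is 0, so every relationship falls back to linear, matching the default described for picture nodes without annotation information.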
The article layout evaluation device in the present application will be described in detail with reference to fig. 7.
As shown in fig. 7, in a second aspect, some embodiments of the present application further provide an article layout evaluation apparatus, including: the article acquisition module 100, the node division module 200, the node feature extraction module 300, the dependency feature extraction module 400, and the scoring module 500. Wherein:
the article acquisition module 100 is used for acquiring articles to be evaluated;
the node dividing module 200 is configured to divide nodes of an article to be evaluated to obtain a plurality of nodes, where the plurality of nodes include a title node and an information node, and the information node includes at least one of a picture node, a video node and a text node;
the node feature extraction module 300 is configured to extract a node feature vector of each node in the plurality of nodes, where the node feature vector is used to indicate a feature carried by each node;
the dependency feature extraction module 400 is configured to extract a dependency feature vector of each node in the plurality of nodes, where the dependency feature vector of each node is used to indicate a dependency relationship between each node and other nodes in the plurality of nodes;
the scoring module 500 is configured to determine a layout score of the article to be evaluated according to the node feature vector of each node in the plurality of nodes and the dependency feature vector of each node.
The embodiment of the application provides an article layout evaluation device. When an article is evaluated, the article to be evaluated is divided into a plurality of nodes, where the plurality of nodes include a title node and at least one of a picture node, a video node and a text node. A node feature vector and a dependency feature vector are then extracted for each of the plurality of nodes, where the dependency feature vector is used to indicate the dependency relationship between the node and the other nodes in the plurality of nodes. The layout score of the article to be evaluated is then determined according to the node feature vector and the dependency feature vector of each node. Because both the node feature vector and the dependency feature vector of each node are considered, the article is evaluated from two different angles, which improves the accuracy of article evaluation.
It should be noted that, in the article layout evaluation device of the embodiment of the present application, the evaluation steps are similar to those of the foregoing article layout evaluation method, and the specific evaluation process refers to the foregoing article layout evaluation method, which is not described herein again.
The embodiment of the disclosure also provides an electronic device, including:
at least one memory;
at least one processor;
at least one program;
the programs are stored in the memory, and the processor executes at least one program to implement the article layout evaluation method described above. The electronic device can be any intelligent terminal including a mobile phone, a tablet personal computer, a personal digital assistant (Personal Digital Assistant, PDA), a vehicle-mounted computer and the like.
The electronic device of the present application will be described in detail with reference to fig. 8.
As shown in fig. 8, fig. 8 illustrates a hardware structure of an electronic device of another embodiment, the electronic device including:
the processor 600 may be implemented by a general-purpose central processing unit (Central Processing Unit, CPU), a microprocessor, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, etc., for executing related programs to implement the technical solutions provided by the embodiments of the present disclosure;
the memory 700 may be implemented in the form of a read-only memory (Read Only Memory, ROM), a static storage device, a dynamic storage device, or a random access memory (Random Access Memory, RAM). The memory 700 may store an operating system and other application programs. When the technical solutions provided in the embodiments of the present disclosure are implemented by software or firmware, the relevant program code is stored in the memory 700 and is invoked by the processor 600 to perform the article layout evaluation method of the embodiments of the present disclosure;
An input/output interface 800 for implementing information input and output;
the communication interface 900 is configured to implement communication interaction between the present device and other devices, and may implement communication in a wired manner (e.g. USB, network cable, etc.), or may implement communication in a wireless manner (e.g. mobile network, WIFI, bluetooth, etc.); and
bus 1000 transfers information between the various components of the device (e.g., processor 600, memory 700, input/output interface 800, and communication interface 900);
wherein the processor 600, the memory 700, the input/output interface 800 and the communication interface 900 are communicatively coupled to each other within the device via the bus 1000.
The disclosed embodiments also provide a storage medium that is a computer-readable storage medium storing computer-executable instructions for causing a computer to perform the above-described article layout evaluation method.
The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The embodiments described herein are intended to explain the technical solutions of the embodiments of the present disclosure more clearly and do not limit those technical solutions. Those skilled in the art will understand that, as technology evolves and new application scenarios appear, the technical solutions provided by the embodiments of the present disclosure are equally applicable to similar technical problems.
It will be appreciated by those skilled in the art that the solutions shown in fig. 1-4 and fig. 6 and 7 do not limit the embodiments of the present disclosure; an embodiment may include more or fewer steps than shown, may combine certain steps, or may use different steps.
The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like in the description of the application and in the above figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present application, "at least one (item)" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist. For example, "A and/or B" may represent: only A is present, only B is present, or both A and B are present, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following" or a similar expression means any combination of these items, including any combination of a single item or plural items. For example, at least one of a, b or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, c may each be single or plural.
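The reading of "at least one of a, b or c" given above corresponds to the non-empty combinations of the listed items. A short Python sketch enumerates them; the helper name `at_least_one_of` is hypothetical and is used only for illustration.

```python
from itertools import combinations


def at_least_one_of(items):
    """Return every non-empty combination of items, smallest combinations first."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


for combo in at_least_one_of(["a", "b", "c"]):
    print(" and ".join(combo))
```

For three items this yields seven combinations, in the same order as the list in the paragraph above.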
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied in essence or a part contributing to the prior art or all or part of the technical solution in the form of a software product stored in a storage medium, including multiple instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or other various media capable of storing a program.
Preferred embodiments of the present disclosure are described above with reference to the accompanying drawings; this description does not limit the scope of the claims of the embodiments of the present disclosure. Any modifications, equivalent substitutions and improvements made by those skilled in the art without departing from the scope and spirit of the embodiments of the present disclosure shall fall within the scope of the claims of the embodiments of the present disclosure.

Claims (8)

1. An article layout evaluation method, comprising:
acquiring an article to be evaluated;
performing node division on the article to be evaluated to obtain a plurality of nodes, wherein the nodes comprise a title node and an information node, and the information node comprises at least one of a picture node, a video node and a text node;
extracting a node feature vector of each node in the plurality of nodes, wherein the node feature vector is used for indicating the features carried by each node;
extracting a dependency feature vector of each node in the plurality of nodes, wherein the dependency feature vector of each node is used for indicating the dependency relationship between each node and other nodes in the plurality of nodes;
determining a layout score of the article to be evaluated according to the node feature vector of each node in the plurality of nodes and the dependency feature vector of each node; wherein,
the determining the layout score of the article to be evaluated according to the node feature vector of each node in the plurality of nodes and the dependency feature vector of each node comprises:
performing splicing processing on the node feature vector of each node and the dependency feature vector of each node to obtain a fusion feature vector of each node;
determining the layout score of the article to be evaluated according to the fusion feature vector of each node in the plurality of nodes;
wherein the determining the layout score of the article to be evaluated according to the fusion feature vector of each node in the plurality of nodes includes:
performing graph convolution and pooling processing on the fusion feature vector of each node by using a first neural network model to obtain a first-type feature vector;
performing loop iteration processing on the fusion feature vector of each node by using a second neural network model to obtain a second-type feature vector, wherein the first neural network model and the second neural network model belong to different types of neural network models;
performing splicing processing on the first-type feature vector and the second-type feature vector to obtain a spliced feature vector;
and determining the layout score of the article to be evaluated according to the spliced feature vector.
2. The method of claim 1, wherein the extracting the dependency feature vector for each of the plurality of nodes comprises:
in the case that a current node in the plurality of nodes is the picture node, extracting a picture annotation feature sub-vector of the current node from the node feature vector of the current node;
determining the dependency relationship between the current node and other nodes in the plurality of nodes according to the picture annotation feature sub-vector of the current node and the node feature vectors of the other nodes in the plurality of nodes;
and determining the dependency feature vector of the current node according to the dependency relationship between the current node and other nodes in the plurality of nodes.
3. The method of claim 2, wherein the determining the dependency relationship between the current node and the other nodes of the plurality of nodes based on the picture annotation feature sub-vector of the current node and the node feature vectors of the other nodes of the plurality of nodes comprises:
extracting semantic features of the picture annotation feature sub-vector to obtain a picture annotation field;
performing feature extraction on the picture annotation field and the node feature vectors of other nodes in the plurality of nodes to obtain association values between the picture annotation feature sub-vector and the node feature vectors of the other nodes in the plurality of nodes;
if the association value between the picture annotation feature sub-vector of the current node and the node feature vectors of other nodes in the plurality of nodes exceeds a preset association threshold, determining that the dependency relationship between the current node and the other nodes in the plurality of nodes is a complementary relationship;
and if the association value between the picture annotation feature sub-vector of the current node and the node feature vectors of other nodes in the plurality of nodes does not exceed the preset association threshold, determining that the dependency relationship between the current node and the other nodes in the plurality of nodes is a linear relationship.
4. A method according to any one of claims 1 to 3, wherein the plurality of nodes comprises a first node, the first node being any one of the plurality of nodes;
in the case that the first node is the text node, the node feature vector of the first node includes: a character number feature sub-vector and a character position feature sub-vector;
in the case that the first node is the picture node, the node feature vector of the first node includes: a picture annotation feature sub-vector, a picture position feature sub-vector, a picture pixel feature sub-vector and a picture size feature sub-vector;
in the case that the first node is the video node, the node feature vector of the first node includes: a video length feature sub-vector and a video position feature sub-vector;
in the case that the first node is the title node, the node feature vector of the first node includes: a title position feature sub-vector.
5. A method according to any one of claims 1 to 3, wherein the title node comprises: article title node, large title node, and subtitle node.
6. An article layout evaluation device, comprising:
the article acquisition module is used for acquiring articles to be evaluated;
the node dividing module is used for carrying out node division on the article to be evaluated to obtain a plurality of nodes, wherein the nodes comprise a title node and an information node, and the information node comprises at least one of a picture node, a video node and a text node;
the node feature extraction module is used for extracting a node feature vector of each node in the plurality of nodes, wherein the node feature vector is used for indicating the features carried by each node;
a dependency feature extraction module, configured to extract a dependency feature vector of each node in the plurality of nodes, where the dependency feature vector of each node is used to indicate a dependency relationship between each node and other nodes in the plurality of nodes;
the scoring module is used for determining the layout score of the article to be evaluated according to the node feature vector of each node and the dependency feature vector of each node in the plurality of nodes; wherein,
the scoring module being configured to determine the layout score of the article to be evaluated according to the node feature vector of each node in the plurality of nodes and the dependency feature vector of each node comprises:
performing splicing processing on the node feature vector of each node and the dependency feature vector of each node to obtain a fusion feature vector of each node;
determining the layout score of the article to be evaluated according to the fusion feature vector of each node in the plurality of nodes;
wherein the determining the layout score of the article to be evaluated according to the fusion feature vector of each node in the plurality of nodes includes:
performing graph convolution and pooling processing on the fusion feature vector of each node by using a first neural network model to obtain a first-type feature vector;
performing loop iteration processing on the fusion feature vector of each node by using a second neural network model to obtain a second-type feature vector, wherein the first neural network model and the second neural network model belong to different types of neural network models;
performing splicing processing on the first-type feature vector and the second-type feature vector to obtain a spliced feature vector;
and determining the layout score of the article to be evaluated according to the spliced feature vector.
7. An electronic device, comprising:
at least one memory;
at least one processor;
at least one program;
the program is stored in the memory, and the processor executes the at least one program to implement:
the method of any one of claims 1 to 5.
8. A storage medium that is a computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions for causing a computer to perform:
the method of any one of claims 1 to 5.
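
For orientation only, and without limiting the claims, the scoring steps recited in claim 1 can be sketched in plain Python. Everything concrete here is an assumption made for illustration: the function names, the mean-aggregation form of the graph convolution, mean pooling, the simple tanh recurrence over nodes in reading order, and the sigmoid output head. The claims only require two different types of neural network model whose outputs are spliced before scoring.

```python
from math import exp, tanh


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def fuse(node_vecs, dep_vecs):
    """Splice each node feature vector with its dependency feature vector."""
    return [n + d for n, d in zip(node_vecs, dep_vecs)]


def graph_conv_pool(fused, adjacency, weight):
    """One mean-aggregation graph convolution with ReLU, then mean pooling over nodes."""
    hidden = []
    for i, row in enumerate(adjacency):
        neighbours = [j for j, edge in enumerate(row) if edge or j == i]
        mean = [sum(fused[j][k] for j in neighbours) / len(neighbours)
                for k in range(len(fused[i]))]
        hidden.append([max(0.0, h) for h in matvec(weight, mean)])
    return [sum(col) / len(hidden) for col in zip(*hidden)]


def recurrent(fused, w_in, w_rec):
    """Simple tanh recurrence over the nodes in order; returns the final state."""
    state = [0.0] * len(w_rec)
    for x in fused:
        state = [tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_rec, state))]
    return state


def layout_score(node_vecs, dep_vecs, adjacency, weight, w_in, w_rec, w_out):
    fused = fuse(node_vecs, dep_vecs)                   # fusion feature vectors
    first = graph_conv_pool(fused, adjacency, weight)   # first-type feature vector
    second = recurrent(fused, w_in, w_rec)              # second-type feature vector
    spliced = first + second                            # spliced feature vector
    logit = sum(w * x for w, x in zip(w_out, spliced))
    return 1.0 / (1.0 + exp(-logit))  # sigmoid head is an assumption, not from the claims
```

In a trained system the weight matrices would be learned; here they are plain arguments so that the data flow from fusion to score stays visible.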
CN202111044587.2A 2021-09-07 2021-09-07 Article layout evaluation method, apparatus, electronic device and storage medium Active CN113743050B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111044587.2A CN113743050B (en) 2021-09-07 2021-09-07 Article layout evaluation method, apparatus, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111044587.2A CN113743050B (en) 2021-09-07 2021-09-07 Article layout evaluation method, apparatus, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN113743050A CN113743050A (en) 2021-12-03
CN113743050B true CN113743050B (en) 2023-11-24

Family

ID=78736599

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111044587.2A Active CN113743050B (en) 2021-09-07 2021-09-07 Article layout evaluation method, apparatus, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN113743050B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08255063A (en) * 1995-03-16 1996-10-01 Sony Corp Layout evaluating device, layout device, and display device
US6542635B1 (en) * 1999-09-08 2003-04-01 Lucent Technologies Inc. Method for document comparison and classification using document image layout
CN108595407A (en) * 2018-03-06 2018-09-28 首都师范大学 Evaluation method based on the argumentative writing structure of an article and device
CN109933802A (en) * 2019-03-25 2019-06-25 腾讯科技(深圳)有限公司 Picture and text matching process, device and storage medium
CN110427609A (en) * 2019-06-25 2019-11-08 首都师范大学 One kind writing people's composition structure of an article reasonability method for automatically evaluating
CN110598095A (en) * 2019-08-27 2019-12-20 腾讯科技(深圳)有限公司 Method, device and storage medium for identifying article containing designated information
CN111339765A (en) * 2020-02-18 2020-06-26 腾讯科技(深圳)有限公司 Text quality evaluation method, text recommendation method and device, medium and equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11151130B2 (en) * 2017-02-04 2021-10-19 Tata Consultancy Services Limited Systems and methods for assessing quality of input text using recurrent neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Deepfeedback Network for recommendation; Ruobing Xie et al.; "Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence"; 2519-2525 *

Also Published As

Publication number Publication date
CN113743050A (en) 2021-12-03

Similar Documents

Publication Publication Date Title
CN111488931B (en) Article quality evaluation method, article recommendation method and corresponding devices
EP3926531B1 (en) Method and system for visio-linguistic understanding using contextual language model reasoners
CN110851641B (en) Cross-modal retrieval method and device and readable storage medium
CN112182166A (en) Text matching method and device, electronic equipment and storage medium
CN116402063B (en) Multi-modal irony recognition method, apparatus, device and storage medium
CN115115913A (en) Data processing method and device, electronic equipment and storage medium
CN110580288A (en) text classification method and device based on artificial intelligence
CN114519395B (en) Model training method and device, text abstract generating method and device and equipment
CN114358007A (en) Multi-label identification method and device, electronic equipment and storage medium
CN113705313A (en) Text recognition method, device, equipment and medium
CN114897060B (en) Training method and device for sample classification model, and sample classification method and device
CN113515669A (en) Data processing method based on artificial intelligence and related equipment
CN113392179A (en) Text labeling method and device, electronic equipment and storage medium
CN114419515A (en) Video processing method, machine learning model training method, related device and equipment
CN115131811A (en) Target recognition and model training method, device, equipment and storage medium
CN111222000B (en) Image classification method and system based on graph convolution neural network
CN113743050B (en) Article layout evaluation method, apparatus, electronic device and storage medium
CN114491076B (en) Data enhancement method, device, equipment and medium based on domain knowledge graph
CN116955707A (en) Content tag determination method, device, equipment, medium and program product
CN114998041A (en) Method and device for training claim settlement prediction model, electronic equipment and storage medium
CN111445545B (en) Text transfer mapping method and device, storage medium and electronic equipment
CN115186133A (en) Video generation method and device, electronic equipment and medium
CN115204300A (en) Data processing method, device and storage medium for text and table semantic interaction
CN114090778A (en) Retrieval method and device based on knowledge anchor point, electronic equipment and storage medium
CN114627282A (en) Target detection model establishing method, target detection model application method, target detection model establishing device, target detection model application device and target detection model establishing medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant