CN112036350B - User investigation method and system based on government affair cloud

User investigation method and system based on government affair cloud

Info

Publication number
CN112036350B
CN112036350B (application CN202010930044.XA)
Authority
CN
China
Prior art keywords
information
interviewee
voice
emotion
parameter value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010930044.XA
Other languages
Chinese (zh)
Other versions
CN112036350A (en)
Inventor
李旺
冯正乾
刘一鸣
解宏泽
丁西凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Shanke Digital Economy Research Institute Co ltd
Original Assignee
Shandong Shanke Digital Economy Research Institute Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Shanke Digital Economy Research Institute Co ltd filed Critical Shandong Shanke Digital Economy Research Institute Co ltd
Priority to CN202010930044.XA
Publication of CN112036350A
Application granted
Publication of CN112036350B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition
    • G06V40/176 Dynamic expression
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/177 Editing, e.g. inserting or deleting of tables; using ruled lines
    • G06F40/18 Editing, e.g. inserting or deleting of tables; using ruled lines of spreadsheets
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state

Abstract

The invention discloses a user investigation method and system based on a government affair cloud. The method acquires the information of a person to be interviewed and, according to the household registration information and similar data it contains, generates corresponding inquiry content; it acquires voice and video information from the interviewee's terminal and compares the acquired voice response information with pronunciation sample information of the relevant region to obtain a voice similarity parameter; it analyzes the video information to obtain an emotion classification of the interviewee and an emotion parameter value; and it maps the question-and-answer content, the voice similarity parameter, and the emotion parameter value to form a question-and-answer table, which is sent to the government affair cloud. Because the method and system discriminate the credibility of the data source in the early stage of a survey, the credibility of user data collected during the survey is improved and the sample noise left for post-processing is reduced.

Description

User investigation method and system based on government affair cloud
Technical Field
The invention belongs to the field of government affair clouds, and particularly relates to a user investigation method and system based on the government affair cloud.
Background
In recent years, cloud computing, as a new generation of information technology, has seen its research and application promoted in countries around the world, with many government departments and multinational IT enterprises doing extensive work on research, development, and application. Governments worldwide attach great importance to the development of cloud computing and regard it as a strategic emerging industry to be developed as a priority. The rapid development of cloud computing, in particular of private, public, and hybrid clouds, has produced a variety of industrial models suited to different application scenarios and offers a new operating model for e-government. With its scalability, high reliability, speed and flexibility, autonomous and efficient management, green computing, rapid deployment, and strong disaster-recovery mechanisms, cloud computing can effectively solve many of the problems facing current e-government; at the same time, almost all research shows that the core outstanding issues of cloud computing are security, privacy, and reliability.
The government affair cloud is a comprehensive service platform that uses cloud computing technology together with existing machine rooms, computing, storage, network, security, application-support, and information resources, and exploits the virtualization, high reliability, high generality, high scalability, and rapid, on-demand, elastic service characteristics of cloud computing to provide the government sector with infrastructure, supporting software, application systems, information resources, operation guarantees, and information security. The government affair cloud is the main business scenario of current cloud computing and, from the network perspective, comprises two parts: the national e-government extranet and the Internet.
The use of government affair cloud systems has improved information-processing and big-data fusion capabilities. What remains is the problem of how to match the processing capacity of the cloud within the existing government affairs system so as to improve processing efficiency and strengthen the credibility of the data being processed.
Disclosure of Invention
To address these problems, the invention provides a user survey method and system based on the government affair cloud, which generate inquiry information and apply emotion discrimination and classification so as to improve the credibility of user data while a survey is executed and to reduce the sample noise of post-processing.
The invention provides a user investigation method and system based on a government affair cloud.
A user investigation method based on the government affair cloud comprises the following steps: acquiring the information of a person to be interviewed, wherein the information includes the interviewee's household registration information; generating corresponding inquiry content according to the registered address information and/or the habitual residence address information in the household registration information;
acquiring voice information and video information of the interviewee, and comparing the acquired voice response information with pronunciation sample information of the corresponding region to obtain a voice similarity parameter;
analyzing the video information to obtain the emotion classification of the interviewee and an emotion parameter value;
and mapping the question-and-answer content, the voice similarity parameter, and the emotion parameter value to form a question-and-answer table, and sending the question-and-answer table to a government affair cloud, wherein the question-and-answer content comprises the inquiry information and the response content.
Further, text information can be generated from the question-and-answer content through voice recognition to obtain the emotion parameter value; a voice emotion parameter value can further be obtained from the recognized response text together with the speech rate and intonation in the voice recognition, and the emotion parameter value obtained from the video information is corrected with this voice emotion parameter value.
Further, a credibility value of the interviewee's survey result is obtained by weighted adjustment according to the response content, the voice similarity parameter, and the emotion parameter value.
Further, generating the corresponding inquiry information according to the registered address and habitual residence information further comprises weighting and correcting the address information according to the base-station access information of the user's mobile terminal before generating the corresponding inquiry information.
Further, the emotion parameter value of the video information is obtained from the facial features of the user.
Further, the emotion parameter value of the video is obtained from mouth movement information and eye-region activity information in the facial features of the user.
Further, acquiring the eye or mouth movement information specifically comprises: comparing the pixel intensities in an image acquired at one moment with the corresponding pixel intensities in a second image acquired at a later time, counting the number of pixels that have changed significantly, and, if the count is above a threshold, determining that the mouth or eyes are in motion.
Further, the emotion classification of the interviewee's facial features is carried out with a clustering algorithm or a neural-network algorithm.
Further, according to the video information, a clustering algorithm maps the identified speaker's face to a cluster so as to infer the interviewee's characteristics; the category attributes of the interviewee are obtained and matched against the interviewee's extracted household registration information, the user information is verified, and the voice parameter value is corrected.
Acquiring the information of the person to be interviewed, wherein the information includes the user's household registration information; when the corresponding inquiry information is generated according to the registered address information and/or the habitual residence address information, a corresponding voice recognition model can be obtained; the voice recognition model can also be selected according to the category information in the video information, for example recognition models for the elderly and for the middle-aged.
Further, a map of adjacent regions at the county level is pre-established, and when this adjacency map is used to select response information generated for an adjacent county, historical administrative divisions and/or geographical separation factors are taken into account.
Further, a government affairs system based on cloud government-affair surveys comprises a server and a user terminal, wherein the server comprises a memory and a processor, the memory stores a computer program, and the processor executes the computer program to implement the cloud government-affair survey method described above.
The invention can achieve at least one or more of the following beneficial effects: adaptive voice questions and answers are provided by identifying the address information in the household registration information held by the government affairs system; attribute-category information is obtained from the video; the credibility of the voice question-and-answer content is corrected by introducing emotion-classification information; and the emotion information in the video is in turn corrected with information such as the speech rate in the voice, thereby further improving the credibility of the user's survey information and reducing the noise of post-processing.
Drawings
The features and advantages of the present invention will be more clearly understood by reference to the accompanying drawings, which are illustrative and not to be construed as limiting the invention in any way.
Fig. 1 shows a user survey system based on a government affair cloud.
Detailed Description
These and other features and characteristics of the present invention, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, will be better understood upon consideration of the following description and the accompanying drawings, which form a part of this specification. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended as a definition of the limits of the invention. It will be understood that the figures are not drawn to scale. Various block diagrams are used in the present invention to illustrate various variations of embodiments according to the present disclosure.
Example 1
As shown in fig. 1, the present invention provides a user survey system based on the government affair cloud, composed of a plurality of interviewee terminals and a server, wherein an interviewee can be understood as a user with an intelligent terminal. The system and method provide a service flow and management means for effective and secure access to resources, thereby improving the credibility of users.
Current online inquiries use a computer terminal as the platform and set up an inquiry unit on a web page, realizing interaction with citizens on the network platform. This mode of interaction, however, cannot monitor the dynamics of citizens' online inquiry information in real time, nor review and reply to online inquiries promptly; the credibility of the information is low, the efficiency of online inquiry and government-affair surveys is limited, surveys run through interactive web pages are easily faked, and the credibility of the samples must be corrected and de-noised at a later stage.
Given that the current government cloud system holds a large amount of government-related personal information, and given the practical needs of online inquiry and the like, return visits and sample spot checks must be performed for government-affair handling, government policies, policy implementation effects, and so on. As mentioned above, the traditional telephone survey merely conducts surveys and questionnaires over telephone calls; because security verification of the visited third party is lacking, the burden of later de-noising is increased. Therefore, building on the rich data of the government affair cloud system and the wide adoption of intelligent user terminals, the present survey system and method discriminate the credibility of the data source in the early stage of a survey, reducing the pressure on back-end government processing and improving the efficiency of government-affair handling.
The method and system acquire the personal information of users randomly drawn from the government affair cloud system and select suitable voice inquiry verification according to each user's habitual residence and household registration information, the regional information of the registered and habitual addresses, and the different pronunciations associated with those geographic locations; optionally, the inquiry information matched with the household registration information among the verification information is selected according to the location information in the household registration. The household registration information can cover both the registered address and the habitual residence address, and the two addresses may be the same or differ. Optionally, the inquiry information may target cases where, in some regions, pronunciations such as Sh/S, n/l/r, or front and rear nasal sounds are not sufficiently distinguished. Accordingly, labels are applied in the information database of registered addresses and habitual residences; when the user's information is obtained, the user's voice response is acquired and recognized, the recognized information is compared with the pronunciation sample information of the region, and weighting is performed according to the habitual residence and/or household registration information to obtain the credibility value of the user's survey. Generation of such pronunciation-sensitive inquiry information can be avoided according to the labels, or the above features can be extracted after recognition of the responses so as to improve recognition accuracy.
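A minimal sketch of this voice-similarity step follows, assuming mean MFCC vectors as the pronunciation profile and cosine similarity as the comparison; the patent does not fix an acoustic feature or distance measure, and the equal weighting of the two regional comparisons is likewise an assumption.

```python
import numpy as np
import librosa

def mfcc_profile(wav_path, sr=16000, n_mfcc=13):
    # Mean MFCC vector as a crude regional pronunciation profile (assumption).
    audio, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

def voice_similarity(response_wav, region_sample_wav):
    # Cosine similarity between response and regional sample, mapped to [0, 1].
    a = mfcc_profile(response_wav)
    b = mfcc_profile(region_sample_wav)
    cos = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return (cos + 1.0) / 2.0

def weighted_similarity(sim_registered, sim_residence, same_address):
    # If registered and habitual addresses agree, one comparison suffices;
    # otherwise weight both regional comparisons (0.5/0.5 is an assumption).
    if same_address:
        return sim_registered
    return 0.5 * sim_registered + 0.5 * sim_residence
```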
In existing government affair cloud systems, the system can optionally extract the geographic base-station access information of the user's mobile terminal and use this information, together with the access frequency of mobile base-station sites, as a correction towards an adjacent region. Optionally, some dialects at provincial borders are close to each other; when a survey is run at the province, city, or county level, the current administrative divisions may group neighbouring counties into one region even though, because of geographic barriers or historical changes of administrative divisions, the actual dialect pronunciations are far apart. The Mandarin recognition model, the manner of posing questions, and the likely manner of response therefore require adaptive adjustment, and choosing the pronunciation model of an adjacent province or county may be more appropriate. To this end, the base-station access information of the mobile terminal is introduced and a correction is applied when surveying users in a large regional class: for example, when surveying users in a county, the dialect model of a neighbouring county may be used to generate the inquiry information. The survey system may pre-establish a county-level adjacency map and, when selecting a model from this map, give priority to administrative-division and geographic-barrier factors or to the base-station access-frequency coefficient. That is, where exchanges with a neighbouring county are more frequent, the influence of its dialect is stronger, and correspondingly the mobile terminal's signal is more often associated with base stations in that neighbouring county.
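A hedged sketch of this adjacent-region correction: the dialect model is switched to a neighbouring county when that county's base stations dominate the terminal's access log. The log format, the adjacency map, and the 0.4 frequency threshold are all assumptions, not values fixed by the patent.

```python
from collections import Counter

def select_dialect_region(home_county, access_log, adjacency_map, min_ratio=0.4):
    # access_log: county owning each base station the terminal attached to.
    # adjacency_map: county -> set of neighbouring counties (pre-built map).
    counts = Counter(access_log)
    total = sum(counts.values()) or 1
    for county, n in counts.most_common():
        # Only counties adjacent to the registered county are candidates.
        if county in adjacency_map.get(home_county, set()) and n / total >= min_ratio:
            return county  # a neighbouring dialect dominates the access log
    return home_county     # default: the registered county's own model

# Example: a CountyA resident whose phone mostly attaches to CountyB towers.
region = select_dialect_region(
    "CountyA",
    ["CountyB"] * 7 + ["CountyA"] * 3,
    {"CountyA": {"CountyB", "CountyC"}},
)  # -> "CountyB"
```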
Based on the use of the mobile terminal, and in order to guarantee the credibility of user information, the method adaptively selects the user's survey information on the basis of facial features. A camera of the mobile terminal or the like acquires a non-voice input information source about the interviewee, such as information captured by the camera; from the information in the area near the sound source, one or more non-acoustic attributes of the interviewee are analyzed. The credibility of the voice information is then adjusted on the basis of these analyzed non-acoustic attributes, and the weight of the voice information in the user survey is adjusted so as to correct the credibility of the survey source data.
The server may include, or directly execute, an optional credibility adaptation module, which can further establish that a particular face belongs to the interviewee associated with the voice input, and is therefore more trustworthy, by looking for lip movement in the mouth region or movement of eye features. Once a face region has been found, then, depending on whether lip motion or eye features are sought, in the lower third of the region or above it, the pixel intensities in an acquired image are compared with the corresponding pixel intensities in a second image acquired at a later time. The system counts the number of pixels that have changed significantly and, if the count is above a threshold, determines that the mouth or eyes are in motion; it then extracts features to analyze the identified interviewee's category attributes, while also preventing cheating with a static photograph.
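The counting principle just described translates directly into a frame-difference check. In the sketch below, the split of the face region (eyes in the upper third, mouth in the lower third) and both threshold values are assumptions; the patent fixes only the compare, count, and threshold logic.

```python
import numpy as np

def region_in_motion(frame_t0, frame_t1, diff_thresh=25, count_thresh=200):
    # frames: 2-D uint8 grayscale arrays of the same face sub-region.
    diff = np.abs(frame_t0.astype(np.int16) - frame_t1.astype(np.int16))
    changed = int((diff > diff_thresh).sum())  # pixels that changed significantly
    return changed > count_thresh

def liveness(face_t0, face_t1):
    h = face_t0.shape[0]
    eyes_moving = region_in_motion(face_t0[: h // 3], face_t1[: h // 3])
    mouth_moving = region_in_motion(face_t0[2 * h // 3 :], face_t1[2 * h // 3 :])
    return eyes_moving or mouth_moving  # a static photograph fails both tests
```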
The category attributes may include, but are not limited to, attributes of the identified speaker, the geographic-region information to which the speaker belongs, and facial features such as age, gender, skin tone, and hair color. Appropriate features of the interviewee, such as the identified speaker's hair color and height, are extracted for determining the user type. The identified facial features of the interviewee are mapped to clusters to infer the interviewee's characteristics; speaker feature extraction and face mapping may be performed using face recognition algorithms and clustering algorithms. The interviewee's information is then matched against an emotion model so as to mark the interviewee's emotion when answering the corresponding content with an emotion label such as happy, low, approving, or denying. This emotion label is matched with the emotion-expression label extracted from the interviewee's response content, and the answer is either confirmed as a credible sample or weighted within a credibility interval; a specific execution interval can be chosen, weights can be assigned following the rule of a normal distribution function, and the weighting can use direct linear weighting or a weighted-average statistical method. Finally, the category attributes identified from the facial features are matched with the basic information of the interviewee associated with the randomly drawn household registration information, the credibility of the interviewee is judged, and the credibility of the overall survey is improved.
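One way to realize this weighting is sketched here. The four emotion labels follow the text, while the label ordering and the Gaussian down-weighting constants are assumptions standing in for the "rule of a normal distribution function"; direct linear weighting would fit the text equally well.

```python
import math

# Emotion labels from the text; their ordering here is an assumption.
EMOTIONS = ("happy", "low", "approving", "denying")

def credibility_weight(video_label, text_label, base=1.0, sigma=1.0):
    # Consistent labels confirm a credible sample at full weight.
    if video_label == text_label:
        return base
    # Inconsistent labels are down-weighted following a normal-distribution rule.
    distance = abs(EMOTIONS.index(video_label) - EMOTIONS.index(text_label))
    return base * math.exp(-(distance ** 2) / (2 * sigma ** 2))
```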
Taking two emotions as an example, a quick recognition mode can be adopted with the following steps:
S1, constructing, from the obtained video pictures, the coordinates corresponding to the facial detection feature regions in three-dimensional space;
S2, photographing and extracting the facial feature region to be detected with the camera, constructing a normal adaptive threshold area for the feature region to be detected, storing the threshold area in a feature-region database, and classifying facial states according to the recorded threshold areas;
S3, dividing the state of the identified interviewee into two categories according to the facial threshold-area values recorded per unit time.
Using curvature analysis, two curvatures can be calculated for each depth information point: the maximum curvature k1(P) and the minimum curvature k2(P). The calculation formula is as follows:
D = arctan( (k1(P) + k2(P)) / (k1(P) − k2(P)) )     (Formula 1)
the curvature information of the characteristic region is obtained by the above formula 1 for identification. The method comprises the steps of carrying out frame fusion on the face in the video, obtaining real-time information of the face by adopting a multi-frame fusion algorithm, and simply and quickly obtaining a large pixel gray value by adopting a Kinect algorithm. The eye information has 3 frames of data p1、p2、p3. And taking the pixel with the larger pixel value of the corresponding point as the integrated data. The result p of the three-frame data fusion of the integrated datamax={p1(x,y),p2(x,y),p3(x, y) }. Comprehensively obtain data p123
The detection feature regions are the facial organs of the human face. A Camshift tracking algorithm imposes shape constraints and motion constraints on the facial organs within the detection feature regions, and a filter rapidly calculates the change threshold of each facial organ. The Camshift tracking algorithm establishes a coordinate system according to the concavity and convexity of the facial organs, automatically pre-stores each facial organ in its interviewed state, and records the value of the normal adaptive threshold area during the interview; this value is matched with the emotion-classification information in the cloud database, thereby classifying the emotional characteristics at the time of the interview.
Taking four types of emotion as an example, the optional emotion-classification mode can also use a neural-network method: a one-dimensional column vector is produced by sampling the collected images, and the input vector is normalized before being fed into the network. A BP network, a feed-forward network comprising an input layer, hidden layers, and an output layer, is adopted; there can be one, two, or even more hidden layers so as to analyze the interactions among all factors. Each layer consists of several neurons; every pair of neurons in adjacent layers is connected by a weight whose size reflects the connection strength between the two neurons, and the computation of the whole network proceeds in one direction, from the input layer through the hidden layers to the output layer. Alternatively, H1 denotes the first hidden layer (a convolutional layer). A convolutional layer contains multiple convolution planes, each associated with a convolution filter; a convolution plane is obtained by convolving the input with the corresponding convolution kernel and then adding a bias. There are 6 convolution planes, each connected to a 5 × 5 neighbourhood of the input, giving 156 trainable parameters and a total of 122,304 connections. The H2 layer is a down-sampling layer consisting of 6 down-sampling planes of 14 × 14; each cell of a down-sampling plane is connected to a 2 × 2 neighbourhood of the previous convolution plane, giving 12 trainable parameters and 5,880 connections. The H3 layer is again a convolutional layer, convolving H2 with 5 × 5 kernels, with 10 × 10 neurons per convolution plane and 16 convolution planes in total. The H4 layer is a down-sampling layer, likewise composed of 16 down-sampling planes of 5 × 5; each unit of a down-sampling plane is connected to a 2 × 2 neighbourhood of the preceding H3 convolution plane, giving 32 trainable parameters and 2,000 connections. The H5 layer is a convolutional layer consisting of 32 convolution planes; each unit is connected to a 5 × 5 down-sampling plane across all units of the H4 layer, so the size of each H5 convolution plane is 1 × 1, i.e. each H5 convolution plane contains only one neuron. H4 and H5 are therefore fully connected, but H5 is still defined as a convolutional layer because the sizes of the two layers cannot be guaranteed to remain identical. The H6 layer has 18 neurons and is fully connected with the H5 layer, forming a fully connected layer; the invention uses a Radial Basis Function (RBF) activation function to classify these neurons, characterizing the interviewee's facial-expression features. The output layer uses a softmax function, producing 4 neurons that respectively represent the interviewee's four facial expressions.
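For concreteness, the quoted layer sizes are consistent with a LeNet-style layout on a 32 × 32 input (an assumption, but one that reproduces the stated 14 × 14, 10 × 10, 5 × 5, and 1 × 1 plane sizes, and 28 × 28 × 6 × 26 = 122,304 H1 connections). The PyTorch sketch below is one possible reading of that topology, not the patent's implementation; in particular, Tanh stands in for the RBF activation named in the text.

```python
import torch
import torch.nn as nn

class EmotionNet(nn.Module):
    def __init__(self, num_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),    # H1: 6 planes, 28x28 on a 32x32 input
            nn.Tanh(),
            nn.AvgPool2d(2),                   # H2: 6 planes, 14x14
            nn.Conv2d(6, 16, kernel_size=5),   # H3: 16 planes, 10x10
            nn.Tanh(),
            nn.AvgPool2d(2),                   # H4: 16 planes, 5x5
            nn.Conv2d(16, 32, kernel_size=5),  # H5: 32 planes, 1x1 (fully connects H4)
            nn.Tanh(),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32, 18),                 # H6: 18 neurons
            nn.Tanh(),                         # stand-in for the RBF activation
            nn.Linear(18, num_classes),        # 4 emotion outputs
        )

    def forward(self, x):
        # x: (N, 1, 32, 32) normalized face images; returns softmax class scores.
        return torch.softmax(self.classifier(self.features(x)), dim=1)
```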
The server adjusts the credibility according to the analyzed category attributes of the identified interviewee. An acoustic model may be selected according to the determined demographics of the survey sample; for example, based on a determination that the identified interviewee is elderly, a model characterized by an appropriate voiceprint may be selected. The speech model for the inquiry information is likewise changed according to the determined location information of the identified interviewee. For example, if the system identifies that the interviewee comes from a certain region, an appropriate language model is selected to fit the words, word choices, or dialect idioms specific to speakers of that region, and the corresponding question-and-answer or inquiry information is constructed, thereby improving the accuracy of the speech recognition system. The language model may be selected from a library of acoustic models and language models maintained in a database, which may be part of, or separate from, the speech recognition system or model.
The government affair cloud or the server extracts the user's facial feature information from the camera information, voice input information, and the like; using this facial feature information and the recognition samples in the system library, the questions, answers, and emotion-recognition content are mapped to form a question-and-answer table, and the credibility of the user's survey information is weighted according to the recognized emotion information and similar factors.
Optionally, the question-and-answer content, the voice similarity parameter, and the emotion parameter value are mapped to form a question-and-answer table, and the table is sent to the government affair cloud; the question-and-answer content is composed of the inquiry information and the responses, and the question-and-answer table includes the user's ID information.
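The question-and-answer table is characterized only by the values it maps together; a hypothetical JSON rendering might look as follows, with all field names being assumptions.

```python
import json

def build_qa_table(user_id, qa_pairs, voice_similarity, emotion_value, credibility):
    # Maps the values named in the text into one record per interview.
    return json.dumps({
        "user_id": user_id,                    # interviewee ID information
        "qa": [{"question": q, "answer": a} for q, a in qa_pairs],
        "voice_similarity": voice_similarity,  # regional pronunciation match
        "emotion_value": emotion_value,        # video/voice emotion parameter
        "credibility": credibility,            # weighted survey credibility value
    }, ensure_ascii=False)
```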
The method of this embodiment can be carried out by a user survey system based on the government affair cloud, which comprises a server and a user terminal; the server comprises a memory and a processor, the memory stores a computer program, and the processor executes the program to implement the cloud-government-affair-based survey method.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory (Flash Memory), a Hard Disk (Hard Disk Drive, abbreviated as HDD), a Solid State Drive (SSD), or the like; the storage medium may also comprise a combination of memories of the kind described above.
As used in this application, the terms "component," "module," "system," and the like are intended to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, or software in execution. For example, a component may be, but is not limited to being: a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of example, both an application running on a computing device and the computing device can be a component. One or more components can reside within a process and/or thread of execution and a component can be localized on one computer and/or distributed between two or more computers. In addition, these components can execute from various computer readable media having various data structures thereon. The components may communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, distributed system, and/or across a network such as the internet with other systems by way of the signal).
It should be noted that the above-mentioned embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, which should be covered by the claims of the present invention.

Claims (7)

1. A user investigation method based on government affair cloud is characterized in that:
acquiring the information of a person to be interviewed, wherein the information comprises the interviewee's household registration information; generating corresponding inquiry information according to the registered address information and the habitual residence address information in the household registration information;
acquiring voice response information and video information from the interviewee's terminal, and comparing the acquired voice response information with pronunciation sample information of the region corresponding to the household registration information to obtain a voice similarity parameter;
analyzing the video information to obtain the emotion classification of the interviewee and an emotion parameter value;
mapping the question-and-answer content, the voice similarity parameter and the emotion parameter value to form a question-and-answer table, and sending the question-and-answer table to a government affair cloud, wherein the question-and-answer content comprises inquiry information and response information;
adjusting, by weighting, the credibility value of the interviewee's survey result according to the response content, the voice similarity parameter and the emotion parameter value;
comparing the voice response information with the pronunciation sample information of the region corresponding to the household registration information to obtain the voice similarity parameter; and, according to the video information, using a clustering algorithm to map the identified facial features of the interviewee to clusters so as to infer the interviewee's feature information, matching the obtained category attributes of the interviewee against the extracted household registration information of the interviewee, verifying the authenticity of the interviewee, and correcting the voice similarity parameter value;
wherein the emotion parameter value comprises an emotion parameter value of the video and an emotion parameter value of the voice, the emotion parameter value of the video being obtained from mouth movement information and eye-region activity information in the interviewee's facial features; and acquiring the eye or mouth movement information specifically comprises: comparing the pixel intensities in an image acquired at one moment with the corresponding pixel intensities in a second image acquired at a later time, counting the number of pixels that have changed, and, if the count is above a threshold, determining that the mouth or eyes are in motion;
wherein adjusting, by weighting, the credibility value of the interviewee's survey result according to the response content, the voice similarity parameter and the emotion parameter value comprises: applying labels in the database of registered-address information and habitual-residence information; when the interviewee's information is obtained, acquiring and recognizing the interviewee's voice response information, comparing the recognized information with the pronunciation sample information of the region corresponding to the household registration information, and performing joint weighting according to the habitual-residence address information and the registered-address information to obtain the credibility value of the interviewee's survey result.
2. The method according to claim 1, wherein the emotion parameter value of the voice is obtained from the text information recognized from the voice response information together with the speech rate and intonation obtained in voice recognition, and the emotion parameter value obtained from the video information is corrected with the emotion parameter value of the voice.
3. The method of claim 1, wherein the credibility value of the interviewee's survey result is adjusted in a weighted manner on the basis of the response content, the voice similarity parameter, and the emotion parameter value.
4. The method of claim 1, wherein generating the corresponding inquiry information according to the registered address information and the habitual residence address information further comprises correcting the interviewee's address information according to the base-station access information of the interviewee's mobile terminal so as to generate the corresponding inquiry information.
5. The method of claim 1, wherein the emotion parameter value of the video information is obtained from the facial features of the interviewee.
6. The method of claim 5, wherein the voice recognition is adapted by selecting different voice recognition models according to the category-information attributes of the interviewee identified in the video information.
7. A government affairs system based on cloud government-affair surveys, characterized in that the system comprises a server and an interviewee terminal, the server comprises a memory and a processor, the memory stores a computer program, and the computer program, when executed by the processor, implements the method of any one of claims 1 to 6.
CN202010930044.XA 2020-09-07 2020-09-07 User investigation method and system based on government affair cloud Active CN112036350B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010930044.XA CN112036350B (en) 2020-09-07 2020-09-07 User investigation method and system based on government affair cloud

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010930044.XA CN112036350B (en) 2020-09-07 2020-09-07 User investigation method and system based on government affair cloud

Publications (2)

Publication Number Publication Date
CN112036350A (en) 2020-12-04
CN112036350B (en) 2022-01-28

Family

ID=73584116

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010930044.XA Active CN112036350B (en) 2020-09-07 2020-09-07 User investigation method and system based on government affair cloud

Country Status (1)

Country Link
CN (1) CN112036350B (en)


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150088502A1 (en) * 2000-05-31 2015-03-26 Voicefill Llc Voice Recognition System For Interactively Gathering Information To Generate Documents
CN103903611B (en) * 2012-12-24 2018-07-03 联想(北京)有限公司 A kind of recognition methods of voice messaging and equipment
CN110491368B (en) * 2019-07-23 2023-06-16 平安科技(深圳)有限公司 Dialect background-based voice recognition method, device, computer equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104485100A (en) * 2014-12-18 2015-04-01 天津讯飞信息科技有限公司 Text-to-speech pronunciation person self-adaptive method and system
CN109767321A (en) * 2018-12-18 2019-05-17 深圳壹账通智能科技有限公司 Question answering process optimization method, device, computer equipment and storage medium
CN110569419A (en) * 2019-07-31 2019-12-13 平安科技(深圳)有限公司 question-answering system optimization method and device, computer equipment and storage medium
CN110490139A (en) * 2019-08-21 2019-11-22 南京亨视通信息技术有限公司 Night fatigue driving judgment method based on recognition of face
CN110827815A (en) * 2019-11-07 2020-02-21 深圳传音控股股份有限公司 Voice recognition method, terminal, system and computer storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Design and Implementation of an Intelligent Question-Answering Service System Based on NLP; Yao Zhixin; Information Technology and Informatization (《信息技术与信息化》); 2019-09-30 (No. 9); pp. 64-68 *

Also Published As

Publication number Publication date
CN112036350A (en) 2020-12-04

Similar Documents

Publication Publication Date Title
US10839061B2 (en) Method and apparatus for identity authentication
US10133915B2 (en) Facial recognition using social networking information
EP3617946B1 (en) Context acquisition method and device based on voice interaction
US8983836B2 (en) Captioning using socially derived acoustic profiles
CN114556333A (en) Smart camera enabled by assistant system
CN112997186A Detection system for 'viability'
JP2022528294A (en) Video background subtraction method using depth
US10839238B2 (en) Remote user identity validation with threshold-based matching
US9530067B2 (en) Method and apparatus for storing and retrieving personal contact information
US20200380299A1 (en) Recognizing People by Combining Face and Body Cues
WO2019136909A1 (en) Voice living-body detection method based on deep learning, server and storage medium
US11900518B2 (en) Interactive systems and methods
TW201220216A (en) System and method for detecting human emotion and appeasing human emotion
CN110175526A (en) Dog Emotion identification model training method, device, computer equipment and storage medium
US11514713B2 (en) Face quality of captured images
CN108986825A (en) Context acquisition methods and equipment based on interactive voice
JP2022523921A (en) Liveness detection and verification method, biological detection and verification system, recording medium, and training method for biological detection and verification system.
WO2021159902A1 (en) Age recognition method, apparatus and device, and computer-readable storage medium
US20210320997A1 (en) Information processing device, information processing method, and information processing program
US10997609B1 (en) Biometric based user identity verification
CN111489819A (en) Method, server and computer readable medium for detecting cognitive and language disorders
CN109739354A (en) A kind of multimedia interaction method and device based on sound
AU2022205172A1 (en) System and method for video authentication
CN112036350B (en) User investigation method and system based on government affair cloud
US20230066331A1 (en) Method and system for automatically capturing and processing an image of a user

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant