CN111639200B - Data report generation method and system based on artificial intelligence


Info

Publication number
CN111639200B
CN111639200B
Authority
CN
China
Prior art keywords
index
key
image
indexes
identifier
Prior art date
Legal status
Active
Application number
CN202010477594.0A
Other languages
Chinese (zh)
Other versions
CN111639200A (en)
Inventor
崔炜 (Cui Wei)
Current Assignee
Shanghai Yixue Education Technology Co Ltd
Original Assignee
Shanghai Squirrel Classroom Artificial Intelligence Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Squirrel Classroom Artificial Intelligence Technology Co Ltd
Priority to CN202010477594.0A
Publication of CN111639200A
Application granted
Publication of CN111639200B

Classifications

    • G06F 16/438 Presentation of query results (information retrieval of multimedia data; querying)
    • G06F 16/433 Query formulation using audio data
    • G06F 16/435 Filtering based on additional data, e.g. user or group profiles
    • G06F 16/436 Filtering using biological or physiological data of a human being, e.g. blood pressure, facial expression, gestures
    • G06F 40/18 Editing of spreadsheets
    • G06F 40/186 Templates
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/30 Semantic analysis
    • G06V 40/174 Facial expression recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Abstract

The invention provides a data report generation method and system based on artificial intelligence. The method comprises the following steps: determining the key indexes that the report reader focuses on; acquiring a form report template and writing the index identifiers of the key indexes into the index identifier filling areas to be filled in the template; determining the other indexes in the form report template besides the key indexes; screening out the index data corresponding to the key indexes from a database; screening out the index data corresponding to the other indexes from the database; and writing the index data corresponding to the key indexes into the index data import areas corresponding to the key indexes and the index data corresponding to the other indexes into the index data import areas corresponding to the other indexes, thereby generating the data report.

Description

Data report generation method and system based on artificial intelligence
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a data report generation method and system based on artificial intelligence.
Background
At present, data reports are an indispensable part of the work of product and operations staff: weekly reports, monthly reports, product market analyses, new commodity performance reviews, and the like. However, collating, refining and analyzing the underlying data takes considerable time and effort, so a method for generating data reports intelligently is urgently needed.
Disclosure of Invention
The invention provides a method and a system for generating a data report based on artificial intelligence.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a flow chart of a method for generating data reports based on artificial intelligence in an embodiment of the present invention;
FIG. 2 is a block diagram of an artificial intelligence based data report generation system according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described below in conjunction with the accompanying drawings; it will be understood that the preferred embodiments described herein serve only to illustrate and explain the invention, not to limit it.
An embodiment of the present invention provides a data report generation method based on artificial intelligence, which may be implemented by a computer program. As shown in fig. 1, the method includes steps S1-S6:
Step S1, determining the key indexes that the report reader focuses on.
The report reader is the person who will use the report, such as a company president or a product manager; different report readers focus on different key indexes.
Determining the key indexes that the report reader focuses on may be implemented by receiving user input, for example by providing a data input interface through which the user enters the key indexes; alternatively, the key indexes may be obtained by the following intelligent means, in which case step S1 may be implemented as the following steps A1-A10:
Step A1, synchronously recording the report reader's face and voice while the reader participates in a conversation, obtaining a facial video and an audio recording of the report reader during the conversation;
wherein a conversation is any talk involving the report reader, such as the report reader's speech at a conference;
Step A2, performing speech recognition on the audio recording to obtain the conversation text;
Step A3, performing semantic extraction on the conversation text to obtain the keywords in the conversation text and the number of times each keyword appears in the audio file;
Step A4, acquiring from the audio file the volume at each occurrence of each keyword;
Step A5, performing facial expression recognition on the images in the facial video and determining the time points at which a preset expression (such as a happy, excited or sad expression) appears in the facial video; then determining the keywords that appear within a preset time period, the preset time period running from a first time point before each such expression time point to a second time point after it;
Step A6, determining the importance index corresponding to each keyword according to the number of times the keyword appears in the audio file, the volume at each of its occurrences, and whether it appears within the preset time period;
Step A7, arranging the importance indexes corresponding to all keywords in descending order to obtain a keyword importance sequence;
Step A8, selecting from the keyword importance sequence the N keywords corresponding to the N top-ranked importance indexes;
Step A9, determining the indexes corresponding to the N keywords according to a prestored index-keyword lookup table;
Step A10, taking the indexes corresponding to the N keywords as the key indexes that the report reader focuses on.
In step A6, determining the importance index corresponding to each keyword according to the number of times the keyword appears in the audio file, the volume at each of its occurrences, and whether it appears within the preset time period comprises:
calculating the importance index corresponding to the current keyword according to the following formula (shown in the original only as an image; reconstructed here from the variable definitions below):

$$d_j = \alpha_1 f_j + \alpha_2 \frac{R_j}{\sum_{i=1}^{N} R_i} + \alpha_3 \frac{X_j}{\max_{1 \le i \le N} X_i}$$

wherein d_j is the importance index of the jth keyword; α_1 is a preset first weight coefficient, greater than 0 and at most 1; f_j indicates whether the jth keyword appears within the preset time period, taking the value 1 if it does and 0 otherwise; α_2 is a preset second weight coefficient, greater than 0 and at most 1; R_i is the number of times the ith keyword appears in the audio file, i = 1, 2, 3, ..., N; N is the number of keywords appearing in the audio file; R_j is the total number of occurrences of the jth keyword in the audio file; α_3 is a preset third weight coefficient, greater than 0 and at most 1; X_i is the maximum volume of the ith keyword in the audio file and X_j is the maximum volume of the jth keyword in the audio file; and α_1 + α_2 + α_3 = 1.
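The keyword scoring of step A6 and the ranking and selection of steps A7-A8 can be sketched as follows. This is a minimal illustration: the weight values, the dictionary field names, and the function names are assumptions, not fixed by the description.

```python
# Hypothetical sketch of steps A6-A8. Each keyword is a dict with:
#   'name'       - the keyword text
#   'f'          - 1 if it appears in the preset expression window, else 0
#   'count'      - occurrences R in the audio file
#   'max_volume' - loudest occurrence X in the audio file
# The weights alpha1..alpha3 must be in (0, 1] and sum to 1.

def importance_score(kw, keywords, a1=0.3, a2=0.4, a3=0.3):
    """d_j = a1*f_j + a2*R_j/sum(R_i) + a3*X_j/max(X_i)."""
    total_count = sum(k["count"] for k in keywords)
    max_volume = max(k["max_volume"] for k in keywords)
    return (a1 * kw["f"]
            + a2 * kw["count"] / total_count
            + a3 * kw["max_volume"] / max_volume)

def top_n_keywords(keywords, n):
    """Steps A7-A8: rank by importance, descending, keep the first n names."""
    ranked = sorted(keywords, key=lambda k: importance_score(k, keywords),
                    reverse=True)
    return [k["name"] for k in ranked[:n]]
```

The N names returned would then be mapped to indexes via the prestored index-keyword lookup table (steps A9-A10).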
When step S1 is implemented as steps A1-A10 above, the following beneficial effects are obtained: the key indexes that interest the report reader can be analyzed intelligently and automatically from the reader's remarks in a conversation (such as a multi-person or single-person speaking scene at a conference, speech or training session), with no need to record manually which indexes interest the reader, which improves the efficiency of intelligent report generation.
Step S2, acquiring the form report template and writing the index identifiers of the key indexes into the index identifier filling areas to be filled in the form report template.
Step S3, determining, in the form report template, the other indexes besides the key indexes.
Wherein the other indexes may be indexes written into the template in advance.
Step S4, screening out the index data corresponding to the key indexes from the database.
Step S5, screening out the index data corresponding to the other indexes from the database.
Wherein the database comprises any one or more of a local database, a network-side database and a third-party cooperative database.
Step S6, writing the index data corresponding to the key indexes into the index data import areas corresponding to the key indexes, and writing the index data corresponding to the other indexes into the index data import areas corresponding to the other indexes, thereby generating the data report.
The beneficial effects of the above technical scheme are as follows: the data report can be generated intelligently according to the indexes the report reader focuses on, without filling in the report manually, which makes data report generation more intelligent and faster and saves manpower.
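The overall flow of steps S2-S6 can be sketched as follows, under the assumption that the form report template can be modeled as a mapping from index identifiers to data cells; a real template would be a spreadsheet or document object, and all names here are illustrative.

```python
# Hypothetical sketch of steps S2-S6. The template maps index identifiers to
# data cells; blank identifier cells are marked "<to fill>". The database maps
# index identifiers to index data.

def generate_report(template, key_indexes, database):
    """Write key-index identifiers into the blank identifier cells (step S2),
    then fill every index's data area from the database (steps S4-S6)."""
    report = dict(template)
    blanks = [k for k, v in report.items() if v == "<to fill>"]
    for blank, key in zip(blanks, key_indexes):
        report[key] = report.pop(blank)    # S2: write the key-index identifier
    for index in report:                   # S4-S6: screen data and import it
        report[index] = database.get(index, "N/A")
    return report
```

In a real implementation the database lookup (steps S4-S5) would query a local, network-side or third-party cooperative database rather than a dict.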
In one embodiment, the method further comprises:
matching the important keywords identified in the audio file recorded from the conversation against the picture and text description information in the database and, according to the matching result, performing the operation of writing the index data corresponding to the key indexes into the index data import areas corresponding to the key indexes; the concrete implementation steps are as follows:
step S1, sampling and coding the audio file recorded from the conversation according to a pre-established audio preprocessing model, determining the output value of the audio file expanded by short-time Fourier transform, and matching the obtained output value of the audio data against a text database to obtain the conversation text information corresponding to the audio file;
the corresponding formula appears in the original only as an image and is not reproduced here; its variables are defined as follows: N is the number of conversation recording audio files; e is the natural constant; i is the total duration of the conversation recording audio file; w is the audio frequency; s is the sequence number of each sentence in the conversation recording audio file, taking positive integer values; t is the starting time coordinate of each sentence in the audio file; τ is the ending time coordinate of each sentence in the audio file; m is the total number of texts in the text database; r is the audio index serial number corresponding to each text in the text database; A denotes the ordered arrangement, from large to small, of each text in the text database and its corresponding audio index; a phase unwrapping is performed on the starting time coordinate t and the ending time coordinate τ of each sentence in the given audio file; and the result is the conversation text information corresponding to the conversation recording audio file;
step S2, extracting the keywords in the conversation text according to the conversation text information obtained in step S1 and comparing them with the important keywords in a preset database to obtain the required set of important audio keywords;
this formula, too, appears in the original only as an image; its variables are defined as follows: y is the number of keywords in the preset database; H_y is the ordered arrangement of the database keywords corresponding to the preset number y, of which the top 30% are the important database keywords; k is the number of occurrences of each text keyword in the conversation recording audio file; B_k is the ordering of the text keywords in the audio file by occurrence count, of which the top 60% are the important text keywords; Z is the amplitude of the audio data, and (Z_max - Z_min) is the difference between the maximum and minimum amplitude of the identified audio data, a text keyword whose sound amplitude exceeds this difference also counting as an important text keyword. A keyword is recognized as an important audio keyword in the conversation recording audio file when it is an important text keyword by volume or occurrence count and is at the same time an important database keyword; ipt(B_k) denotes the required set of important audio keywords so obtained.
The technical scheme can also realize intelligent and automatic filling of index data and improve report generation efficiency.
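The screening criteria stated in the variable definitions (top 30% of the preset database keywords, top 60% of the text keywords by occurrence count, and the amplitude-spread criterion) can be sketched as follows. Since the formulas themselves are given only as images, this is a hedged reconstruction, and every function and parameter name is an assumption.

```python
# Hypothetical sketch of the important-audio-keyword screening. A keyword is
# kept when it is in the top 30% of the preset database ranking AND is either
# in the top 60% of the conversation's keywords by occurrence count or louder
# than the spread between the maximum and minimum recorded amplitudes.
# The 30%/60% cutoffs come from the text; everything else is assumed.

def important_audio_keywords(db_ranking, counts, amplitudes):
    """db_ranking: database keywords ordered most-important-first.
    counts: {keyword: occurrences in the recording}.
    amplitudes: {keyword: peak amplitude among its occurrences}."""
    top_db = set(db_ranking[: max(1, int(len(db_ranking) * 0.3))])
    by_count = sorted(counts, key=counts.get, reverse=True)
    top_text = set(by_count[: max(1, int(len(by_count) * 0.6))])
    spread = max(amplitudes.values()) - min(amplitudes.values())
    loud = {k for k, z in amplitudes.items() if z > spread}
    return {k for k in counts if k in top_db and (k in top_text or k in loud)}
```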
When the other indexes include a target index, that is, an index whose index identifier consists solely of the word "photo", "picture", "photograph" or "image", step S5 may be implemented as the following steps B1-B3:
Step B1, determining whether the target index and the key index are associated according to the index data filling area corresponding to the key index in the form report template and the index data filling area corresponding to the target index in the form report template. Specifically, when the index identifier filling areas of the key index and of the target index lie in the same index identifier row or the same index identifier column of the form report template, and their index data filling areas likewise lie in the same index data row or the same index data column, the target index and the key index are associated. For example, as shown in fig. 2, when the key index is "sports shoe sales" and one of the other indexes is "picture", the two index identifiers lie in the same index identifier row of the form report template and the corresponding index data filling areas lie in the same index data row, so the two indexes are associated;
Step B2, when the target index and the key index are associated, obtaining the key image identifier corresponding to the key index from the index identifier of the key index, the key image identifier being the image identification of the key object that must be contained in the photo, picture, photograph or image associated with the key index; for example, when the key index is "sports shoe sales", the key image identifiers "sports shoes" and "sales" can be extracted from it;
Step B3, screening out from the database the preset objects containing the image corresponding to the key image identifier, as the index data corresponding to the target index, a preset object being a photo, picture, photograph or image; for example, when the key image identifier is "sports shoes", a qualifying preset object is a photo, picture, photograph or image containing the image of sports shoes;
Accordingly, in step S6, "writing the index data corresponding to the other indexes into the index data import areas corresponding to the other indexes to generate the data report" may be implemented as: after writing the preset objects containing the image corresponding to the key image identifier into the index data import area corresponding to the target index, modifying the index identifier of the target index in the form report template into a target identification character, the target identification character indicating that the data written into the index data import area of the target index is a photo, picture, photograph or image related to the key index.
When step S5 above is implemented as steps B1-B3, the following beneficial effects are obtained: when the form report template contains only an image-related word (photo, picture, photograph or image) as an index identifier (i.e. the target index), the data that index points to is so broad that a computer program cannot know what content should be filled into the corresponding index data filling area; it would either fail to fill in the index data at all, or fill in an arbitrary picture of no reference value to the report reader. Steps B1-B3 instead screen out the photos, pictures, photographs or images relevant to the key index, so that the final report content has reference value for the report reader; intelligent screening and filling of the target index data is realized, the intelligence and effectiveness of data report generation are improved, and the use value of the report is increased.
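The row/column test of step B1 can be sketched as follows, under the assumption that template cells are addressed by (row, column) coordinates, which the description does not specify.

```python
# Hypothetical sketch of step B1. Each argument is a (row, column) pair for an
# identifier cell or a data cell in the form report template.

def has_association(key_id_cell, target_id_cell, key_data_cell, target_data_cell):
    """Two indexes are associated when their identifier cells share a row or a
    column AND their data cells share a row or a column."""
    same_line = lambda a, b: a[0] == b[0] or a[1] == b[1]
    return (same_line(key_id_cell, target_id_cell)
            and same_line(key_data_cell, target_data_cell))
```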
In one embodiment, step B3, "screening out from the database the preset objects containing the image corresponding to the key image identifier as the index data corresponding to the target index", may be implemented as steps B31-B35:
Step B31, acquiring the employee identifier of the report reader, the employee identifier indicating the report reader's job level within the work institution where the reader works;
Step B32, determining the managed object corresponding to the report reader from the reader's employee identifier, according to a first correspondence table, preset by the work institution, between each employee and the objects that employee manages; a managed object may be a person, a thing, or both;
Step B33, determining the image identifier corresponding to the report reader's managed object, according to a second correspondence table, preset by the work institution, between each managed object and its image identifier;
wherein, when the managed object is a person, such as a subordinate employee, the image identifier corresponding to the managed object may be a character feature of that person, such as a clothing feature; when the managed object is a thing, such as sports shoes, the image identifier corresponding to the managed object may be an identification feature of the thing, such as an identification feature of the sports shoes;
Step B34, judging whether the image identifier corresponding to the report reader's managed object and the key image identifier are associated;
specifically, when the image identifier corresponding to the managed object is a first character feature and the key image identifier includes a second character feature, judging whether the similarity between the two character features is equal to or greater than a first preset similarity and, if so, judging that the two are associated;
when the image identifier corresponding to the managed object is the identification feature of a first thing and the key image identifier includes the identification feature of a second thing, judging whether the similarity between the two identification features is equal to or greater than a second preset similarity and, if so, judging that the two are associated;
when the image identifier corresponding to the managed object is a character feature and the key image identifier includes the identification feature of a thing, analyzing, through big data over the images (photos, pictures, photographs or images) in a pre-stored image database, whether the character feature and the identification feature of the thing appear together in at least a preset number of images, each of which contains both the character feature and the identification feature of the thing; if so, judging that the two are associated;
likewise, when the image identifier corresponding to the managed object is the identification feature of a thing and the key image identifier includes a character feature, analyzing in the same way whether the character feature and the identification feature of the thing appear together in at least a preset number of images in the pre-stored image database and, if so, judging that the two are associated;
Step B35, when the judgment in step B34 finds an association, adding the image identifier corresponding to the report reader's managed object to the key image identifier to form an adjusted key image identifier, and screening out from the database the photos, pictures, photographs or images containing the image corresponding to the adjusted key image identifier, as the index data corresponding to the target index;
and when the judgment in step B34 finds no association, keeping the key image identifier unchanged and screening out from the database the photos, pictures, photographs or images containing the image corresponding to the key image identifier, as the index data corresponding to the target index.
Implementing step B3 above as steps B31-B35 achieves the following beneficial effects: when the index data corresponding to the target index is screened, the index data most relevant to the scope managed by the report reader can be screened out, so that the final report content has greater reference value and use value for the report reader.
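Steps B31-B35 can be sketched as follows. The similarity function, the threshold, and the table representations are all assumptions, since the description leaves the feature-matching backend unspecified.

```python
# Hypothetical sketch of steps B31-B35. The two correspondence tables are plain
# dicts; `similarity` is a caller-supplied function returning a value in [0, 1]
# (the patent does not specify how similarity is computed). The threshold 0.8
# stands in for the "preset similarity".

def adjusted_key_identifiers(reader_id, employee_to_object, object_to_image_id,
                             key_image_ids, similarity, threshold=0.8):
    """Look up the reader's managed object (B31-B32) and its image identifier
    (B33), test the association (B34), and extend the key identifiers with the
    managed object's identifier when the association holds (B35)."""
    managed = employee_to_object.get(reader_id)
    image_id = object_to_image_id.get(managed)
    if image_id is None:
        return list(key_image_ids)
    if any(similarity(image_id, k) >= threshold for k in key_image_ids):
        return list(key_image_ids) + [image_id]   # B35: association found
    return list(key_image_ids)                    # no association: unchanged
```

The returned identifier list would then drive the database screening of photos, pictures, photographs or images.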
In one embodiment, in step S4, the index attribute identifier corresponding to the key index in the form report template (as shown in fig. 2, a horizontal row carries the index identifiers and a vertical column carries the index attribute identifiers; for example, "2019" is an index attribute of the key index "sports shoe sales") may be an image-class identifier, the image-class identifier consisting solely of the word "photo", "picture", "photograph" or "image". In that case:
step S4, "screening out the index data corresponding to the key indexes from the database", may be implemented as the following steps:
acquiring the key image identifier corresponding to the key index from the index identifier of the key index, the key image identifier being the image identification of the key object that must be contained in the photo, picture, photograph or image associated with the key index;
screening out from the database the preset objects containing the image corresponding to the key image identifier, as the index data of the key index under the image-class identifier attribute; a preset object is a photo, picture, photograph or image.
In one embodiment, screening out from the database the preset objects containing the image corresponding to the key image identifier as the index data of the key index under the image-class identifier attribute comprises steps C31-C35:
Step C31, acquiring the employee identifier of the report reader, the employee identifier indicating the report reader's job level within the work institution where the reader works;
Step C32, determining the managed object corresponding to the report reader from the reader's employee identifier, according to the first correspondence table, preset by the work institution, between each employee and the objects that employee manages; a managed object may be a person, a thing, or both;
Step C33, determining the image identifier corresponding to the report reader's managed object, according to the second correspondence table, preset by the work institution, between each managed object and its image identifier;
Step C34, judging whether the image identifier corresponding to the report reader's managed object and the key image identifier are associated;
specifically, when the image identifier corresponding to the managed object is a first character feature and the key image identifier includes a second character feature, judging whether the similarity between the two character features is equal to or greater than the first preset similarity and, if so, judging that the two are associated;
when the image identifier corresponding to the managed object is the identification feature of a first thing and the key image identifier includes the identification feature of a second thing, judging whether the similarity between the two identification features is equal to or greater than the second preset similarity and, if so, judging that the two are associated;
when the image identifier corresponding to the managed object is a character feature and the key image identifier includes the identification feature of a thing, analyzing, through big data over the images (photos, pictures, photographs or images) in the pre-stored image database, whether the character feature and the identification feature of the thing appear together in at least a preset number of images, each of which contains both the character feature and the identification feature of the thing; if so, judging that the two are associated;
likewise, when the image identifier corresponding to the managed object is the identification feature of a thing and the key image identifier includes a character feature, analyzing in the same way whether the character feature and the identification feature of the thing appear together in at least a preset number of images in the pre-stored image database and, if so, judging that the two are associated;
step C35, when the judgment result of step C34 is that an association exists, adding the image identifier corresponding to the managed object corresponding to the report reader into the key image identifier to form an adjusted key image identifier; and screening out, from the database, the photos, pictures, photographs or images containing the image corresponding to the adjusted key image identifier as the index data of the key index under the image-class identifier attribute;
when the judgment result of step C34 is that no association exists, keeping the key image identifier unchanged, and screening out, from the database, the photos, pictures, photographs or images containing the image corresponding to the key image identifier as the index data of the key index under the image-class identifier attribute.
Screening out the preset objects containing the key image identifier from the database as the index data of the key index under the image-class identifier attribute through the above steps C31-C35 can achieve the following beneficial effects: when the index data of the key index under the image-class identifier attribute is screened, index data with a higher degree of correlation to the range managed by the report reader can be screened out, so that the final report content has higher reference value and use value for the report reader.
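The four-way association judgment of step C34 and the identifier adjustment of step C35 can be sketched in code. The following is a minimal illustration only: the function names, the similarity function `sim`, the co-occurrence lookup `cooccur_count`, and all threshold values are hypothetical, since the patent does not prescribe a concrete similarity measure or image-analysis method.

```python
def has_association(managed_id, key_id, sim, cooccur_count,
                    sim_person=0.8, sim_object=0.8, preset_count=3):
    """Decide whether a managed object's image identifier is associated
    with a key image identifier (sketch of step C34's four cases).

    managed_id / key_id: ("person", feature) or ("object", feature) tuples.
    sim(a, b): similarity score in [0, 1] (hypothetical).
    cooccur_count(a, b): number of database images containing both
    features (hypothetical stand-in for the big-data analysis).
    """
    m_kind, m_feat = managed_id
    k_kind, k_feat = key_id
    if m_kind == "person" and k_kind == "person":
        # Case 1: two character features -> first preset similarity.
        return sim(m_feat, k_feat) >= sim_person
    if m_kind == "object" and k_kind == "object":
        # Case 2: two object identification features -> second preset similarity.
        return sim(m_feat, k_feat) >= sim_object
    # Cases 3 and 4: mixed kinds -> co-occurrence in a preset number of images.
    return cooccur_count(m_feat, k_feat) >= preset_count


def adjusted_key_identifiers(managed_id, key_ids, sim, cooccur_count):
    """Step C35: extend the key image identifiers when an association exists,
    otherwise keep them unchanged."""
    if any(has_association(managed_id, k, sim, cooccur_count) for k in key_ids):
        return key_ids + [managed_id]
    return key_ids
```

A caller would then screen the database for images matching every identifier in the returned list.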
In one embodiment, the screening out of the index data corresponding to the key index from the database further includes matching the keywords identified in the audio file obtained by recording the conversation with the picture and text description information in the database, and, according to the matching result, performing the operation of writing the index data corresponding to the key index into the index data import area corresponding to the key index; the concrete implementation includes the following steps:
step S1, sampling and encoding the audio file obtained by recording the conversation according to a pre-established audio preprocessing model, determining the output value of the audio file expanded by short-time Fourier transform, and matching the obtained output value of the audio data against a character database to obtain the conversation text information corresponding to the audio file;
wherein, in the transform formula (given only as an image in the source document), N is the number of conversation recording audio files, e is the natural constant, i is the total duration of the conversation recording audio file, w is the audio frequency, s is the sequence number of each sentence in the conversation recording audio file (a positive integer), t is the start time coordinate of each sentence in the audio file, τ is the end time coordinate of each sentence in the audio file, m is the total number of texts in the text database, and r is the audio index sequence number corresponding to each text in the text database; A is the descending ordered arrangement information of each text and its corresponding audio index in the text database; a phase unwrapping is performed on the start time coordinate t and the end time coordinate τ, whereby the conversation text information corresponding to the conversation recording audio file is obtained;
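The short-time Fourier transform expansion named in step S1 can be illustrated with a small NumPy sketch. This is not the patent's preprocessing model (which is given only as a formula image); the frame length, hop size, and Hann window below are illustrative choices.

```python
import numpy as np

def stft(signal, frame_len=256, hop=128):
    """Expand an audio signal into a short-time Fourier transform matrix:
    one row of complex spectra per windowed frame (sketch of the
    expansion step in S1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Real-input FFT along each frame; shape (n_frames, frame_len // 2 + 1).
    return np.fft.rfft(frames, axis=1)

# A pure 1 kHz tone concentrates its energy in the matching frequency bin.
sr = 8000
t = np.arange(sr) / sr
spec = stft(np.sin(2 * np.pi * 1000 * t))
```

With a frame length of 256 at 8 kHz, each bin spans 31.25 Hz, so a 1 kHz tone peaks in bin 32.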
step S2, extracting keywords from the conversation text according to the conversation text information corresponding to the conversation recording audio file acquired in step S1, and comparing the keywords with the important keywords in a preset database to acquire the required important audio keyword set;
wherein, in the formula (given only as an image in the source document), y is the number of keywords in the preset database, Hy is the ordered arrangement of the important database keywords corresponding to the keyword number y in the preset database, the keywords ranked in the top 30% being the important database keywords; k is the number of occurrences of each text keyword in the conversation recording audio file, Bk is the ordering of the text keywords in the audio file by their occurrence frequency k, the keywords ranked in the top 60% being the important text keywords; Z is the amplitude of the audio data, and (Zmax - Zmin) is the difference between the maximum and minimum amplitude of the identified audio data; a text keyword whose sound amplitude exceeds the difference between the maximum and minimum amplitude of the audio data is an important text keyword; the formula identifies whether a keyword in the conversation recording audio file is an important audio keyword, i.e. a keyword that both satisfies the condition of being an important database keyword and satisfies, by volume and occurrence, the condition of being an important text keyword; Ipt(Bk) is the obtained important audio keyword set;
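The selection rule of step S2 (top 30% of the preset database keywords, top 60% of the text keywords by frequency, plus the amplitude condition) can be sketched as follows. The data structures and the amplitude test are hypothetical simplifications of the formula, which the source gives only as an image.

```python
def important_audio_keywords(db_keywords, text_freq, keyword_amp, z_max, z_min):
    """Sketch of step S2: a keyword is an important audio keyword when it is
    both an important database keyword (top 30% of the preset database
    ordering) and an important text keyword (top 60% by occurrence
    frequency, with sound amplitude exceeding Zmax - Zmin).

    db_keywords: preset database keywords, already ordered by importance.
    text_freq:   {keyword: occurrence count in the conversation recording}.
    keyword_amp: {keyword: maximum sound amplitude of that keyword}.
    """
    top_db = set(db_keywords[:max(1, int(len(db_keywords) * 0.3))])
    by_freq = sorted(text_freq, key=text_freq.get, reverse=True)
    top_text = set(by_freq[:max(1, int(len(by_freq) * 0.6))])
    amp_span = z_max - z_min  # the (Zmax - Zmin) threshold from the description
    return {k for k in text_freq
            if k in top_db and k in top_text and keyword_amp.get(k, 0) > amp_span}
```

The returned set plays the role of Ipt(Bk) in the description.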
step S3, screening out, according to the important keyword sequence information obtained in step S2, the pictures, text descriptions and the like corresponding to the keywords from the database as the index data corresponding to the key index, and writing the index data corresponding to the key index into the index data import area corresponding to the key index;
wherein, in the formula (given only as an image in the source document), X is the number of areas into which the index data corresponding to the key index is to be imported, φ is the sequence number of each picture in the database (a positive integer), θ is the sequence number of each text description in the database (a positive integer), i is the total number of pictures contained in the database, Li is the picture keyword information corresponding to the picture total i, j is the total number of text descriptions contained in the database, Oj is the text description keyword information corresponding to the text description total j, and (Bk-φLi)²(Bk-θOj)² matches the picture and text descriptions in the database against the important keywords identified in the conversation recording audio file; Rep(X) is the matching result of the important keywords identified in the conversation recording audio file against the picture and text description information in the database; when Rep(X) is not 0, the matching result indicates that the important keywords identified in the conversation recording audio file match the picture and text description information in the database, and the operation of writing the index data corresponding to the key index into the index data import area corresponding to the key index is executed.
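Step S3's matching of important keywords against the picture and text-description keyword information can be sketched as a simple annotation lookup. The `pictures` and `descriptions` mappings are hypothetical stand-ins for the database keyword information (Li and Oj), and a non-empty result plays the role of Rep(X) ≠ 0.

```python
def match_and_import(important_keywords, pictures, descriptions):
    """Sketch of step S3: collect the pictures and text descriptions whose
    keyword annotations contain an important audio keyword; a non-empty
    result corresponds to Rep(X) != 0 and triggers the write into the
    index data import area of the key index.

    pictures / descriptions: {item_id: set of annotated keywords}.
    """
    import_area = []
    for kw in important_keywords:
        for pid, kws in pictures.items():
            if kw in kws:
                import_area.append(("picture", pid))
        for did, kws in descriptions.items():
            if kw in kws:
                import_area.append(("text", did))
    return import_area  # non-empty -> matching succeeded, perform the write
```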
The beneficial effects of the above technical scheme are: the keyword information in the speech of the target object can be automatically extracted from the conversation recording file, the number of times each keyword is mentioned and the changes in volume can be automatically counted to intelligently rank the important keywords, the corresponding information such as pictures and text descriptions can be extracted from the preset database according to the important keywords, and a data report targeted at the requirements of the target object can be rapidly generated.
In one embodiment, the present invention further provides an artificial intelligence based data report generating system, including:
the first determining module is used for determining key indexes concerned by the report reader;
the acquisition module is used for acquiring a form report template and writing the index identifier of the key index into an index identifier filling area to be filled in the form report template;
the second determining module is used for determining other indexes except the key indexes in the form type report template;
the first screening module is used for screening out index data corresponding to the key indexes from a database;
the second screening module is used for screening the index data corresponding to the other indexes from the database;
and the import module is used for writing the index data corresponding to the key indexes into the index data import areas corresponding to the key indexes and writing the index data corresponding to other indexes into the index data import areas corresponding to other indexes, so as to generate a data report.
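The module composition above can be sketched as a thin Python skeleton, one method per module. Class and method names are illustrative only; the screening logic and template handling are elided.

```python
class ReportGenerator:
    """Sketch of the system embodiment: one method per described module."""

    def __init__(self, database, template):
        self.database = database   # {index id: index data} (simplified)
        self.template = template   # {"key": [...], "other": [...]} (simplified)

    def determine_key_indexes(self):    # first determining module
        return self.template["key"]

    def determine_other_indexes(self):  # second determining module
        return self.template["other"]

    def screen(self, indexes):          # first / second screening modules
        return {i: self.database.get(i) for i in indexes}

    def generate(self):                 # acquisition + import modules
        report = {"identifiers": self.determine_key_indexes()}
        report.update(self.screen(self.determine_key_indexes()))
        report.update(self.screen(self.determine_other_indexes()))
        return report
```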
In one embodiment, the database includes any one or more of a local database, a network-side database, and a third-party collaboration database.
In one embodiment, the first determining module is further configured to:
recording the conversation of the conversation to obtain an audio file;
constructing an audio recognition model, and inputting the audio file into the audio recognition model to obtain a conversation text;
semantic extraction is carried out on the conversation text, and keywords in the conversation text and the occurrence frequency of each keyword in the audio file are obtained;
acquiring the volume corresponding to each keyword every time from the audio file;
calculating an importance index for each keyword from its occurrence frequency in the audio file and the volume of each occurrence;
arranging the importance indexes corresponding to all the keywords in a descending order to obtain a keyword importance sequence;
selecting N keywords corresponding to the importance indexes of N top-ranked importance indexes from the keyword importance sequences;
determining indexes corresponding to the N keywords respectively according to a prestored index and keyword comparison table;
and taking indexes corresponding to the N keywords as key indexes concerned by the report reader.
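The pipeline above (importance index per keyword, descending sort, top-N lookup in the index/keyword comparison table) can be sketched as follows. The three-term weighted index follows the form described in claim 3; the sum and maximum normalizations of frequency and volume are assumptions, since the source gives the formula only as an image, and the weight values are illustrative.

```python
def key_indexes(keywords, index_table, n, a1=0.4, a2=0.3, a3=0.3):
    """Rank keywords by importance index and map the top N to indexes.

    keywords: {keyword: (f, R, X)} with f = 1 if the keyword appears in the
    preset time period else 0, R = occurrence count in the audio file, and
    X = maximum volume of the keyword. Weights a1 + a2 + a3 = 1.
    index_table: pre-stored index/keyword comparison table {keyword: index}.
    """
    total_r = sum(r for _, r, _ in keywords.values())
    max_x = max(x for _, _, x in keywords.values())
    # Assumed form: d_j = a1*f_j + a2*R_j/sum(R_i) + a3*X_j/max(X_i).
    score = {k: a1 * f + a2 * r / total_r + a3 * x / max_x
             for k, (f, r, x) in keywords.items()}
    ranked = sorted(score, key=score.get, reverse=True)  # descending order
    return [index_table[k] for k in ranked[:n]]
```

The returned indexes are then used as the key indexes of interest to the report reader.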
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (7)

1. A method for generating data reports based on artificial intelligence, comprising:
determining key indexes concerned by a report reader;
acquiring a form report template, and writing the index identifier of the key index into an index identifier filling area to be filled in the form report template;
determining other indexes except the key indexes in the form report template;
screening out index data corresponding to the key indexes from a database;
screening out index data corresponding to the other indexes from the database;
writing the index data corresponding to the key indexes into the index data import areas corresponding to the key indexes, and writing the index data corresponding to the other indexes into the index data import areas corresponding to the other indexes, thereby generating a data report;
when the other indexes include a target index, the target index being an index whose index identifier consists only of the two-character word 'photo', 'picture', 'photograph' or 'image':
the screening out the index data corresponding to the other indexes from the database comprises the following steps:
determining whether the target indexes and the key indexes have an association relationship or not according to corresponding index data filling areas of the key indexes in the form type report template and corresponding index data filling areas of the target indexes in the form type report template;
when the target index and the key index have an association relationship, acquiring a key image identifier corresponding to the key index from the index identifier of the key index, wherein the key image identifier is an image identification identifier of a key object to be included in a photo, a picture, a photo or an image associated with the key index;
screening out a preset object containing an image corresponding to the key image identifier from a database, and taking the preset object as index data corresponding to the target index; the preset object comprises a photo, a picture, a photo or an image;
in this case, the writing of the index data corresponding to the other indexes into the index data import areas corresponding to the other indexes includes:
after writing the preset object containing the image corresponding to the key image identifier into the index data import area corresponding to the target index, modifying the index identifier of the target index in the form report template into a target identification character, wherein the target identification character is used for indicating that the data written into the index data import area corresponding to the target index is a photo, a picture, a photograph or an image related to the key index.
2. The method of claim 1, wherein:
the database comprises any one or more of a local database, a network-side database and a third-party cooperative database.
3. The method of claim 1, wherein said determining a highlight indicator of interest to a report reader comprises:
synchronously recording the facial image and the sound of the report reader when the report reader participates in the conference, and obtaining the facial video and the sound recording of the report reader when the report reader participates in the conference;
performing character recognition on the sound recording to obtain a conversation text;
semantic extraction is carried out on the conversation text, and keywords in the conversation text and the occurrence frequency of each keyword in an audio file are obtained;
acquiring the volume corresponding to each keyword every time from the audio file;
performing facial expression recognition analysis on the image in the facial video to determine a time point of a preset expression in the facial video; determining keywords appearing in a preset time period; the preset time period is a time period from a first time point before the time point to a second time point after the time point;
determining the importance index corresponding to each keyword according to the frequency of the keyword appearing in the audio file, the volume corresponding to each keyword appearing and whether each keyword appears in the preset time period:
arranging the importance indexes corresponding to all the keywords in a descending order to obtain a keyword importance sequence;
selecting N keywords corresponding to the importance indexes of N top-ranked importance indexes from the keyword importance sequences;
determining indexes corresponding to the N keywords respectively according to a prestored index and keyword comparison table;
taking indexes corresponding to the N keywords as key indexes concerned by the report reader;
determining the importance index corresponding to each keyword according to the number of times of occurrence of each keyword in the audio file, the volume corresponding to each keyword in each occurrence and whether each keyword occurs in the preset time period, wherein the determining the importance index corresponding to each keyword comprises:
calculating the importance index corresponding to the current keyword according to the following formula:
dj = α1·fj + α2·(Rj / Σi=1..N Ri) + α3·(Xj / maxi=1..N Xi)
wherein dj is the importance index of the jth keyword; α1 is a preset first weight coefficient, whose value is greater than 0 and less than or equal to 1; fj indicates whether the jth keyword appears in the preset time period, taking the value 1 when it does and 0 when it does not; α2 is a preset second weight coefficient, whose value is greater than 0 and less than or equal to 1; Ri is the number of times the ith keyword appears in the audio file, i = 1, 2, 3 … N; N is the number of all keywords appearing in the audio file; Rj is the total number of occurrences of the jth keyword in the audio file; α3 is a preset third weight coefficient, whose value is greater than 0 and less than or equal to 1; Xi is the maximum volume of the ith keyword in the audio file, and Xj is the maximum volume of the jth keyword in the audio file; wherein α1+α2+α3=1.
4. The method of claim 1,
screening out the preset objects containing the image identifier from the database as the index data corresponding to the target index includes the following steps B31-B35:
step B31, acquiring employee identification of the report reader, wherein the employee identification is used for indicating the job level of the report reader in the work institution where the report reader is located;
step B32, determining, according to a first correspondence table between each employee and the managed objects preset by the work institution, the managed object corresponding to the report reader based on the employee identifier of the report reader; the managed object may be a person, an object, or both;
step B33, determining, according to a second correspondence table between each managed object and its image identifier preset by the work institution, the image identifier corresponding to the managed object corresponding to the report reader;
step B34, judging whether an association exists between the image identifier corresponding to the managed object corresponding to the report reader and the key image identifier;
when the image identifier corresponding to the managed object is a first character feature and the key image identifier includes a second character feature, judging whether the similarity between the two character features is equal to or greater than a first preset similarity, and if so, judging that the two character features have an association relationship;
when the image identifier corresponding to the managed object is the identification feature of a first object and the key image identifier includes the identification feature of a second object, judging whether the similarity between the identification features of the two objects is equal to or greater than a second preset similarity, and if so, judging that the two objects have an association relationship;
when the image identifier corresponding to the managed object is a character feature and the key image identifier includes the identification feature of an object, analyzing, through big data analysis of the images in the pre-stored image database, whether the character feature and the identification feature of the object appear simultaneously in a preset number of images, and if so, judging that the character feature and the identification feature of the object have an association relationship;
when the image identifier corresponding to the managed object is the identification feature of an object and the key image identifier includes a character feature, analyzing, through big data analysis of the images in the pre-stored image database, whether the character feature and the identification feature of the object appear simultaneously in a preset number of images, and if so, judging that the character feature and the identification feature of the object have an association relationship;
step B35, when the judgment result of step B34 is that an association exists, adding the image identifier corresponding to the managed object corresponding to the report reader into the key image identifier to form an adjusted key image identifier; and screening out, from the database, the photos, pictures, photographs or images containing the image corresponding to the adjusted key image identifier as the index data corresponding to the target index;
when the judgment result of step B34 is that no association exists, keeping the key image identifier unchanged, and screening out, from the database, the photos, pictures, photographs or images containing the image corresponding to the key image identifier as the index data corresponding to the target index.
5. The method of claim 1,
when the index attribute identifier corresponding to the key index in the form report template is an image-class identifier, the image-class identifier consisting only of the two-character word 'photo', 'picture', 'photograph' or 'image':
the screening of the index data corresponding to the key indexes from the database comprises the following steps:
acquiring a key image identifier corresponding to a key index from an index identifier of the key index, wherein the key image identifier is an image identification identifier of a key object required to be contained in a photo, a picture, a photo or an image associated with the key index;
screening out a preset object containing an image corresponding to the key image identifier from a database, and taking the preset object as index data of the key index under the attribute of the image identifier; the preset object comprises a photo, a picture, a photo or an image.
6. The method of claim 5,
screening out the preset objects containing the key image identifier from the database as the index data of the key index under the image-class identifier attribute includes the following steps C31-C35:
step C31, acquiring the employee identifier of the report reader, wherein the employee identifier is used for indicating the job level of the report reader in the work institution where the report reader is located;
step C32, determining, according to a first correspondence table between each employee and the managed objects preset by the work institution, the managed object corresponding to the report reader based on the employee identifier of the report reader; the managed object may be a person, an object, or both;
step C33, determining, according to a second correspondence table between each managed object and its image identifier preset by the work institution, the image identifier corresponding to the managed object corresponding to the report reader;
step C34, judging whether an association exists between the image identifier corresponding to the managed object corresponding to the report reader and the key image identifier;
step C35, when the judgment result of step C34 is that an association exists, adding the image identifier corresponding to the managed object corresponding to the report reader into the key image identifier to form an adjusted key image identifier; and screening out, from the database, the photos, pictures, photographs or images containing the image corresponding to the adjusted key image identifier as the index data of the key index under the image-class identifier attribute;
when the judgment result of step C34 is that no association exists, keeping the key image identifier unchanged, and screening out, from the database, the photos, pictures, photographs or images containing the image corresponding to the key image identifier as the index data of the key index under the image-class identifier attribute.
7. An artificial intelligence based data report generating system, comprising:
the first determining module is used for determining key indexes concerned by the report reader;
the acquisition module is used for acquiring a form report template and writing the index identifier of the key index into an index identifier filling area to be filled in the form report template;
the second determining module is used for determining other indexes except the key indexes in the form type report template;
the first screening module is used for screening out index data corresponding to the key indexes from a database;
the second screening module is used for screening the index data corresponding to the other indexes from the database;
the import module is used for writing the index data corresponding to the key indexes into the index data import areas corresponding to the key indexes and writing the index data corresponding to other indexes into the index data import areas corresponding to other indexes, so as to generate a data report;
when the other indexes include a target index, the target index being an index whose index identifier consists only of the two-character word 'photo', 'picture', 'photograph' or 'image':
the screening out the index data corresponding to the other indexes from the database comprises the following steps:
determining whether the target indexes and the key indexes have an association relationship or not according to corresponding index data filling areas of the key indexes in the form type report template and corresponding index data filling areas of the target indexes in the form type report template;
when the target index and the key index have an association relationship, acquiring a key image identifier corresponding to the key index from the index identifier of the key index, wherein the key image identifier is an image identification identifier of a key object to be included in a photo, a picture, a photo or an image associated with the key index;
screening out a preset object containing an image corresponding to the key image identifier from a database, and taking the preset object as index data corresponding to the target index; the preset object comprises a photo, a picture, a photo or an image;
in this case, the writing of the index data corresponding to the other indexes into the index data import areas corresponding to the other indexes includes:
after writing the preset object containing the image corresponding to the key image identifier into the index data import area corresponding to the target index, modifying the index identifier of the target index in the form report template into a target identification character, wherein the target identification character is used for indicating that the data written into the index data import area corresponding to the target index is a photo, a picture, a photograph or an image related to the key index.
CN202010477594.0A 2020-05-29 2020-05-29 Data report generation method and system based on artificial intelligence Active CN111639200B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010477594.0A CN111639200B (en) 2020-05-29 2020-05-29 Data report generation method and system based on artificial intelligence


Publications (2)

Publication Number Publication Date
CN111639200A CN111639200A (en) 2020-09-08
CN111639200B true CN111639200B (en) 2021-05-25

Family

ID=72331609


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114004212B (en) * 2021-12-30 2022-03-25 深圳希施玛数据科技有限公司 Data processing method, device and storage medium

Citations (1)

Publication number Priority date Publication date Assignee Title
CN109299854A (en) * 2018-08-23 2019-02-01 深圳思锟软件有限公司 Identify industry intelligent management system

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
CN105069096A (en) * 2015-08-06 2015-11-18 厦门二五八集团有限公司 Architecture method and architecture system for user-defined worksheet
CN105657329B (en) * 2016-02-26 2018-11-20 苏州科达科技股份有限公司 Video conferencing system, processing unit and video-meeting method
CA2974401A1 (en) * 2016-07-25 2018-01-25 Bossanova Systems, Inc. A self-customizing, multi-tenanted mobile system and method for digitally gathering and disseminating real-time visual intelligence on utility asset damage enabling automated priority analysis and enhanced utility outage response
CN108366216A (en) * 2018-02-28 2018-08-03 深圳市爱影互联文化传播有限公司 TV news recording, record and transmission method, device and server
CN109657214A (en) * 2018-09-27 2019-04-19 深圳壹账通智能科技有限公司 Report form generation method, device, terminal and storage medium

Patent Citations (1)

Publication number Priority date Publication date Assignee Title
CN109299854A (en) * 2018-08-23 2019-02-01 深圳思锟软件有限公司 Identify industry intelligent management system


Similar Documents

Publication Publication Date Title
US10824874B2 (en) Method and apparatus for processing video
CN109325148A (en) The method and apparatus for generating information
CN109117777A (en) The method and apparatus for generating information
US10347250B2 (en) Utterance presentation device, utterance presentation method, and computer program product
US11567991B2 (en) Digital image classification and annotation
CN106021496A (en) Video search method and video search device
CN102855317B (en) A kind of multi-mode indexing means and system based on demonstration video
CN107436916B (en) Intelligent answer prompting method and device
CN111191022A (en) Method and device for generating short titles of commodities
CN109902670A (en) Data entry method and system
CN111223487B (en) Information processing method and electronic equipment
CN111639200B (en) Data report generation method and system based on artificial intelligence
CN112199932A (en) PPT generation method, device, computer-readable storage medium and processor
CN113949828B (en) Video editing method, device, electronic equipment and storage medium
CN111488813A (en) Video emotion marking method and device, electronic equipment and storage medium
JP2018169697A (en) Video data processing apparatus, video data processing method, and computer program
CN113297345B (en) Analysis report generation method, electronic equipment and related product
KR101440887B1 (en) Method and apparatus of recognizing business card using image and voice information
CN116738250A (en) Prompt text expansion method, device, electronic equipment and storage medium
CN113010725B (en) Musical instrument selection method, device, equipment and storage medium
JPWO2020071216A1 (en) Image search device, image search method and image search program
CN115168650B (en) Conference video retrieval method, device and storage medium
CN111782762A (en) Method and device for determining similar questions in question answering application and electronic equipment
CN110275988A (en) Obtain the method and device of picture
US20230394854A1 (en) Video-based chapter generation for a communication session

Legal Events

Date Code Title Description
PB01 Publication
CB02 Change of applicant information

Address after: 200237 9 / F and 10 / F, building 2, No. 188, Yizhou Road, Xuhui District, Shanghai

Applicant after: Shanghai squirrel classroom Artificial Intelligence Technology Co.,Ltd.

Address before: 200237 9 / F and 10 / F, building 2, No. 188, Yizhou Road, Xuhui District, Shanghai

Applicant before: SHANGHAI YIXUE EDUCATION TECHNOLOGY Co.,Ltd.

SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 200233 9 / F, 10 / F, building 2, 188 Yizhou Road, Xuhui District, Shanghai

Patentee after: SHANGHAI YIXUE EDUCATION TECHNOLOGY Co.,Ltd.

Address before: 9 / F and 10 / F, building 2, No. 188, Yizhou Road, Xuhui District, Shanghai, 200237

Patentee before: Shanghai squirrel classroom Artificial Intelligence Technology Co.,Ltd.
