Intelligent movie subtitle making method and system
Technical Field
The invention mainly relates to the technical field of movie subtitles, in particular to an intelligent movie subtitle making method and system.
Background
Speech recognition systems are now widely used in daily life, transportation, production and other fields. They greatly facilitate the recording, transmission and sharing of information, deliver considerable economic benefit, and make everyday life more convenient. As voice data has accumulated, making video captions with speech recognition has steadily overtaken traditional manual captioning in both speed and overall accuracy. A speech recognition system serves two functions here: first, voice transcription; second, synchronization of the voice and text information. An intelligent video caption making system provides basic voice transcription, proofreading and correction of the system text, synchronization of the voice and text time axes, and conversion of caption data formats, and finally outputs specific editing project files, caption data format files, standard videos with synthesized captions, and the like.
Intelligent subtitle making combines online and offline work and multi-party participation with the machine intelligent processing system, and sends progress messages in real time, so that the intelligently processed data can be corrected promptly and fed back to the artificial intelligence core module, which is thereby continuously corrected. Manual participation not only keeps optimizing and refining the data but, more importantly, ensures that messages flow smoothly through every link of the workflow, that states are updated promptly, and that possible problems are solved as soon as they appear. Most comparable intelligent subtitle services are provided in an unattended mode. Because the data files supplied by customers inevitably contain omissions, some faults caused by human negligence are unavoidable, and when abnormal downtime occurs the system cannot be repaired in time. As a result, such services cannot deliver truly dependable subtitle accuracy.
Disclosure of Invention
The invention mainly provides an intelligent movie and television subtitle making method and system, which are used for solving the technical problems in the background technology.
The technical scheme adopted by the invention for solving the technical problems is as follows:
An intelligent movie subtitle making method comprises the following steps: (S1) data collection, in which the network server srt_server connects to the ts database of the production order system and collects file state data of the auto_ts intelligent production system; (S2) data storage, in which the network client ts_client connects to the network server Ex_ts, receives the real-time data sent by Ex_ts, and stores it locally; (S3) data processing, in which the Power_ts intelligent production management control terminal receives the system data change notification and the system administrator schedules the order for further processing, the data processing comprising voice recognition, voice-text synchronization, text proofreading and format conversion.
Preferably, the intelligent voice subtitle making system includes a network server srt_server connected to the voice recognition system; a network client ts_client connected to the client local system; a ts_controller display processing module connected to the offline management system; a ts_staff extension module connected to the offline production system; and a ts_View display processing module connected to the offline supervisory control display system. The network server srt_server and the network client ts_client are both connected to the auto_ts intelligent production system, and the ts_controller display processing module is connected to the network client ts_client, the srt_server module and the ts_controller module respectively.
Preferably, the network server srt_server collects the instantaneous state of the platform production system and the order data, and provides the order processing results to the data receiving module to complete deep intelligent classification and data feature extraction; the order classification and identification extension module provides the application program with fine-grained data classification and packet file extraction of preliminary data information.
Preferably, the network client ts_client is the application end through which the caption user submits data and receives the caption result, and it submits real-time order data to the intelligent caption making platform auto_ts.
Preferably, the intelligent voice subtitle making system further comprises a ts_server module connected to the auto_ts intelligent production system. The ts_server module receives and stores key data, receives data attribute information from auto_ts, records and stores the processing information for each intelligent processing state, and stores the data on a hard disk for use by the auto_ts and ts_Viewer application programs.
Preferably, the network client ts_client can take the form of a webpage, an applet, an APP or a PC application.
Preferably, the network server srt_server and the network client ts_client are connected through the following steps: (a) the network client ts_client application logs in to auto_ts and tries to connect to the subtitle making service host; (b) the network server srt_server application starts and connects to the auto_ts system database, and the server creates a listening thread that waits for client network connections; (c) once the network client ts_client has connected, the network server srt_server creates a thread to wait for order data, the network client ts_client sends an audio or video file or an additional reference file, and the data in the current buffer is sent to the network client ts_client; (d) the network client ts_client stores the obtained data in the adata directory under the names adata/profiles/1 and adata/profiles/year-month-day/hour-minute; the system reads file attributes such as size and duration, generates order data and writes it to the auto_ts database, returns the data the client needs to display, and waits for the user to complete payment; after the network server srt_server receives the payment success message, it reads the auto_ts database to obtain the device ID of the person in charge of the caption making platform, sends that person a new order processing notification, exits the connection thread, and closes the connection.
Preferably, the ts_Viewer display processing module provides the user with data message interaction, statistics, historical data, task progress preview, and manual communication to adjust task requirements. For an order task that needs checking, it performs preliminary multi-level recognition processing or single-level or multi-level manual proofreading, puts the corresponding bound project requirement into the processing state, and accesses the ts_server system to import real-time data; the auto_ts platform sends real-time task progress messages to the associated users as mobile phone short messages or APP notifications.
Preferably, the intelligent voice subtitle making system further comprises a new order waiting task module connected to the auto_ts intelligent production system, and the ts_controller display processing module loads the layout of the production data of the intelligent subtitle making platform and loads the real-time data received from auto_ts.
Preferably, the new order waiting task module operates in the following three ways:
(I) clicking a new order in the layout loads the auto_ts intelligent speech-to-text server module to complete automatic task matching, reads all reference data in the order together with key information such as the format the customer requires and the corresponding speed grade, restores the state to the current task queue, and displays the order task processing progress graph;
(II) when the server has completed the intelligent caption processing result data, a dialog box pops up for choosing manual or machine checking and proofreading of the data files; the data is loaded for synchronizing the voice with the text time axis; finally, the current graph is updated with the new task processing progress data;
(III) after processing, checking, synchronization and format conversion, the data is packaged and submitted to the customer, and an order completion message is sent to remind the customer to download the result data from the platform.
Compared with the prior art, the invention has the beneficial effects that:
The invention greatly shortens the end-stage caption making cycle in the film and television industry and introduces the latest artificial intelligence solutions into film and television production. It can analyze historical data from the original traditional post-production system and display the data as charts or tables on a large screen, a web page, a mobile phone and the like for customer display, analysis and sharing, which leaves ample room for further data mining. The system takes full account of the production cycle of the film and television industry, the nature of the operators' work, stability requirements and other factors. It therefore focuses on the platform, on multi-terminal sharing, on multi-level deep processing with intelligent analysis, and on mutual checking between manual work and machine intelligence, and it applies current artificial intelligence in a distinctive way.
The present invention will be explained in detail below with reference to the drawings and specific embodiments.
Drawings
FIG. 1 is a schematic diagram of the system of the present invention;
FIG. 2 is a schematic diagram of a graphical display system identifying task lines and text in accordance with the present invention;
FIG. 3 is a text data processing flow diagram of the present invention;
FIG. 4 is a workflow diagram of the network server srt_server of the present invention;
FIG. 5 is a workflow diagram of the network client ts_client of the present invention;
FIG. 6 is a workflow diagram of the ts_controller display processing module of the present invention.
Detailed Description
To facilitate understanding, the invention is described more fully below with reference to the accompanying drawings, in which several embodiments of the invention are shown. The invention may, however, be embodied in different forms and is not limited to the embodiments described herein; rather, these embodiments are provided so that the disclosure of the invention is more thorough and complete.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the specification is for the purpose of describing particular embodiments only and is not intended to limit the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
Referring to FIGS. 1-6, an intelligent movie subtitle making method comprises the following steps: (S1) data collection, in which the network server srt_server connects to the ts database of the production order system and collects file state data of the auto_ts intelligent production system; (S2) data storage, in which the network client ts_client connects to the network server Ex_ts, receives the real-time data sent by Ex_ts, and stores it locally; (S3) data processing, in which the Power_ts intelligent production management control terminal receives the system data change notification and the system administrator schedules the order for further processing, the data processing comprising voice recognition, voice-text synchronization, text proofreading and format conversion.
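The format step at the end of (S3) can be illustrated with a minimal sketch that writes synchronized cues in the widely used SubRip (.srt) layout. The specification does not name the output format, so the choice of SubRip here, and the names Cue, format_timestamp and to_srt, are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    """One synchronized caption: start and end in milliseconds plus its text."""
    start_ms: int
    end_ms: int
    text: str


def format_timestamp(ms: int) -> str:
    """Render milliseconds as the SubRip timestamp HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(cues: List[Cue]) -> str:
    """Serialize cues as numbered SubRip blocks separated by blank lines."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        timing = f"{format_timestamp(cue.start_ms)} --> {format_timestamp(cue.end_ms)}"
        blocks.append(f"{index}\n{timing}\n{cue.text}\n")
    return "\n".join(blocks)
```

Calling to_srt on the proofread, time-aligned cues would yield text that can be saved as a caption data format file; other target formats would need their own serializer.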
The intelligent voice subtitle making system comprises a network server srt_server connected to the voice recognition system; a network client ts_client connected to the client local system; a ts_controller display processing module connected to the offline management system; a ts_staff extension module connected to the offline production system; and a ts_View display processing module connected to the offline supervisory control display system. The network server srt_server and the network client ts_client are both connected to the auto_ts intelligent production system, and the ts_controller display processing module is connected to the network client ts_client, the srt_server module and the ts_controller module respectively.
The network server srt_server is an external interface program that uses a fixed TCP port. It is the data processing center and the interactive interface between the external intelligent voice recognition system and the data receiving module. It collects the instantaneous state of the platform production system and the order data, and provides the order processing results to the data receiving module to complete deep intelligent classification and data feature extraction; it also provides the application program with fine-grained data classification and packet file extraction of preliminary data information. The order classification and identification extension module is responsible for increasing the intelligent processing speed. For convenient calling, the classification extension module can generate a number of key grouping nodes, which allow orders to be classified quickly and accurately by specific customer or by specific file size, thereby greatly improving the accuracy of intelligent recognition.
The network client ts_client is the application end through which the caption user submits data and receives the caption result, and it submits real-time order data to the intelligent caption making platform auto_ts.
The intelligent voice subtitle making system further comprises a ts_server module connected to the auto_ts intelligent production system. The ts_server module receives and stores key data, receives data attribute information from auto_ts, records and stores the processing information for each intelligent processing state, and stores the data on a hard disk for use by the auto_ts and ts_Viewer application programs.
The network client ts_client can take the form of a webpage, an applet, an APP or a PC application.
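The grouping of orders by specific customer or by file size can be sketched as follows. The size thresholds, the group key naming and the order dictionary fields are hypothetical; the specification does not define how the key grouping nodes are represented.

```python
from collections import defaultdict

# Hypothetical size thresholds in bytes; the specification gives no values.
SIZE_TIERS = [(50 * 1024**2, "small"), (500 * 1024**2, "medium")]


def size_tier(size_bytes):
    """Return the name of the first tier whose upper limit exceeds the size."""
    for limit, name in SIZE_TIERS:
        if size_bytes < limit:
            return name
    return "large"


def group_orders(orders, priority_customers=frozenset()):
    """Assign each order to a grouping node.

    Orders from listed customers get a per-customer node; all other orders
    are grouped by file size tier.
    """
    groups = defaultdict(list)
    for order in orders:
        if order["customer"] in priority_customers:
            key = f"customer:{order['customer']}"
        else:
            key = f"size:{size_tier(order['size_bytes'])}"
        groups[key].append(order["order_id"])
    return dict(groups)
```

Each resulting group could then be routed to a recognition configuration suited to that customer or file size, which is one plausible reading of how grouping speeds up processing.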
The network server srt_server and the network client ts_client are connected through the following steps: (a) the network client ts_client application logs in to auto_ts and tries to connect to the subtitle making service host; (b) the network server srt_server application starts and connects to the auto_ts system database, and the server creates a listening thread that waits for client network connections; (c) once the network client ts_client has connected, the network server srt_server creates a thread to wait for order data, the network client ts_client sends an audio or video file or an additional reference file, and the data in the current buffer is sent to the network client ts_client; (d) the network client ts_client stores the obtained data in the adata directory under the names adata/profiles/1 and adata/profiles/year-month-day/hour-minute; the system reads file attributes such as size and duration, generates order data and writes it to the auto_ts database, returns the data the client needs to display, and waits for the user to complete payment; after the network server srt_server receives the payment success message, it reads the auto_ts database to obtain the device ID of the person in charge of the caption making platform, sends that person a new order processing notification, exits the connection thread, and closes the connection.
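Steps (b) through (d) can be sketched with standard TCP sockets and one worker thread per connection. This is a simplified illustration under stated assumptions: the receiving side stores the upload, an in-memory list stands in for the auto_ts database, payment and notification are omitted, and the function names, the wire protocol (raw bytes followed by a half-close) and the "awaiting_payment" status are hypothetical.

```python
import socket
import threading
import uuid
from datetime import datetime
from pathlib import Path


def storage_path(root: Path, received_at: datetime, filename: str) -> Path:
    """Build adata/profiles/year-month-day/hour-minute/<filename> under root."""
    day = received_at.strftime("%Y-%m-%d")
    minute = received_at.strftime("%H-%M")
    return root / "adata" / "profiles" / day / minute / filename


def handle_client(conn: socket.socket, root: Path, orders: list) -> None:
    """Receive one upload, store it, record an order, and reply with its size."""
    with conn:
        chunks = []
        while True:
            data = conn.recv(65536)
            if not data:  # the client half-closed: upload complete
                break
            chunks.append(data)
        payload = b"".join(chunks)
        target = storage_path(root, datetime.now(), f"{uuid.uuid4().hex}.bin")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(payload)
        # Stand-in for writing order data to the auto_ts database.
        order = {"path": str(target), "size_bytes": len(payload),
                 "status": "awaiting_payment"}
        orders.append(order)
        conn.sendall(str(order["size_bytes"]).encode())


def serve(server_sock: socket.socket, root: Path, orders: list, max_clients: int) -> None:
    """Listening loop: accept connections and start one worker thread for each."""
    workers = []
    for _ in range(max_clients):
        conn, _addr = server_sock.accept()
        worker = threading.Thread(target=handle_client, args=(conn, root, orders))
        worker.start()
        workers.append(worker)
    for worker in workers:
        worker.join()


def send_file(host: str, port: int, payload: bytes) -> int:
    """Client side: upload the payload and return the size the server reports."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(payload)
        sock.shutdown(socket.SHUT_WR)  # signal end of upload
        return int(sock.recv(64).decode())
```

A real deployment would also read media attributes such as duration, which needs a media library and is left out of this sketch.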
The ts_Viewer display processing module provides the user with data message interaction, statistics, historical data, task progress preview, and manual communication to adjust task requirements. For an order task that needs checking, it performs preliminary multi-level recognition processing or single-level or multi-level manual proofreading, puts the corresponding bound project requirement into the processing state, and accesses the ts_server system to import real-time data; the auto_ts platform sends real-time task progress messages to the associated users as mobile phone short messages or APP notifications. The ts_Viewer display processing module first imports from the auto_ts system text data that is verified as correct and usable without errors. Text conversion of the data is carried out by the network server interface txt_ts, since only uniform text and uniform encoding can keep the whole intelligent recognition process smooth and stable. The data is then passed through an encoding and checking port to the network client ts_txt interface of the large screen system, and the network client ts_client authorizes the data and confirms the text. The ts_Viewer display processing module restores the state by reading the history file stored by the network client ts_client and by loading and parsing the ts_txt graph, the txt_data graph provided by the auto_server extension module, and the like, and provides the restored state to srt_server for deep processing of key data.
The ts_Viewer display processing module can display the processing progress and results of order tasks and associates the feedback data generated along the way with the auto_ts real-time database. On a large screen, ts_Viewer supports routine operations such as data sorting, order searching, conditional scheduling, and cost and bonus statistics. It provides display schemes adapted to different display terminals, responds quickly, and shows rich order task data. When a task processing progress graph is opened, the processing data for each stage of the speech recognition progress can be looked up from the layout, a chart or table can be generated directly, and the information can be shared across terminals in a universal data format. The ts_Viewer display terminal automatically feeds auto_ts platform data back to the real-time ts database, and when a platform technician or an order customer browses orders, a status data analysis graph pops up automatically.
The intelligent voice subtitle making system further comprises a new order waiting task module connected to the auto_ts intelligent production system, and the ts_controller display processing module loads the layout of the production data of the intelligent subtitle making platform and loads the real-time data received from auto_ts.
The new order waiting task module operates in the following three ways:
(I) clicking a new order in the layout loads the auto_ts intelligent speech-to-text server module to complete automatic task matching, reads all reference data in the order together with key information such as the format the customer requires and the corresponding speed grade, restores the state to the current task queue, and displays the order task processing progress graph;
(II) when the server has completed the intelligent caption processing result data, a dialog box pops up for choosing manual or machine checking and proofreading of the data files; the data is loaded for synchronizing the voice with the text time axis; finally, the current graph is updated with the new task processing progress data;
(III) after processing, checking, synchronization and format conversion, the data is packaged and submitted to the customer, and an order completion message is sent to remind the customer to download the result data from the platform.
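The three ways above amount to an order moving through a fixed sequence of stages, with a progress message at each transition. A minimal sketch, assuming the stage names below (they are not taken from the specification) and a caller-supplied notify callback standing in for the short message or APP notification:

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical stage names for one subtitle order, in processing order."""
    NEW = "new"
    RECOGNIZED = "recognized"
    PROOFREAD = "proofread"
    SYNCED = "synced"
    CONVERTED = "converted"
    DELIVERED = "delivered"


STAGE_ORDER = list(Stage)


def advance(stage: Stage) -> Stage:
    """Return the stage that follows the given one."""
    position = STAGE_ORDER.index(stage)
    if position == len(STAGE_ORDER) - 1:
        raise ValueError("order has already been delivered")
    return STAGE_ORDER[position + 1]


def run_order(order_id: int, notify) -> list:
    """Walk an order through every stage, notifying after each transition."""
    stage = Stage.NEW
    history = [stage]
    while stage is not Stage.DELIVERED:
        stage = advance(stage)
        history.append(stage)
        notify(order_id, stage)
    return history
```

In a real system each transition would be triggered by the corresponding work finishing (recognition, proofreading, and so on) rather than by a loop; the loop only shows the ordering and the per-stage notification.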
The auto_ts display of the present invention uses two sets of boundary processing. The general access point uses a frame-accurate processing mode, which suits programs with strict timing requirements; for education, children's learning, emotional drama and the like, however, frame-accurate boundaries feel noticeably stiff. The second processing mode is a relaxed mode, which suitably prolongs specific passages to ensure good visual feedback and responsiveness. The second intelligent recognition buffer of the system is not placed directly in the interactive display; it is created in the system main processing module and then drawn to the ts_controller window by the system main management group using bit-block transfer, which solves the problem that a graphic system with multi-terminal, multi-type tasks cannot be displayed on a large screen. Meanwhile, most audio and video files provided by customers are in formats such as mp3, mov and mpeg, but srt_server needs them converted to wav for further deep recognition. When a preprocessed file is loaded, type conversion is usually performed with functions provided by the runtime library for reading order files, but this is very inefficient. After repeated iterative upgrades, audio2wav is now highly efficient; it serves on the srt_server side as the basic format conversion task for intelligent speech recognition processing, and using audio2wav brings a steady improvement in system performance. The invention can display not only conventional industry files such as those in wav and mp3 format, but can also bind images and attachments such as BMP, JPEG, ppt, pages, numbers and mp4, and the ts_Viewer display processing module can perform deep data mining and automatic adaptation on the bound information and issue prompts for the recognized order.
Moving the mouse to a selected node displays a preview of the result accuracy; double-clicking opens the final result file in a new window, and a file can also be dragged directly onto the window to open the project file in which the result accuracy can be checked. Under fully automatic processing for the current recognition classification, whether each recognition task line and character in the graphic display system is visible is calculated and determined before intelligent processing, and invisible items are not sent for task correction. This realizes classification and extension of intelligent recognition and improves processing and platform running speed. Dialects, foreign languages and songs are usually identified by atat_srt as intelligent tasks, and the system can process such data in srt_srt_txt according to the specific language and type. The txt srt-txt module completes feature conversion and then performs regression again on the recognition result to complete the special task processing, which avoids the high failure risk of recognizing with multiple models, greatly facilitates system operation, and reduces the manual workload by about one to one and a half. The invention also allows order processing data to be opened or downloaded directly, and the mobile phone or mailbox of a designated message contact to be bound by double-clicking in the layout. Progress data files can be displayed or viewed directly on other system platforms, and historical data can be browsed and displayed on the auto_ts large screen. In operation, the user selects the task for a given date from the menu and then selects the corresponding historical data file version node.
Meanwhile, operations such as task cancellation and suspension can be performed in the ts_Viewer graph: right-clicking an order in the graph cancels or suspends its voice recognition or proofreading process, and each such operation is recorded in the database and displayed and updated in real time.
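The two boundary processing modes can be sketched on cues given as (start_ms, end_ms, text) tuples. The frame rate, the padding value and the rule that a prolonged cue must not run into the next one are assumptions for illustration; the specification does not state how far a passage is prolonged.

```python
def snap_to_frame(ms: int, fps: float) -> int:
    """Move a timestamp to the nearest frame boundary at the given frame rate."""
    frame = round(ms * fps / 1000)
    return round(frame * 1000 / fps)


def frame_accurate(cues, fps):
    """Frame-accurate mode: align both ends of every cue to frame boundaries."""
    return [(snap_to_frame(start, fps), snap_to_frame(end, fps), text)
            for start, end, text in cues]


def relaxed(cues, pad_ms, min_gap_ms=0):
    """Relaxed mode: prolong each cue by pad_ms without overlapping the next cue."""
    result = []
    for i, (start, end, text) in enumerate(cues):
        new_end = end + pad_ms
        if i + 1 < len(cues):
            next_start = cues[i + 1][0]
            new_end = min(new_end, next_start - min_gap_ms)
        # Never shorten a cue, even when the next one starts very soon.
        result.append((start, max(end, new_end), text))
    return result
```

For example, at 25 frames per second a cue ending at 1990 ms snaps to 2000 ms in frame-accurate mode, while relaxed mode with a 500 ms pad lets a cue linger until the next one begins.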
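Recording cancel and suspend operations so that the display can be refreshed from the database can be sketched with SQLite from the Python standard library. The table name task_ops, its columns and the derived state names are hypothetical; the specification does not describe the database schema.

```python
import sqlite3


def open_log(path=":memory:"):
    """Open the operation log and create the (hypothetical) task_ops table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS task_ops (
            id INTEGER PRIMARY KEY,
            order_id INTEGER NOT NULL,
            action TEXT NOT NULL CHECK (action IN ('cancel', 'suspend')),
            at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    return conn


def record_op(conn, order_id, action):
    """Append one cancel or suspend operation; the row is committed on success."""
    with conn:
        conn.execute(
            "INSERT INTO task_ops (order_id, action) VALUES (?, ?)",
            (order_id, action),
        )


def current_state(conn, order_id):
    """Derive the state to display from the most recent recorded operation."""
    row = conn.execute(
        "SELECT action FROM task_ops WHERE order_id = ? ORDER BY id DESC LIMIT 1",
        (order_id,),
    ).fetchone()
    if row is None:
        return "running"
    return {"cancel": "cancelled", "suspend": "suspended"}[row[0]]
```

A viewer could poll current_state, or be notified after record_op, to keep the graph in step with the database.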
The invention has been described above with reference to the accompanying drawings. Obviously, the invention is not limited to the embodiments described above; insubstantial modifications of the inventive concept and solution, or direct application of the inventive concept and solution to other situations without modification, all fall within the scope of protection of the invention.