Gonsin Conference Equipment Co., LTD.

Products

ASR Automatic Speech Recognition System

The GONSIN Automatic Speech Recognition System suits a range of application scenarios, including meeting minutes, training records, real-time speech subtitles, interview transcription, and real-time court trial records. It can merge the text and voice recordings of each role, generate meeting minutes, and export text. The system supports cloud server rental deployment and local server LAN deployment, as well as artificial intelligence learning and continuous system optimization.

Speech To Text Software
Asr Automatic Speech Recognition

As a new development in modern conference solutions, the automatic speech recognition (ASR) system brings a more intelligent human-computer interaction experience. Communication by sound and video alone, as in traditional conferences, no longer satisfies modern conference needs. In addition, document processing, meeting minutes, and the legal procedures of some users require the proceedings to be available in written form after the meeting. The Gonsin Automatic Speech Recognition System transcribes speech into text in real time, completely and in order, and ensures that the text corresponds to each delegate's speech. The transcribed text can be displayed in real time on a large screen as well as on the Gonsin paperless conference system.


The ASR system suits a range of application scenarios, including meeting minutes, training records, real-time speech subtitles, interview transcription, and real-time court trial records.




Gonsin Automatic Voice Recognition System Solutions

GONSIN ASR System offers three solutions: Online Speech Recognition and Translation Solution, Lightweight Private Deployment Solution, and Conference Room Cluster Private Deployment Solution.


  • Online Speech Recognition and Translation Solution

  • Lightweight Private Deployment Solution

  • Conference Room Cluster Private Deployment Solution


Automatic Speech Recognition System Advantages

The GONSIN Automatic Speech Recognition software system is built on GONSIN's fully digital conference speech recognition technology. It connects networked audio data to the ASR back end and, together with the ASR engine and GONSIN application software, transcribes speech into text in real time.

Automatic Speech Recognition System

Gonsin Automatic Speech Recognition Software V7.1.0

Basic Functions

  • Supports docking with either public cloud or private cloud speech servers to suit different server deployment methods. The software can be installed on a PC or on a speech recognition server, so it adapts to a variety of application scenarios.

  • Supports ASR server shutdown management, ASR server and discussion system connection and search, and custom microphone roles. Docks seamlessly with each series of GONSIN discussion systems for conference management, role separation, and automatic identification.

  • Supports personnel and equipment management, including equipment search, display of unit number and IP address information, and personnel name settings. Supports editing of meeting information, including creating a meeting name and defining the meeting time, location, and content.

  • Supports simultaneous recognition of multiple microphone roles with an anti-crosstalk function, which effectively prevents crosstalk when several microphones are recognized at the same time. Supports microphone status prompts that show in real time whether each microphone is on or off.

  • Supports language model learning: common words such as names of people and places can be imported to train the language model.

  • Supports automatic identification of participants' roles and automatic recognition of participants' speech, with transcription into text. The software also supports translation into other required languages (software functions vary according to engine capabilities).

  • Supports intelligent semantic understanding: the software interprets what participants say and breaks the text into sentences and paragraphs accordingly. Consecutive spoken numbers, such as cell phone numbers and ID card numbers, are automatically recognized and converted to Arabic numerals.

  • Supports editing and correction of meeting text. The software can generate separate recording files for different roles, or merge the text records and recordings of all roles. Voice and text records can be played back and displayed in sync so that the document can be checked against the audio.

  • Supports meeting record output: merging text, generating meeting minutes, and exporting text.

  • Supports text content search. Searching by keyword quickly locates the corresponding content, which greatly improves retrieval efficiency.

  • Supports split-screen text output. When installed on a PC, the software displays the transcribed text in real time on the main screen of the operating computer and can extend it to a second screen. The display is customizable: the resolution adapts automatically, and the text font and size can be set.

  • Supports recognition of recording files: imported recordings are automatically converted into text. Supported formats include mp3 and wav, among others.

  • Supports selection of the audio input device: audio from any input device connected to the computer can be transcribed into text in real time.

  • Supports recognizing the audio currently being played on the computer and automatically converting it to text.

  • Supports further customization: the software can switch between Chinese and English, as well as other custom languages. Secondary development is supported, either through an open interface protocol or through customized development, according to project requirements.
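
The merging of per-role text records into meeting minutes described above can be pictured with a short sketch. This is only an illustration under stated assumptions, not GONSIN's implementation: the `Segment` class, its field names, and the `merge_minutes` function are hypothetical, and the timestamp format is chosen arbitrarily.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    """One transcribed utterance. All field names are illustrative assumptions."""
    start_s: float  # start time in seconds from the beginning of the meeting
    role: str       # microphone role, for example a delegate's name
    text: str       # recognized text for this utterance


def merge_minutes(segments_by_role: Dict[str, List[Segment]]) -> str:
    """Merge per-role segment lists into one chronological transcript."""
    # Flatten the per-role lists into a single list of segments.
    merged = [seg for segs in segments_by_role.values() for seg in segs]
    # Order by start time so the minutes follow the flow of the meeting.
    merged.sort(key=lambda seg: seg.start_s)
    lines = []
    for seg in merged:
        minutes, seconds = divmod(int(seg.start_s), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {seg.role}: {seg.text}")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = {
        "Chair": [
            Segment(0.0, "Chair", "Welcome, everyone."),
            Segment(65.0, "Chair", "Thank you."),
        ],
        "Delegate A": [Segment(12.5, "Delegate A", "Good morning.")],
    }
    print(merge_minutes(demo))
```

Keeping each role's segments separate until export mirrors the option of generating either per-role records or a merged record; the merge itself is just a sort on the start time.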


Technical Parameters

System: Win7 / Win8 / Win10 operating system, 32 / 64 bit
CPU: i7 or above
Hard disk capacity: 500GB or above
Memory capacity: 16GB or above
Graphics card: Independent graphics card with VGA / HDMI / DVI interface and split-screen display support
PC interface: 1×RS-232 interface and 2×RJ45 interfaces
Resolution: Self-adaptive
PC communication: Ethernet / RS-232


Gonsin Automatic Speech Recognition Subtitle Display Software V7.1.0

Basic Functions

  • Good system compatibility: supports subtitle display on Windows and Android devices.

  • Supports multiple subtitle display modes: full-screen mode and pop-up (barrage) mode.

  • Full-screen mode: displays the transcription content in full screen in the form of a dialog box. Supports background and font settings.

  • Barrage mode: displays the transcription content in a floating barrage style. Supports line and font settings.

  • Supports video subtitle overlay: real-time subtitles are overlaid on the video screen, integrating with video conferencing and camera tracking applications.

  • Supports paperless subtitle overlay: real-time subtitles are overlaid on paperless screens, integrating with paperless systems and displaying transcribed text in real time on paperless terminals.

Lightweight Intelligent Automatic Speech Recognition Server

Basic Functions

  • Works with the intelligent voice recognition software to provide Web access management.

  • Supports automatic recognition of participants' roles and automatic recognition of participants' speech, with transcription into text.

  • The built-in ASR Engine adopts industry-leading online speech recognition technology, deployed through the cloud to provide speech recognition services for local speech. It offers low latency and high recognition accuracy, with an accuracy rate that can exceed 99%.

  • The speech recognition server transcribes a different number of speech channels depending on the model:

  • GX-AS201: supports 1-way speech recognition capability

  • GX-AS202: supports 2-way speech recognition capability

  • GX-AS205: supports 5-way speech recognition capability

  • GX-AS208: supports 8-way speech recognition capability

  • Support customized language recognition, such as Chinese, English, Spanish, Arabic, Russian and French.

  • Supports recognition in multiple application scenarios: education, judicial, medical, conference speech, news media, entertainment video, smart home, social, automotive, and so on.

  • Supports sharing one server among multiple conference rooms. The conference rooms in a conference center can form a LAN with a centrally deployed server that handles parallel speech recognition and transcription for all of them.

  • Works with the intelligent speech recognition subtitle display software to provide subtitle display for conferences.
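
As a rough aid for choosing among the models listed above, the sketch below picks the smallest model whose channel count covers a requirement. It assumes that "N-way" means N simultaneous speech channels; the `MODEL_CHANNELS` table and the `pick_model` function are hypothetical names introduced only for this illustration.

```python
# Channel counts as listed above for the lightweight server models.
MODEL_CHANNELS = {
    "GX-AS201": 1,
    "GX-AS202": 2,
    "GX-AS205": 5,
    "GX-AS208": 8,
}


def pick_model(required_channels: int) -> str:
    """Return the smallest listed model that covers the required channels."""
    if required_channels < 1:
        raise ValueError("at least one channel is required")
    # Walk the models from the fewest channels to the most.
    for model, channels in sorted(MODEL_CHANNELS.items(), key=lambda kv: kv[1]):
        if channels >= required_channels:
            return model
    raise ValueError("no single lightweight model covers this many channels")


if __name__ == "__main__":
    for needed in (1, 3, 8):
        print(needed, "->", pick_model(needed))
```

A requirement above 8 channels raises an error here rather than guessing; for larger installations the page describes the rack-mounted GX-AS301 server separately.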

Technical Parameters

Model: GX-AS201 / GX-AS202 / GX-AS205 / GX-AS208
System version: CentOS 7.4+
CPU: i3 / i7 (depending on model)
Memory capacity: 16G / 32G (depending on model)
Hard disk: 256G SSD / 500G SSD (depending on model)
Front panel interface: 4×USB2.0 Type-A, 1×3.5mm Line out, 1×3.5mm Mic in, 1×Power button, 1×Power LED
Rear panel interface: 4×USB3.0 Type-A, 1×RJ45 10/100/1000M, 1×HDMI 1.4 out, 1×COM out, 1×3.5mm Line out, 1×3.5mm Mic in, 1×WIFI/BT ANT
Power input: 19V DC
Operating temperature: -5°C~45°C
Storage temperature: -20°C~60°C
Dimensions: 210 (L) × 210 (W) × 56 (H) mm


ASR Automatic Speech Recognition Server GX-AS301

Basic Functions

  • 2U standard rack-mounted server with stable and reliable performance. The chassis uses SGCC galvanized steel plate with environmentally friendly exterior paint, and is resistant to fingerprints and to 4kV contact strong magnetic interference.

  • Uses a high-performance Linux server with ASR Engine V3.0 software installed to automatically identify participants' roles, recognize their speech, and transcribe it into text.

  • Supports sharing one server among multiple conference rooms. The conference rooms in a conference center can form a LAN with a centrally deployed server that handles parallel speech recognition and transcription for all of them.

  • Works with the intelligent speech recognition subtitle display software to provide subtitle display for meetings.

  • Uses a high-efficiency CTC model. With optional licensing, a single server supports up to 50 concurrent recognition streams.

  • The server adopts an SSL encryption mechanism to protect sensitive information in storage and in transmission. RC4, MD5, and RSA algorithms are used to secure platform data and prevent leakage of important information.

  • Built-in embedded power management software monitors the voltage status to avoid equipment failure caused by voltage fluctuation, providing around-the-clock protection.


GONSIN Automatic Speech Recognition Engine V3.1/V3.2

  • Adopts industry-leading online speech recognition technology, deployed through the cloud to provide speech recognition services for local speech. It offers low latency and high recognition accuracy, with an accuracy rate that can exceed 99%.

  • The engine uses a package payment model, which lowers the upfront cost and entry barrier of speech recognition. Users can purchase a package of suitable duration according to how much speech recognition time they actually need (please purchase the package service in time to ensure normal use of the engine).

  • Support role-separated recognition: different original languages and translation languages can be selected according to different roles, so as to realize simultaneous recognition of multiple languages, transcription into corresponding text, and translation.

  • Support multiple major languages, such as Chinese, English, French, Russian, Arabic and Spanish.

  • Works with the intelligent speech recognition subtitle display software to show the original text and the translated text at the same time, or either one separately, providing subtitles for business negotiations and video conferences held in different languages.


GONSIN Automatic Speech Recognition Engine V3.0

  • Adopts an intelligent language recognition model and uses AI technology to achieve speech recognition

  • Support customized language recognition, such as Chinese, English, Spanish, Arabic, Russian and French

  • Support recognition in multiple application scenarios: education, judicial, medical, conference speech, news media, entertainment video, smart home, social, automotive and so on



How Does Automatic Speech Recognition Software Work?

Any automatic speech recognition software, regardless of its complexity, extracts and decomposes your words for analysis and response. The basic sequence of events is as follows:

1. You speak to the software through an audio input.

2. The speech recognizer you are speaking to generates a wave file of your words.

3. The waveform is cleaned by removing background noise and normalizing the volume.

4. The filtered waveform is decomposed into phonemes. (Phonemes are the basic sound units of a language and of word pronunciation. There are about 44 phonemes in English, made up of sound blocks such as "wh", "th", "ka", and "t".)

5. The phonemes are analyzed in sequence, like links in a chain, starting with the first. The recognizer uses statistical probability analysis to infer whole words, and from those, complete sentences.

6. The ASR software, which now "understands" your words, can respond to you in a meaningful way.
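
The cleaning stage in step 3 can be illustrated with a deliberately simple sketch. It assumes audio samples are floats in the range -1.0 to 1.0 and uses a crude amplitude gate plus peak normalization. Production recognizers typically use more sophisticated noise reduction, so the function names and thresholds below are illustrative assumptions rather than how any particular engine works.

```python
import math


def noise_gate(samples, threshold=0.02):
    """Zero out samples whose magnitude is below the threshold (crude noise removal)."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]


def normalize_peak(samples, target_peak=0.9):
    """Scale the signal so that its largest magnitude equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silent input: nothing to scale.
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


def clean(samples):
    """Apply the noise gate first, then normalize the volume."""
    return normalize_peak(noise_gate(samples))


if __name__ == "__main__":
    # A quiet 440 Hz tone with a small constant offset, sampled at 8 kHz.
    rate = 8000
    raw = [0.2 * math.sin(2 * math.pi * 440 * n / rate) + 0.005 for n in range(80)]
    cleaned = clean(raw)
    print("peak before:", max(abs(s) for s in raw))
    print("peak after:", max(abs(s) for s in cleaned))
```

Gating before normalizing matters in this sketch: if the order were reversed, the gain would amplify the low-level noise before the threshold was applied.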


Copyright © Gonsin Conference Equipment Co., LTD. All Rights Reserved.
The information and specifications included are subject to change without prior notice.