Products
GONSIN Automatic Speech Recognition System suits various application scenarios, including meeting minutes, training records, real-time speech subtitles, interview transcription, and real-time court trial records. It can merge the text and voice recordings of each role to generate meeting minutes, and it supports text export. The system supports cloud server rental deployment and local server LAN deployment, along with artificial intelligence learning and continuous system optimization.
As a new development in modern conference solutions, the automatic speech recognition (ASR) system brings a more intelligent human-computer interaction experience. In traditional conferences, 70% of the meeting information is received visually and only 30% by sound. Communication by sound and video alone can no longer satisfy modern conference needs. In addition, document processing, meeting minutes, and the legal procedures of specific users require the proceedings to be available as text after the meeting. Gonsin Automatic Speech Recognition System transcribes speech into text in real time, completely and in order, and ensures that the text corresponds to each delegate's speech. The transcribed text can be displayed in real time on a large screen as well as on the Gonsin paperless conference system.
ASR system suits various application scenarios, including meeting minutes, training records, real-time speech subtitles, interview records transcription, real-time court trial records, etc.
The ASR back end can be deployed in one of two modes, local server (LAN) or cloud platform, to meet different application requirements. Both modes must be paired with the automatic speech recognition module of the Gonsin intelligent conference management software.
Gonsin Automatic Speech Recognition system is built on Gonsin's full digital conference speech recognition technology. It feeds network audio data to the ASR back end and, with the support of the ASR and Gonsin application software, transcribes speech into text in real time.
The ASR software works together with a conference system, adapts to noisy environments, and provides clear sound pickup
Real-time speech recognition of each role to generate a separate voice recording file
The speech of each role is recognized and transcribed into text in real-time, and a separate text file is generated
The automatic speech recognition software can be used with the Gonsin 20000S series or Leader series conference system and supports multiple microphones active at the same time. The voice of each microphone is recognized in real time, and a separate voice recording file is generated and transcribed into text (the number of authorized voice transcription modules should match the number of simultaneously active microphones)
The ASR software can be used with the Gonsin Z4 Series conference system, which supports one active microphone. The voice of the microphone is recognized in real time, and a separate voice recording file is generated and transcribed into text
The ASR software can merge the text and voice recordings of each role to generate meeting minutes, and it supports text export
Intelligent semantic recognition and intelligent sentence segmentation based on semantics
Voice recordings and transcribed text can be played back synchronously and displayed side by side for comparison, enabling intelligent document correction
The software supports keyword retrieval, which locates the corresponding content quickly and greatly improves the efficiency of content retrieval
The software supports main-screen and split-screen display: transcribed text is shown in real time on the main screen of the operating computer and can be sent to the large-screen display system, with adaptive screen resolution
Transcribed text can be displayed on Gonsin paperless terminal in real-time
Supports conference center cluster deployment or local conference room deployment, with artificial intelligence learning and continuous system optimization
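As an illustration of the keyword retrieval and synchronized playback features listed above, here is a minimal Python sketch. It assumes each transcribed utterance is stored with its speaker role and its start offset in the recording; the `Segment` structure and the `find_keyword` helper are hypothetical names for this example and do not describe Gonsin's actual file format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One transcribed utterance (hypothetical structure, not Gonsin's format)."""
    role: str        # speaker role assigned to the microphone
    start_s: float   # offset into the voice recording, in seconds
    text: str        # transcribed text of the utterance


def find_keyword(segments, keyword):
    """Return the segments whose text contains the keyword, ignoring case."""
    needle = keyword.casefold()
    return [seg for seg in segments if needle in seg.text.casefold()]


transcript = [
    Segment("Chair", 0.0, "Welcome, let us review the budget."),
    Segment("Delegate A", 12.5, "The budget for training is unchanged."),
    Segment("Delegate B", 31.0, "I have a question about the schedule."),
]

for hit in find_keyword(transcript, "Budget"):
    # The start offset is where a player would seek to for synchronized playback.
    print(f"{hit.start_s:7.1f}s  {hit.role}: {hit.text}")
```

Keeping the start offset next to the text is what lets a keyword hit jump straight to the matching point in the recording.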
GONSIN Automatic Speech Recognition software system is built on GONSIN's full digital conference speech recognition technology. It feeds network audio data to the ASR back end and, with the support of the ASR and GONSIN application software, transcribes speech into text in real time. The ASR back end can be deployed in one of two modes, local server (LAN) or cloud platform, to meet different application requirements. Both modes must be paired with the automatic speech recognition module of the GONSIN intelligent conference management software.
Automatic Speech Recognition Module V7.1.0 (ASR) is the voice transcription module of conference management software V7.1.0 and provides the voice-to-text function. Before the meeting, assign each participant's conference unit to the corresponding role. During the meeting, the speech recognition module recognizes the audio stream of each conference unit in real time, generates a separate voice recording file and transcribed text file for each role synchronously, and presents them on the operating computer and the large-screen display. The result can also be saved as a text + voice meeting minutes file based on the configured template.
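The step from separate per-role transcripts to a single meeting minutes file can be sketched as a chronological merge. The Python example below is only an illustration under stated assumptions: each role's transcript is taken to be a list of `(start_seconds, text)` tuples already sorted by time, and the function names and the `[mm:ss] role: text` layout are invented here rather than taken from the Gonsin template.

```python
import heapq


def merge_minutes(per_role):
    """Merge per-role utterance lists into one chronological list.

    per_role maps a role name to a list of (start_seconds, text) tuples,
    each list already sorted by start time.
    """
    streams = (
        [(start, role, text) for start, text in utterances]
        for role, utterances in per_role.items()
    )
    # heapq.merge combines already-sorted inputs into one sorted sequence.
    return list(heapq.merge(*streams))


def format_minutes(entries):
    """Render merged entries as '[mm:ss] role: text' lines."""
    lines = []
    for start, role, text in entries:
        minutes, seconds = divmod(int(start), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {role}: {text}")
    return "\n".join(lines)


per_role = {
    "Chair": [(0.0, "The meeting is open."), (65.0, "Thank you, next item.")],
    "Delegate A": [(20.0, "I would like to report on training.")],
}

print(format_minutes(merge_minutes(per_role)))
```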
Real-time speech recognition of each role to generate a separate voice recording file.
The speech of each role is recognized and transcribed into text in real-time, and a separate text file is generated.
It can be used with the Gonsin 20000S series or Leader series conference system and supports multiple microphones active at the same time. The speech of each microphone is recognized in real time, and a separate voice recording file is generated and transcribed into text (the number of authorized voice transcription modules should match the number of simultaneously active microphones).
It can be used with the Gonsin Z4 Series conference system, which supports one active microphone. The voice of the microphone is recognized in real time, and a separate voice recording file is generated and transcribed into text.
It can merge the text and voice recordings of each role to generate meeting minutes, and it supports text export.
Intelligent semantic recognition and intelligent sentence segmentation based on semantics.
Voice recordings and transcribed text can be played back synchronously and displayed side by side for comparison, enabling intelligent document correction.
It supports keyword retrieval, which locates the corresponding content quickly and greatly improves the efficiency of content retrieval.
It supports main-screen and split-screen display: transcribed text is shown in real time on the main screen of the operating computer and can be sent to the large-screen display system, with adaptive screen resolution.
Transcribed text can be displayed on Gonsin paperless terminal in real-time.
Conference system management and setting (e.g. equipment search, terminal ID, sensitivity)
Conference information editing and management (conference content editing, personnel information setting, conference unit role setting, etc.)
Compatible with different Gonsin conference system series.
Supports screen customization, including editing of the visual interface (text font, color, pictures, and associated data), and fast switching between multiple interface styles.
The software supports secondary development; the interface protocol can be opened for customized development on request for a given project.
System | Windows 7 / 8 / 10, 32-bit or 64-bit |
CPU | i7 or above |
Hard disk capacity | 500 GB or above |
Memory capacity | 16 GB or above |
Graphics card | Independent graphics card with VGA / HDMI / DVI interface and split-screen display support |
PC interface | 1 × RS-232 interface and 2 × RJ45 interfaces |
Resolution | Self-adaptive |
PC communication | Ethernet / RS-232 |
The Lightweight Automatic Speech Recognition Server is an intelligent speech recognition computing product for small and medium-sized conferences. With its built-in Automatic Speech Recognition Engine, it offers fast transcription, a high recognition rate, easy deployment, and strong stability. It meets the need for voice-to-text transcription in meetings and produces effective meeting minutes. The server is also small, integrates easily with existing systems, and is convenient to use and carry, so it can be moved quickly between conference venues and shared among them. It is suitable for fixed venues, temporary venues, and rental meeting scenarios.
Built-in Automatic Speech Recognition Engine
Each ASR server model supports a different number of voice recognition channels
GX-AS201: Supports 1-channel voice recognition
GX-AS202: Supports 2-channel voice recognition
GX-AS205: Supports 5-channel voice recognition
GX-AS208: Supports 8-channel voice recognition
The leading single-pass large-scale language model decoding technology
Support 17 languages: Chinese, English, Japanese, Spanish, Arabic, Korean, Kazakh, Russian, French, Indonesian, Vietnamese, Filipino, Hindi, German, Italian, Malay, Thai
It can customize an industry recognition engine for finance, politics and law, medical treatment, education, etc.
When used with the GONSIN conference management system, it supports multiple microphones active at the same time. The voice of each microphone is recognized in real time, and a separate voice recording file is generated and transcribed into text.
Model | GX-AS201 | GX-AS202 | GX-AS205 | GX-AS208 |
System version | CentOS 7.4+ | |||
CPU | i3 | i7 | ||
Memory capacity | 16G | 32G | ||
Hard disk | 256G SSD | 500G SSD | ||
Front panel interface | 4×USB 2.0 Type-A, 1×3.5mm Line out, 1×3.5mm Mic in, 1×Power button, 1×Power LED | |||
Rear panel interface | 4×USB 3.0 Type-A, 1×RJ45 10/100/1000M, 1×HDMI 1.4 out, 1×COM out, 1×3.5mm Line out, 1×3.5mm Mic in, 1×WIFI/BT ANT | |||
Power input | 19V DC | |||
Operating temperature | -5°C~45°C | |||
Storage temperature | -20°C~60°C | |||
Dimensions | 210 (L) × 210 (W) × 56 (H) mm |
Industry-leading single pass large-scale language model decoding technology.
Support 17 languages: Chinese, English, Japanese, Spanish, Arabic, Korean, Kazakh, Russian, French, Indonesian, Vietnamese, Filipino, Hindi, German, Italian, Malay, Thai.
It can customize industry identification engines for finance, politics and law, medical treatment, education, etc.
The high-efficiency CTC model supports up to 50 simultaneous speech recognition channels, subject to optional authorization.
Supports centralized LAN deployment across multiple conference rooms in a conference center, allowing simultaneous voice transcription in several rooms.
When paired with Gonsin management software, speaker roles can be separated and identified.
The ASR Automatic Speech Recognition engine V3.0 software is installed and runs on the intelligent speech recognition server.
NO. | Name | Model | Qty | Unit | Function | Remarks |
A: Lightweight Privatization Deployment Scheme |
A.1 | Lightweight Automatic Speech Recognition Server | GX-AS201 | 1 | pcs | Built-in Automatic Speech Recognition Engine; supports 1-channel voice recognition | Supports voice transcription and automatic role-based recognition for Z4/10000N/20000S/30000S/Leader Series discussion systems. Note: 1 microphone active at the same time |
A.2 | Lightweight Automatic Speech Recognition Server | GX-AS202 | 1 | pcs | Built-in Automatic Speech Recognition Engine; supports 2-channel voice recognition | Supports voice transcription and automatic role-based recognition for 10000N/20000S/30000S/Leader Series discussion systems. Note: 2 microphones active at the same time |
A.3 | Lightweight Automatic Speech Recognition Server | GX-AS205 | 1 | pcs | Built-in Automatic Speech Recognition Engine; supports 5-channel voice recognition | Supports voice transcription and automatic role-based recognition for 10000N/20000S/30000S/Leader Series discussion systems. Note: 5 microphones active at the same time |
A.4 | Lightweight Automatic Speech Recognition Server | GX-AS208 | 1 | pcs | Built-in Automatic Speech Recognition Engine; supports 8-channel voice recognition | Supports voice transcription and automatic role-based recognition for Leader Series discussion systems. Note: 8 microphones active at the same time |
B: Conference Room Cluster (LAN) Privatization Deployment Scheme |
B.1 | Automatic Speech Recognition Engine | V3.0 | 1 | set | Automatic Speech Recognition Engine | Supports: 1. Voice recognition of multiple conference rooms in a LAN (cluster privatization deployment of conference center / conference room). 2. All GONSIN discussion system series. Note: at most 50 channels of voice transcription active at the same time (according to the number of authorized voice transcription modules) |
B.2 | Automatic Speech Recognition Server | GX-AS301 | 1 | pcs | Supports concurrent authorization of at most 50 voice recognition channels | |
B.3 | Voice Transcription Module Authorization | V1.0 | N | channel | According to the number of meeting rooms requiring simultaneous voice transcription in the LAN | |
C: ASR Application Terminal |
C.1 | Gonsin Intelligent Conference Management Software ASR Module | V7.1.0 (ASR) | 1 | set | - | Recognizes the audio stream of each conference unit, generates a separate voice recording file and transcribed text file for each role synchronously, and presents them on the operating computer and the large-screen display |
C.2 | Control Computer | Can be purchased by customer | 1 | pcs | - | Runs the Gonsin Intelligent Conference Management Software ASR Module, which recognizes the audio stream of each conference unit and generates a separate voice recording file and transcribed text file for each role synchronously |
D: Conference Discussion System |
Works with all GONSIN discussion system series; choose the product configuration according to need |
Note: Solution 1: A+C+D; Solution 2: B+C+D |
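To show how the configuration table might be applied, the following Python sketch picks a solution from the number of simultaneously active microphones and the discussion system series. The channel counts, supported series, 50-channel ceiling, and the one-microphone limit of the Z4 Series are copied from this page. The selection rule itself (smallest suitable lightweight server first, otherwise the LAN cluster scheme) and the name `choose_solution` are assumptions made for illustration, not a Gonsin sizing procedure.

```python
# Channel counts and supported series are copied from the configuration table.
LIGHTWEIGHT_SERVERS = [
    ("GX-AS201", 1, {"Z4", "10000N", "20000S", "30000S", "Leader"}),
    ("GX-AS202", 2, {"10000N", "20000S", "30000S", "Leader"}),
    ("GX-AS205", 5, {"10000N", "20000S", "30000S", "Leader"}),
    ("GX-AS208", 8, {"Leader"}),
]
CLUSTER_SERVER = "GX-AS301"
CLUSTER_MAX_CHANNELS = 50
# The Z4 Series supports one active microphone, per the feature list.
SERIES_MIC_LIMIT = {"Z4": 1}


def choose_solution(active_mics, series):
    """Return (solution, server_model) for the given requirement.

    Assumed rule: prefer the smallest lightweight server (Solution 1, A+C+D)
    that covers the microphones and the series; otherwise fall back to the
    LAN cluster scheme (Solution 2, B+C+D).
    """
    if active_mics < 1:
        raise ValueError("at least one active microphone is required")
    limit = SERIES_MIC_LIMIT.get(series)
    if limit is not None and active_mics > limit:
        raise ValueError(f"{series} supports at most {limit} active microphone(s)")
    for model, channels, supported in LIGHTWEIGHT_SERVERS:
        if active_mics <= channels and series in supported:
            return "A+C+D", model
    if active_mics <= CLUSTER_MAX_CHANNELS:
        return "B+C+D", CLUSTER_SERVER
    raise ValueError("more than 50 simultaneous channels is not supported")


print(choose_solution(4, "20000S"))
print(choose_solution(8, "20000S"))
```

Under this rule, eight active microphones on a 20000S system fall through to the cluster scheme, because the table lists the GX-AS208 for the Leader Series only.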
Any automated speech recognition software, regardless of its complexity, extracts and decomposes your words for analysis and response. The basic sequence of events is as follows:
1. You talk to the software via audio input.
2. The ASR software you are speaking to generates a waveform file of your words.
3. The waveform file is cleaned by removing background noise and normalizing the volume.
4. The filtered waveform is decomposed into so-called phonemes. (Phonemes are the basic components of language and word pronunciation. English has 44 of them, made up of sound units such as "wh", "th", "ka", and "t".)
5. The phonemes form a chain: starting with the first phoneme, the ASR speech recognizer analyzes them in sequence, using statistical probability analysis to infer whole words and then complete sentences.
6. Your ASR automatic speech recognition software, which now "understands" your words, can respond to you in a meaningful way.
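Step 3 of the sequence above, cleaning the waveform, can be illustrated with a toy Python sketch. It assumes the audio is a list of samples in the range -1.0 to 1.0 and uses a simple amplitude gate as a stand-in for background noise removal, followed by peak normalization. Production ASR front ends use far more sophisticated noise suppression; the function names and threshold values here are illustrative assumptions only.

```python
def noise_gate(samples, threshold=0.05):
    """Zero out samples quieter than the threshold (toy noise removal)."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]


def normalize_peak(samples, target_peak=0.9):
    """Scale the samples so the loudest one reaches target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence or empty input: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


raw = [0.01, -0.02, 0.3, -0.45, 0.02, 0.15]
cleaned = normalize_peak(noise_gate(raw))
print(cleaned)
```

Gating before normalizing keeps the threshold tied to the raw input level, so low-level background noise is not amplified along with the speech.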
Gonsin offers customized solutions for conference audio and video systems.