Do you add timestamps?
Yes, our software adds timestamps automatically. You can edit them afterwards in the online editor.
Is it possible to train the speech recognition on specific vocabulary?
Yes, and doing so makes the speech recognition more accurate. For more information, please contact us here.
Are there limitations on the number of files I can upload?
No, you can upload as many files as you would like.
How accurate is the speech recognition?
Our speech recognition software can deliver the highest level of accuracy on the market. To increase accuracy, you can request a customized engine, which includes specific terms, accents or vocabulary. To find out more, please contact us here.
How does the software recognize different speakers and times at which they speak?
We use several techniques for speaker and time recognition. Our standard solutions include x-vector diarization and 2-channel diarization.
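To illustrate the idea behind 2-channel diarization, here is a minimal sketch. It assumes each speaker is recorded on their own channel, so the louder channel in a given time frame indicates who is speaking. This is not Amberscript's implementation: the function name, the speaker labels, the frame length and the energy threshold are all hypothetical choices made for the example, and x-vector diarization is not shown.

```python
from typing import List, Tuple

# (speaker label, start in seconds, end in seconds); labels are hypothetical.
Segment = Tuple[str, float, float]


def two_channel_segments(
    left: List[float],
    right: List[float],
    sample_rate: int,
    frame_seconds: float = 0.5,
    energy_threshold: float = 0.01,
) -> List[Segment]:
    """Label each frame with the louder channel and merge consecutive frames."""
    frame_size = max(1, int(sample_rate * frame_seconds))
    total = min(len(left), len(right))
    segments: List[Segment] = []
    for start in range(0, total, frame_size):
        end = min(start + frame_size, total)
        # Mean squared amplitude of each channel within this frame.
        left_energy = sum(x * x for x in left[start:end]) / (end - start)
        right_energy = sum(x * x for x in right[start:end]) / (end - start)
        if max(left_energy, right_energy) < energy_threshold:
            continue  # both channels are quiet: treat the frame as silence
        speaker = "speaker_1" if left_energy >= right_energy else "speaker_2"
        start_s, end_s = start / sample_rate, end / sample_rate
        if segments and segments[-1][0] == speaker and segments[-1][2] == start_s:
            # Same speaker, directly adjacent frame: extend the previous segment.
            segments[-1] = (speaker, segments[-1][1], end_s)
        else:
            segments.append((speaker, start_s, end_s))
    return segments
```

Each returned segment carries a speaker label together with a start and end time, which is the kind of information a transcript needs to show who spoke when.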
Do you offer transcription services of pre-recorded files?
Yes, our transcription services can be used for many recorded audio and video formats.
We offer both automatic and manual transcription services, as well as automatic and manual subtitling and captioning services.
Do you offer real-time transcription services?
Yes, we regularly provide real-time transcription and subtitling services for a variety of use cases. For more information, please reach out to our sales team here.
Do you offer on-premise transcription services?
Yes, we have an on-premise service, which is deployed in customized, high-volume cases. Please reach out to [email protected] to find out more.
Do you offer cloud transcription services?
Yes, our services are offered in the cloud.
Is multi-channel transcription supported?
No, our standard API does not support multi-channel transcription. However, we do have access to this technology, so please reach out to our sales team here to find the right solution for your case.
What sampling rate do I need on my audio files?
You can upload audio with any sampling rate. However, the quality of automatic transcription is highly dependent on the quality of the audio. Amberscript’s models are trained on a variety of audio files with different sampling rates, including 8 kHz and 16 kHz, in order to make the speech recognition as robust as possible.
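If you want to know the sampling rate of a file before uploading it, you can read it from the file header. The sketch below uses Python's standard `wave` module and therefore only works for uncompressed WAV files. The function name is our own for this example and is not part of the Amberscript API.

```python
import wave


def wav_sample_rate_hz(path: str) -> int:
    """Return the sampling rate, in Hz, stored in a WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()
```

For other formats such as MP3 or FLAC, you would need a different tool or library to inspect the file.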
Can you automatically detect the language of an audio file?
No, our standard API does not support language detection. However, we do have access to this technology, so please reach out to our sales team here to find the right solution for your situation.
Which audio file formats are supported?
The speech-to-text API supports the following audio file formats: MP3, MP4, WAV, M4A, M4V, MOV, WMA, AAC, OPUS, FLAC and MPG. If you require a different file format, then please contact us here.
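As a convenience, you could check a file name against this list before uploading. The following is a minimal sketch. The names `SUPPORTED_EXTENSIONS` and `is_supported_format` are hypothetical helpers for this example, not part of the API, and the check looks only at the file extension, not at the actual contents of the file.

```python
from pathlib import Path

# Formats listed in the answer above; this set is our own helper, not an API object.
SUPPORTED_EXTENSIONS = {
    "mp3", "mp4", "wav", "m4a", "m4v", "mov", "wma", "aac", "opus", "flac", "mpg",
}


def is_supported_format(filename: str) -> bool:
    """Return True if the file extension matches one of the listed formats."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS
```

The comparison is case-insensitive, so `interview.MP3` and `interview.mp3` are treated the same way.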
Which languages are supported?
Our API supports the following languages: English, German, Dutch, French, Spanish, Italian, Portuguese, Danish, Swedish, Finnish and Norwegian.
What is the price?
The price of our speech-to-text API depends on the use case. For more information, please contact our sales team here.
Where can I find the API documentation?
The API documentation can be found here: https://github.com/amberscript/api
How can ASR models be customized?
Our speech scientists can train ASR models to recognize specific vocabulary,
terms or jargon and thereby increase accuracy significantly.
To learn more about our custom solutions, please contact our sales team here.
How can I get Amberscript’s API?
You can request an API key by filling out this form.