Recognize Speech with Whisper

Converts speech in an audio file to text. Supported format: mp3. Maximum file size: 25 MB.
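
The two input limits above can be checked before the block runs. The following is a minimal sketch, not part of the block itself: the function name `find_audio_problems` and the constant `MAX_AUDIO_BYTES` are hypothetical, and it assumes that "25 MB" means 25 * 1024 * 1024 bytes, which the description does not specify.

```python
from typing import List

# Assumption: "25 MB" is read as 25 * 1024 * 1024 bytes; the description does
# not say whether decimal or binary megabytes are meant.
MAX_AUDIO_BYTES = 25 * 1024 * 1024


def find_audio_problems(file_name: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    # Only the file extension is checked here, not the actual audio encoding.
    if not file_name.lower().endswith(".mp3"):
        problems.append("unsupported format: only mp3 is accepted")
    if size_bytes > MAX_AUDIO_BYTES:
        problems.append("file is larger than 25 MB")
    return problems
```

For a file on disk, the size can be taken from `os.path.getsize(path)` and passed in together with the path.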

Audio File

[Text] The path and name of the input audio file. Supported format: mp3. Maximum file size: 25 MB.

Model

Select the model to use for speech recognition.

Audio Language

[Text] The language of the audio. Optionally, specify the language as an ISO 639-1 code; doing so improves recognition accuracy and speeds up processing.

For example:

  • "en" - English;
  • "ru" - Russian.
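
Because the language value is optional and is expected as a two-letter ISO 639-1 code, it can help to normalize it before passing it to the block. This is a rough sketch under stated assumptions: the helper name `normalize_language` is hypothetical, and it only checks the shape of the code (two Latin letters), not whether the code is actually assigned in ISO 639-1.

```python
import re
from typing import Optional


def normalize_language(code: Optional[str]) -> Optional[str]:
    """Return a lowercase two-letter code, or None when no language is given."""
    if code is None or not code.strip():
        # The language is optional, so an empty value means "not specified".
        return None
    value = code.strip().lower()
    # Shape check only: exactly two Latin letters, as in "en" or "ru".
    if not re.fullmatch(r"[a-z]{2}", value):
        raise ValueError(f"expected a two-letter ISO 639-1 code, got {code!r}")
    return value
```

Returning `None` for an empty value keeps the "not specified" case distinct from an invalid code.
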
Prompt

[Text] An optional prompt for the language model. The prompt must be in the same language as the audio.

Temperature

[Number] Sampling temperature from 0 to 1. Higher values (e.g., 0.8) make the output more random, while lower values (e.g., 0.2) make it more focused and deterministic.

If set to 0, the model uses log probability to automatically increase the temperature until certain thresholds are reached.
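
The automatic behavior at temperature 0 can be pictured as a retry loop: decode at a low temperature, check the average log probability of the result, and try again at a higher temperature if the result falls below a threshold. The sketch below is only an illustration of that idea. The function name, the `decode` callback, the temperature schedule, and the threshold of -1.0 are all assumptions, not values documented for this block.

```python
from typing import Callable, Sequence, Tuple


def transcribe_with_fallback(
    decode: Callable[[float], Tuple[str, float]],
    temperatures: Sequence[float] = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),  # assumed schedule
    logprob_threshold: float = -1.0,  # assumed threshold
) -> Tuple[str, float]:
    """Decode at increasing temperatures until the average log probability is acceptable.

    `decode` is a hypothetical callback that takes a temperature and returns
    (text, average_log_probability). Returns the text and the temperature used.
    """
    if not temperatures:
        raise ValueError("at least one temperature is required")
    text = ""
    used = temperatures[0]
    for temperature in temperatures:
        text, avg_logprob = decode(temperature)
        used = temperature
        if avg_logprob >= logprob_threshold:
            break  # confident enough, stop raising the temperature
    # If no attempt passes the threshold, the last attempt is returned.
    return text, used
```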

Timeout

[Number] The maximum time to wait for a response, in seconds.

Result

[Text] The recognized text.

Error Handling Level

Select the error handling level. Possible values:

  • "Default" - default;
  • "Ignore" - errors are ignored;
  • "Handle" - errors are handled.

If "Default" is selected, the value from the "Start" block of this diagram will be used.

Message Level

Select the level of messages that the block outputs during operation. Possible values:

  • "Default" - default;
  • "Release" - output is disabled;
  • "Debug" - essential information is output;
  • "Detailed" - detailed information is output.

If "Default" is selected, the value from the "Start" block of this diagram will be used.
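
The "Default" rule for both Error Handling Level and Message Level amounts to a simple fallback: the block's own value is used unless it is "Default", in which case the value from the "Start" block applies. A minimal sketch of that rule follows; the function name `resolve_level` is hypothetical, and it makes no assumption about what happens if the "Start" block is itself set to "Default".

```python
def resolve_level(block_value: str, start_value: str) -> str:
    """Return the effective level for a block.

    "Default" on the block means: inherit the value from the "Start" block.
    Any other value set on the block takes precedence.
    """
    if block_value == "Default":
        return start_value
    return block_value
```

For example, a block set to "Default" in a diagram whose "Start" block uses "Debug" resolves to "Debug", while a block explicitly set to "Ignore" keeps "Ignore" regardless of the "Start" block.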

Error Message

[Text] Detailed information about the error if the block fails to execute correctly.