---
license: mit
language:
- ru
pipeline_tag: automatic-speech-recognition
---

GigaAM v2 [models](https://github.com/salute-developers/GigaAM) converted to ONNX format for [onnx-asr](https://github.com/istupakov/onnx-asr).

Install onnx-asr:
```shell
pip install onnx-asr[cpu,hub]
```

Load the GigaAM v2 CTC model and recognize a WAV file:
```py
import onnx_asr

model = onnx_asr.load_model("gigaam-v2-ctc")
print(model.recognize("test.wav"))
```

Load the GigaAM v2 RNN-T model and recognize a WAV file:
```py
import onnx_asr

model = onnx_asr.load_model("gigaam-v2-rnnt")
print(model.recognize("test.wav"))
```

Code used to export the models:
```py
from pathlib import Path

import gigaam

onnx_dir = "gigaam-onnx"
model_type = "rnnt"  # or "ctc"

model = gigaam.load_model(
    model_type,
    fp16_encoder=False,  # only fp32 tensors
    use_flash=False,  # disable flash attention
)
model.to_onnx(dir_path=onnx_dir)

# Write the vocabulary as "token index" lines: the word separator "\u2581",
# the 32 Cyrillic letters "а".."я", then the blank token.
# The blank token is assumed to be "<blk>"; it appears as an empty string in this copy.
with Path(onnx_dir, "v2_vocab.txt").open("wt", encoding="utf-8") as f:
    for i, token in enumerate(["\u2581", *(chr(ord("а") + j) for j in range(32)), "<blk>"]):
        f.write(f"{token} {i}\n")
```
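The vocabulary step of the export code can be checked on its own, without installing `gigaam`. The sketch below rebuilds the same "token index" lines in memory so the layout is easy to inspect. The helper name `build_vocab_lines` and its `blank_token` parameter are hypothetical, and the default `"<blk>"` is an assumption about the blank token's spelling, not something confirmed here.

```python
def build_vocab_lines(blank_token: str = "<blk>") -> list[str]:
    """Return 'token index' lines: word separator, 32 Cyrillic letters, blank.

    `blank_token` is an assumed spelling for the blank symbol (hypothetical default).
    """
    # chr(ord("а") + k) for k in 0..31 covers U+0430..U+044F, i.e. "а".."я".
    # The letter "ё" (U+0451) lies outside this range and is not included.
    letters = [chr(ord("а") + offset) for offset in range(32)]
    tokens = ["\u2581", *letters, blank_token]
    return [f"{token} {index}" for index, token in enumerate(tokens)]


lines = build_vocab_lines()
print(len(lines))   # total number of vocabulary entries
print(lines[0])     # word separator entry
print(lines[1])     # first letter entry
print(lines[32])    # last letter entry
print(lines[-1])    # blank token entry
```

The resulting list has 34 entries: index 0 is the separator, indices 1 to 32 are the letters, and index 33 is the blank token.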