Introduction

This task aims to develop a speech recognition algorithm for the CV180X/CV181x processor that accurately transcribes the text content of a speech signal. The algorithm is intended for voice interaction scenarios such as voice command recognition and voice search, and should provide a user-friendly voice interaction experience.

  • Acceptance Criteria
  1. Algorithm performance is evaluated on the evaluation set, focusing mainly on speech recognition accuracy and computational complexity.
  2. Recognition accuracy: The algorithm must reach at least 95% speech recognition accuracy on the evaluation set.
  3. FLOPS requirement: The computational complexity of the algorithm (FLOPS) must suit the processor platform and must not exceed 35G, so that the algorithm runs efficiently on embedded devices.
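
The criteria above do not say how recognition accuracy is computed. The following is a minimal sketch, assuming word-level accuracy defined as 1 minus the word error rate over the whole evaluation set; the function names, the whitespace tokenization, and the way the FLOPS figure is obtained are illustrative assumptions, not part of the task definition.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(references, hypotheses):
    """Assumed metric: 1 - (total word errors / total reference words)."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_words = 0
    total_errors = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = ref.split()  # assumption: whitespace-separated words
        hyp_tokens = hyp.split()
        total_words += len(ref_tokens)
        total_errors += edit_distance(ref_tokens, hyp_tokens)
    if total_words == 0:
        raise ValueError("reference set contains no words")
    return 1.0 - total_errors / total_words


def meets_acceptance(accuracy, flops, min_accuracy=0.95, max_flops=35e9):
    """Check both thresholds from the acceptance criteria (95%, 35G)."""
    return accuracy >= min_accuracy and flops <= max_flops


# Hypothetical transcripts, for illustration only.
refs = ["turn on the light", "play some music"]
hyps = ["turn on the light", "play sum music"]
accuracy = word_accuracy(refs, hyps)
print(f"accuracy = {accuracy:.3f}")
```

For languages without whitespace word boundaries, a character-level variant of the same calculation may be more appropriate; the task text does not specify which applies.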

  • Evaluation Set Conditions

The evaluation set covers multiple scenarios to simulate conditions encountered in real-world speech recognition:

  1. Different speakers: Speech signals from different speakers, to simulate diverse speech sources.
  2. Different intonations: Speech signals with different intonations and emotions, to simulate varied speaking styles.
  3. Speech noise: Background and environmental noise is introduced to test the robustness of the algorithm in noisy environments.
  4. Changes in speaking speed and volume: Speech signals at different speaking speeds and volumes, to test the adaptability of the algorithm to speaking speed and volume.
  5. Sample size: The evaluation set should contain more than 500 samples to ensure adequate evaluation of algorithm performance.
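
The noise and volume conditions above can be approximated when preparing local test data. The sketch below is one possible approach, assuming audio is held as a list of float samples; it mixes noise into speech at a chosen signal-to-noise ratio and applies a gain in decibels. The function names and the example SNR and gain values are hypothetical and are not taken from the official evaluation set.

```python
import math


def rms(samples):
    """Root-mean-square level of a sample sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def apply_gain(samples, gain_db):
    """Scale the volume by gain_db decibels (negative values attenuate)."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the result has the requested SNR in dB."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    speech_rms = rms(speech)
    noise_rms = rms(noise)
    if speech_rms == 0 or noise_rms == 0:
        raise ValueError("speech and noise must be non-silent")
    # Scale the noise so that speech_rms / scaled_noise_rms = 10^(snr_db / 20).
    scale = speech_rms / (10 ** (snr_db / 20)) / noise_rms
    return [s + n * scale for s, n in zip(speech, noise)]


# Synthetic stand-ins for real recordings, for illustration only.
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
noise = [((t * 7919) % 101 - 50) / 50 for t in range(1600)]

noisy = mix_at_snr(speech, noise, snr_db=10)  # assumed test condition
quiet = apply_gain(speech, gain_db=-6)        # assumed test condition
```

Changing speaking speed without altering pitch requires time-stretching, which is beyond this sketch.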