Artificial Intelligence Technique for Speech Recognition Based on Neural Networks



Significant progress has been achieved in this area, but the main problem of modern speech recognition remains the robustness of the process. Existing methods offer various advantages, such as good handling of the temporal structure of the speech signal (tolerance to shifts in time), robustness to signal variability, robustness to noise, low resource consumption, and dictionary size. The problem is that high-quality speech recognition needs all of these benefits combined in a single recognition method.
Speech recognition has to take the following into account:

·    Speech variation due to: local distortions of the time scale, the mutual influence of neighboring sounds, intonation, and the condition of the speaker.
·    Speech signal variability due to: recording conditions (distance from the recording device, its characteristics, etc.) and the level of ambient sound (noise).

Speech Technology


Speech recognition is the process of extracting text transcriptions or some form of meaning from speech input. Speech analysis can be considered the part of voice processing that converts human speech into digital forms suitable for storage or transmission by computers.
Speech synthesis functions are essentially the reverse of speech analysis: they convert speech data from digital form into one that is similar to the original input and is suitable for playback. Speech analysis processes can also be called digital speech coding (or encoding).
The problem of temporal distortions is that speech samples of the same class can be compared only after the time scale of one of them has been transformed. In other words, the same sound can be spoken with different durations, and moreover, different parts of the sound may have different durations within the same class. This effect lets us talk about "local distortions of scale along the time axis".
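
One classical way to compare two samples despite local distortions along the time axis is dynamic time warping (DTW). The text does not name a specific alignment method, so DTW is an assumption chosen here for illustration; the function name `dtw_distance` and the toy sequences are hypothetical. A minimal sketch:

```python
def dtw_distance(a, b):
    """Return the DTW alignment cost between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Each step may stretch a, stretch b, or advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy "sounds": the second is the first with some parts locally stretched.
short = [0.0, 1.0, 2.0, 1.0, 0.0]
stretched = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 2.0, 1.0, 0.0]
other = [2.0, 2.0, 0.0, 0.0, 2.0]

same_class_cost = dtw_distance(short, stretched)
different_class_cost = dtw_distance(short, other)
```

In this toy case the stretched sample aligns to the short one at zero cost, while the unrelated sample does not, which is the behavior a comparison under local time-scale distortion should have.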

Reduce the Value of Artificial Neural Networks


A neural network speech recognition scheme implies a number of outputs equal to the number of recognition classes. Each output gives a value indicating the probability that the speech fragment belongs to the given class, or a measure of how close the fragment is to the given speech sound. For simplicity we confine ourselves to describing the recognition of a single class. Our reasoning can be transferred to the more general case without loss.
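
The scheme above, one value per recognition class, can be sketched as a single layer with one output unit per class. The single-layer architecture, the sigmoid activation, and the names `class_scores`, `weights`, and `biases` are assumptions for illustration; the text does not specify the network structure.

```python
import math


def sigmoid(x):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def class_scores(features, weights, biases):
    """Return one score in (0, 1) per recognition class.

    weights holds one row of coefficients per class (hypothetical values).
    """
    return [
        sigmoid(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(weights, biases)
    ]


# Three classes, two input features; all numbers are made up.
weights = [[1.0, -1.0], [0.5, 0.5], [-2.0, 1.0]]
biases = [0.0, -0.5, 0.2]
scores = class_scores([0.3, 0.8], weights, biases)
```

Restricting the discussion to a single class, as the text does, corresponds to looking at one element of `scores`.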
Usually the voice signal is broken into small pieces called frames (segments), and each frame is pre-processed, for example with a windowed Fourier transform. This is done to reduce the dimensionality of the feature space and to increase the separability of the classes.
As a result, each frame is characterized by a set of coefficients called the acoustic feature vector.
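
The framing and pre-processing step can be sketched as follows. The frame length, the hop size, the Hamming window, and the use of DFT magnitudes as coefficients are assumptions for illustration, as are the function names; the text only says that a windowed Fourier transform is one possible pre-processing.

```python
import cmath
import math


def split_into_frames(signal, frame_len, hop):
    """Cut the signal into overlapping frames of frame_len samples."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def hamming(n, size):
    """Hamming window coefficient for sample n of a frame of given size."""
    return 0.54 - 0.46 * math.cos(2 * math.pi * n / (size - 1))


def frame_features(frame):
    """Return DFT magnitudes (bins 0..size/2) of the windowed frame."""
    size = len(frame)
    windowed = [x * hamming(n, size) for n, x in enumerate(frame)]
    feats = []
    for k in range(size // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / size)
                  for n, x in enumerate(windowed))
        feats.append(abs(acc))
    return feats


# Toy signal: a pure tone with 4 cycles per 32-sample frame.
signal = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
frames = split_into_frames(signal, frame_len=32, hop=16)
vectors = [frame_features(f) for f in frames]
```

Each element of `vectors` plays the role of the acoustic feature vector of one frame. The naive DFT is used only to keep the sketch dependency-free; a real system would use a fast Fourier transform.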

Methods


To apply artificial neural networks to phoneme recognition tasks, we need to choose a target function and analyze its properties.
One of the problems with speech recognition is that a single speech sound cannot be isolated on its own. The form of a sound depends strongly on its sound environment: the signal that comes after it and the sound that comes before it. It is known that the waveform passes smoothly from one sound to another.

Properties of the target function P(t, Ω), where t is time and Ω is the sound being recognized:

0 ≤ P(t, Ω) < ε if the signal at time t is not the sound Ω, or is not speech at all, where ε is a threshold close to zero;
P(t, Ω) = 1 if the signal is the sound Ω;
P(t, Ω) ∈ (0, 1) in the zone of phonemic transition: P increases smoothly in the transition into the sound and gradually decreases in the transition after it.
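
One function with these properties can be sketched as follows. The raised-cosine shape of the transitions, the fixed transition width, and the names `target_p`, `start`, `end`, and `width` are assumptions for illustration; the text only requires that P rise smoothly before the sound and fall gradually after it.

```python
import math


def target_p(t, start, end, width):
    """Target value P(t, Ω) for a sound Ω that occupies [start, end].

    width is the (hypothetical) duration of each transition zone.
    """
    if start <= t <= end:
        return 1.0  # the signal is the sound
    if start - width < t < start:
        u = (t - (start - width)) / width  # 0 -> 1 while approaching the sound
        return 0.5 * (1.0 - math.cos(math.pi * u))
    if end < t < end + width:
        u = (end + width - t) / width  # 1 -> 0 while leaving the sound
        return 0.5 * (1.0 - math.cos(math.pi * u))
    return 0.0  # not the sound, or not speech at all


# A sound lasting from 1.0 s to 2.0 s with 0.5 s transitions on each side.
values = [target_p(t / 10, 1.0, 2.0, 0.5) for t in range(0, 31)]
```

Values of this kind, sampled at the frame times, could serve as the desired output of the network for the class Ω during training.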

 Conclusion

We use a model of speech recognition based on artificial neural networks. An approach to training the neural network with a genetic algorithm is being developed. This approach will be implemented in a system for recognizing numbers. The next step is the implementation of a system for recognizing voice commands. It is also planned to develop a system for automatic recognition of speech keywords associated with the processing of telephone calls or with area security.
