Multimedia Search

Multimedia search is probably one of the most effective applications of speech recognition. Automatic recognition of recorded speech and indexing of the resulting text give you access to all the information without having to listen to the recordings. This is a huge advantage, especially now that the number of recordings and the volume of data to be searched have grown so rapidly. With such an archive you do not have to listen to every recording or restrict your search to manually assigned keywords. On the contrary, you can find very marginal information that would be untraceable with a keyword search.

Linguistic part

This technology combines a speech recognizer with full-text search, supplemented by tools for morphological search, a thesaurus or even a translator. It can also be combined with voice search; the resulting system not only searches speech but is also controlled by speech.

Software solution

As in any efficient search engine, all recordings to be searched must be indexed first. They can then be searched using the newly created indexes.

The key component of this technology is the speech recognizer, which converts spoken words into text (when the recognition results are ambiguous, the text may be ambiguous as well). Before indexing, all texts must be pre-processed, either by lemmatisation (i.e. determining the lemma of a given word) for morphological search or by translation into the index language. The data are then stored in the index and queried by a search engine. Queries are pre-processed as well, e.g. through lemmatisation or expansion of forms (so that the different inflected forms of a word are matched as a single item), expansion of synonyms, or translation into the index language.
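The pre-processing and querying steps described above can be illustrated with a minimal sketch. It assumes the recognizer has already produced a transcript for each recording. The lemma table, the thesaurus and the function names (lemmatise, build_index, search) are hypothetical stand-ins for a real morphological analyser and search engine, and translation into the index language is left out.

```python
from collections import defaultdict

# Hypothetical toy lemma table; a real system would use a morphological analyser.
LEMMAS = {
    "invoices": "invoice",
    "invoiced": "invoice",
    "paid": "pay",
    "pays": "pay",
    "payments": "payment",
}

# Hypothetical thesaurus used for query-side synonym expansion.
SYNONYMS = {
    "invoice": {"bill"},
}


def lemmatise(word):
    """Return the lemma of a word, falling back to the lower-cased word itself."""
    word = word.lower().strip(".,;:!?")
    return LEMMAS.get(word, word)


def build_index(transcripts):
    """Map each lemma to the set of (recording id, word position) pairs where it occurs."""
    index = defaultdict(set)
    for recording_id, text in transcripts.items():
        for position, word in enumerate(text.split()):
            index[lemmatise(word)].add((recording_id, position))
    return index


def search(index, query):
    """Return ids of recordings containing every query term (or one of its synonyms)."""
    result = None
    for word in query.split():
        lemma = lemmatise(word)
        candidates = {lemma} | SYNONYMS.get(lemma, set())
        hits = {rec for term in candidates for rec, _ in index.get(term, ())}
        result = hits if result is None else result & hits
    return sorted(result or [])


# Transcripts as they might come out of the recognizer (invented sample data).
transcripts = {
    "call-001": "The customer paid both invoices yesterday.",
    "call-002": "She pays the bill every month.",
    "call-003": "No payments were received.",
}
index = build_index(transcripts)
print(search(index, "invoice pay"))
```

Because documents and queries pass through the same lemmatisation, the query "invoice pay" matches a recording that only contains "paid" and "invoices", and the synonym expansion also finds the recording that mentions a "bill".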

The recognizer is demanding in terms of computing resources, and its computing power must be sized so that newly added recordings can be converted into text quickly enough. The time needed to process a recording is a fraction of its playback time. With multiple cores, several recordings can be processed at the same time. If necessary, recognizers can run on dedicated servers with sufficient capacity. The other indexing steps are much faster. If search is slowed down by insufficient disk read speed, use SSDs or disk mirroring.
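The sizing reasoning above can be turned into rough arithmetic. The sketch below assumes that each core processes one recording at a time, that throughput scales linearly with the number of cores, and that processing time is a fixed fraction of playback time (called real_time_factor here). The function name, the parameters and the 80 % utilisation headroom are illustrative assumptions, not figures from any particular recognizer.

```python
import math


def cores_needed(audio_hours_per_day, real_time_factor, utilisation=0.8):
    """Estimate how many cores keep up with the daily volume of new recordings.

    real_time_factor is processing time divided by playback time (assumed < 1).
    utilisation is the assumed share of each day a core spends recognising.
    """
    busy_core_hours = audio_hours_per_day * real_time_factor
    available_hours_per_core = 24 * utilisation
    return max(1, math.ceil(busy_core_hours / available_hours_per_core))


# Example with invented numbers: 500 hours of new audio per day,
# each hour taking a quarter of an hour to recognise.
print(cores_needed(500, 0.25))
```

With these invented numbers, 500 hours of audio require 125 core-hours of recognition per day, so the estimate comes out at 7 cores. The real-time factor should be measured for the actual recognizer and hardware before it is used for planning.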