InterPhonic is an industry-leading text-to-speech (TTS) software product. It automatically converts any text into continuous, natural-sounding speech in real time. TTS technology is an efficient, fast and convenient way to deliver voice information services to anyone, anywhere, at any time, and it meets the information age's demands for large data volumes, dynamic updates and customized queries.
This product supports more than 30 languages, including Mandarin, Cantonese, English, French, German, Portuguese, Italian, Dutch, Spanish, Swedish, Norwegian, Danish, Russian, Greek, Korean and Japanese. It also offers industry-innovative mixed reading of Mandarin and English and of Cantonese and English, to meet the needs of different voice application services. The multi-engine management interface provides a consistent access interface across the engines for different languages. The application layer can select the required TTS language flexibly and transparently, and can switch languages in real time.
This product may set a new trend in customized voice services by offering a rich and varied selection of voices: a firm, deep male voice; mature, brisk or gentle female voices; a standard English female voice; and even child, boy, girl and elderly voices. All voice libraries inherit the sound quality of iFLYTEK's TTS technology. Users can select the voice style best suited to the requirements of their application scenario, and can also switch voices dynamically in real time.
The new-generation InterPhonic has a more intelligent natural language understanding capability: it automatically analyzes the text and finds the right rhythm. It can reproduce typical speaking manners, such as questions and exclamations, which makes the synthesized speech more expressive. This ability to convey speaking tone is another key innovation of iFLYTEK's voice technology, and it helps automatic voice services improve the customer experience through a more customized voice user interface.
The InterPhonic TTS product uses highly accurate, intelligent text analysis and processing technologies, which keep the synthesized speech accurate, fluent and natural. In addition, iFLYTEK's long-standing research in linguistics, its close cooperation with authoritative research organizations, the large professional knowledge base it has accumulated through wide deployment, and its continuous training and optimization all help the automatic processing achieve high accuracy, even on difficult language analysis problems such as polyphones, special characters, prosodic phrases and unregistered words (such as place names or personal names).
To meet the common demands of mainstream application environments, iFLYTEK collects a wide range of real-world language material and then analyzes and optimizes it in detail. As a result, InterPhonic performs better in common tasks such as reading numbers, values, phrases and short sentences, with a clearer voice and the right rhythm.
With its efficient TCP/IP-based network TTS service and centralized resource management mechanism, this product uses a three-part structure of client, resource manager and server, which allows flexible deployment schemes. Its high availability has been proven in large-scale, critical services across a range of major industries, supporting uninterrupted 24x7 automatic voice service.
To suit different development tools and integration requirements, the InterPhonic SDK provides several types of development interface: a standard development interface (DLL), a simple development interface, a COM component and a SAPI development interface, so that developers can choose the one they need. It also provides extensive development modules and documentation, which speed up the development of speech applications.
This product provides comprehensive functions and tools for setting and adjusting parameters, so that users can control and manage the TTS effect flexibly and efficiently. These include tools for the unified configuration and management of global parameters (such as volume, speaking rate and pitch), user dictionaries, user rules and customized resource packages; settings for the pronunciation modes of numerals, punctuation marks and English; functions for adding Chinese or English words and specifying the pinyin or phonetic symbols of each word; and a unified, easy-to-use graphical user interface for operations and settings. Parameters can be set and adjusted dynamically through the API, and synthesis can be marked up, described and controlled through the Chinese Speech Synthesis Markup Language (CSSML).
The enhancement tool set includes efficient, easy-to-use components such as an offline voice application tool, a visual CSSML editing tool and a DOC/XLS text format conversion tool.
This product can parse e-mail in plain text, MIME or HTML format; synthesize the subject, sender, recipient, body and text attachments; and automatically determine the reading style according to the context.
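The kind of preprocessing described above can be illustrated with Python's standard `email` package. This is only a minimal sketch of extracting the fields that would be read aloud; the function name `mail_to_reading_text` and the sample message are hypothetical, and nothing here reflects InterPhonic's actual implementation or API.

```python
from email import message_from_string
from email.policy import default

# Hypothetical sample message used only for illustration.
RAW_MAIL = """\
From: Alice <alice@example.com>
To: Bob <bob@example.com>
Subject: Meeting moved
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The meeting is now at 3 pm.
"""


def mail_to_reading_text(raw: str) -> str:
    """Collect subject, sender, recipient and body into one text to synthesize."""
    msg = message_from_string(raw, policy=default)
    # Prefer the plain-text body; fall back to HTML if that is all there is.
    body_part = msg.get_body(preferencelist=("plain", "html"))
    body = body_part.get_content().strip() if body_part is not None else ""
    return "\n".join([
        f"Subject: {msg['Subject']}",
        f"From: {msg['From']}",
        f"To: {msg['To']}",
        body,
    ])


if __name__ == "__main__":
    print(mail_to_reading_text(RAW_MAIL))
```

A real pipeline would additionally strip HTML tags from HTML bodies and decide how to read addresses and dates; those steps are omitted here.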
This product provides a URI synthesis function: it automatically retrieves the text at a network URI specified by the user, which makes information resources on the Web easier to use.
This product fully supports the GB2312, GBK, BIG5, GB18030, UTF-8 and Unicode character encodings, automatically identifies Unicode text, and can output voice data directly in linear WAV, A-law/μ-law WAV, VOX and other formats at multiple sampling rates (such as 6 kHz, 8 kHz, 11 kHz and 16 kHz).
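To make the encoding handling concrete, the following is a minimal sketch of how input bytes in these encodings might be normalized to text before synthesis. The function `decode_text` and its fallback strategy are assumptions for illustration, not the product's detection logic. It relies on two facts: a byte order mark identifies UTF-8 or UTF-16 text, and GB18030 is a superset of GBK and GB2312, so it serves as a reasonable default for legacy Chinese text. BIG5 cannot be distinguished this way and has to be requested explicitly; UTF-32 is not handled.

```python
import codecs

# Byte order marks that identify Unicode input, with the codec that
# consumes the mark while decoding.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_text(data: bytes, fallback: str = "gb18030") -> str:
    """Decode input bytes to str (hypothetical helper, not a product API)."""
    for bom, codec in BOMS:
        if data.startswith(bom):
            return data.decode(codec)
    try:
        # UTF-8 is strict enough that legacy Chinese text rarely passes.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # GB18030 also decodes GBK and GB2312; pass "big5" for BIG5 input.
        return data.decode(fallback)


if __name__ == "__main__":
    print(decode_text("你好".encode("gbk")))
    print(decode_text("你好".encode("big5"), fallback="big5"))
```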
The server supports mainstream operating systems (OSs) such as Windows, Unix and Linux. The client supports OSs such as Microsoft Windows, Sun Solaris, Red Hat Linux, SUSE Linux, HP Tru64 UNIX, IBM AIX and VxWorks.
iFLYTEK has successful integration cases with all well-known platform and equipment manufacturers in the industry. Through close cooperation with platform and equipment providers, system integrators and software developers, iFLYTEK can provide users with professional services covering the entire speech application.
For mainstream application environments, it provides highly efficient solutions for optimizing the synthesis effect, represented by customized resource packages (resource sets loaded on the combined engine that effectively enhance speech quality in an established application field), CSSML and virtual undefined-length tools. These significantly improve the effect in actual applications. iFLYTEK's professional service system provides an efficient customization and optimization scheme, which can enhance the customer experience and help customers achieve continued success in speech self-service.
The Chinese Speech Synthesis Markup Language (CSSML) is a specification for describing Chinese speech data, proposed and formulated under the leadership of iFLYTEK. The standard has received close attention and support from the National 863 Expert Group, the National IT Standardization Technical Committee (CITS) and the State Bureau of Technical Supervision. It formally passed the inspection of the National Standardization Organization in 2005 and became an important part of the Chinese speech synthesis technology standards and specifications. Designed and extended specifically for Chinese speech applications, CSSML can mark up and control multiple speech characteristics and is compatible with SSML.
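Because CSSML is described as compatible with SSML, standard W3C SSML gives a rough idea of what such markup looks like. The sketch below builds a small SSML document with Python's standard `xml.etree.ElementTree`, using only the standard `speak`, `prosody` and `break` elements. CSSML-specific tags are not shown because they are not described here, and the helper `build_ssml`, its parameters and the `zh-CN` language tag are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_ssml(first: str, second: str, pause_ms: int = 500,
               rate: str = "slow") -> str:
    """Return SSML that reads `first` at `rate`, pauses, then reads `second`."""
    # Serialize SSML elements in the default namespace (no prefix).
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(f"{{{SSML_NS}}}speak",
                       {"version": "1.0", XML_LANG: "zh-CN"})
    prosody = ET.SubElement(speak, f"{{{SSML_NS}}}prosody", {"rate": rate})
    prosody.text = first
    pause = ET.SubElement(speak, f"{{{SSML_NS}}}break",
                          {"time": f"{pause_ms}ms"})
    # Text following the pause belongs to the tail of the <break/> element.
    pause.tail = second
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("欢迎致电。", "请稍候。"))
```

Building the markup with an XML library rather than string concatenation keeps the document well-formed and escapes special characters in the text automatically.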
InterPhonic provides an innovative, unified pre-recording management function. It treats pre-recorded prompts as a resource of the TTS system and, through intelligent matching of prompt recordings against synthesis templates, makes it easier to combine pre-recorded and synthesized speech and smooths the joins between them. It also avoids the switching and transition problems caused by frequently alternating between prompt recordings and TTS, simplifies the application flow, and further improves service quality.
InterPhonic provides an industry-leading background music function. Using the simple tools provided by the system, users can add background music quickly, adjust the relative volume of the background music and the synthesized speech, and preview the result directly, which makes the speech service friendlier and more natural.
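The volume balance between music and speech can be pictured as a simple sample-wise mix. The following is a minimal sketch under stated assumptions: both signals are 16-bit PCM samples at the same sampling rate, the music loops to cover the speech, and the function `mix_with_background` and its `music_gain` parameter are hypothetical names, not part of the product's tools.

```python
from itertools import cycle, islice

INT16_MIN, INT16_MAX = -32768, 32767


def mix_with_background(speech, music, music_gain=0.25):
    """Mix looped background music under speech and clip to the 16-bit range."""
    if not music:
        return list(speech)
    # Repeat the music so that it covers the full length of the speech.
    looped = islice(cycle(music), len(speech))
    mixed = []
    for s, m in zip(speech, looped):
        value = int(round(s + music_gain * m))
        # Clip instead of wrapping around to avoid loud artifacts.
        mixed.append(max(INT16_MIN, min(INT16_MAX, value)))
    return mixed


if __name__ == "__main__":
    print(mix_with_background([1000, -1000, 32767], [400, 800], music_gain=0.5))
```

Lowering `music_gain` keeps the speech intelligible; a production mixer would typically also fade the music in and out rather than start it abruptly.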
High-quality output, flexible deployment and proven stability allow you to replace traditional manual service with self-service voice. A higher degree of automation enables you to provide better services at lower cost.
It can help change the way speech information is produced. In a large-scale speech application system, a real-time TTS service with multiple concurrent channels significantly improves the timeliness of information updates. The content and range of information that can be delivered by speech are also greatly expanded.
The standard client/server architecture and sound system design take full account of the requirements of large-scale speech applications and scale well. To expand capacity, you only need to add new TTS service nodes, without changing the original system.
iFLYTEK's solid overall strength, efficient R&D and technical support teams, acknowledged leading position in the industry and wide recognition give you strong assurance of ongoing support.