Issue #2 new

Slow

mikelee
created an issue

There is a delay on

[fliteEngine speakText]

Is it possible to make it instantaneous?

Comments (4)

  1. Sam Foster repo owner

    It depends on your hardware and which voice you're using. On 1st-generation devices (and possibly 2nd), the only voice that is instantaneous is cmu_us_kal.

  2. mikelee reporter

    Looking at the code, it saves the sound to disk as temp.wav as a workaround to get the sound to play; that's probably why there's a delay. If we can get the sound to play from memory, the delay should go away. I'm using an iPod touch 4, so the performance of the device should be reasonable.

  3. Anonymous

    I used a trick to solve it. Run flite once in applicationDidFinishLaunching with an empty voice; when you call it again anywhere in your app, it will not take time to create the file, etc. Also, wrap the call in your own custom method and invoke that method with performSelectorInBackground; it saves a lot of time and memory as well.

    Happy Coding!

  4. Peter Strömberg

    I have an iPhone 1 for testing. Indeed, the cmu_us_kal voice is pretty much instantaneous. With other voices, it takes time to create the wave. Not much time is spent writing it to a file and playing it from there.

    I also saw the cry for help with getting it to play from memory. It's not as simple as copying the cst_wave struct, since the struct is only a few bytes and one of its members points to the samples. Looking at the code that writes the wave to a file, one can see that it is not entirely trivial. I think much of the same work needs to be done to "write" it to a byte array that AVAudioPlayer accepts. But, as I noted above, I'm not sure it's worth the effort (from a speed point of view at least).

    If you keep the file-based approach you can provide an interface for preparing the tts and playing it later. Something like so:

    // Path of the temporary WAV file in the app's Documents directory.
    -(NSString*)tempPath {
        NSArray *filePaths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
        NSString *recordingDirectory = [filePaths objectAtIndex:0];
        // Pick a file name
        return [recordingDirectory stringByAppendingPathComponent:@"temp.wav"];
    }

    // Synthesize the text and save it to the temp file without playing it.
    -(void)prepareToSpeakText:(NSString *)text
    {
        // Copy the text character by character. %c treats each unichar as an
        // 8-bit character, so text outside that range may not survive intact.
        NSMutableString *cleanString = [NSMutableString stringWithString:@""];
        for (NSUInteger x = 0; x < [text length]; x++)
        {
            unichar ch = [text characterAtIndex:x];
            [cleanString appendFormat:@"%c", ch];
        }
        cst_wave* _sound = flite_text_to_wave([cleanString UTF8String], voice);

        NSString *tempFilePath = [self tempPath];
        cst_wave_save_riff(_sound, (char*)[tempFilePath UTF8String]);
        delete_wave(_sound);
    }

    // Play the file written by prepareToSpeakText:.
    -(void)playPreparedSpeech {
        NSString *tempFilePath = [self tempPath];
        NSError *err = nil;
        [audioPlayer stop];
        audioPlayer = [[AVAudioPlayer alloc] initWithContentsOfURL:[NSURL fileURLWithPath:tempFilePath] error:&err];
        [audioPlayer setDelegate:self];
        //[audioPlayer prepareToPlay];
        [audioPlayer play];
        // Remove file
        [[NSFileManager defaultManager] removeItemAtPath:tempFilePath error:nil];
    }

    // Convenience wrapper: prepare and play in one call, as before.
    -(void)speakText:(NSString *)text {
        [self prepareToSpeakText:text];
        [self playPreparedSpeech];
    }
    

    The preview of that code snippet looks horrible. Trying to post it anyway. =)
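
For anyone who wants to try the play-from-memory idea raised in comments 2 and 4, here is a rough sketch in plain C of what "writing" the wave to a byte array might look like. It is not flite's code: `wave_t` is a hypothetical stand-in for the struct described above (a few fields plus a pointer to the samples), `wave_to_riff_bytes` is a made-up name, and it assumes 16-bit PCM with `num_samples` counted per channel. It builds the standard 44-byte RIFF/WAVE header and appends the samples in little-endian order.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the wave struct discussed in the thread:
   small fixed fields plus a pointer to the 16-bit PCM samples. */
typedef struct {
    int sample_rate;
    int num_samples;    /* assumed: samples per channel */
    int num_channels;
    short *samples;
} wave_t;

/* Store integers byte by byte so the result is little-endian on any host. */
static void put_u16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

static void put_u32(uint8_t *p, uint32_t v)
{
    put_u16(p, (uint16_t)(v & 0xffff));
    put_u16(p + 2, (uint16_t)(v >> 16));
}

/* Returns a malloc'd buffer holding a complete WAV file image, or NULL on
   allocation failure. The caller frees it. *out_len receives its size. */
uint8_t *wave_to_riff_bytes(const wave_t *w, size_t *out_len)
{
    uint32_t count = (uint32_t)w->num_samples * (uint32_t)w->num_channels;
    uint32_t data_bytes = count * 2u;            /* 2 bytes per sample */
    size_t total = 44u + (size_t)data_bytes;     /* 44-byte header */
    uint8_t *buf = malloc(total);
    if (buf == NULL)
        return NULL;

    memcpy(buf, "RIFF", 4);
    put_u32(buf + 4, 36u + data_bytes);          /* file size minus 8 */
    memcpy(buf + 8, "WAVE", 4);
    memcpy(buf + 12, "fmt ", 4);
    put_u32(buf + 16, 16u);                      /* fmt chunk size */
    put_u16(buf + 20, 1u);                       /* format tag: PCM */
    put_u16(buf + 22, (uint16_t)w->num_channels);
    put_u32(buf + 24, (uint32_t)w->sample_rate);
    put_u32(buf + 28, (uint32_t)w->sample_rate * (uint32_t)w->num_channels * 2u); /* byte rate */
    put_u16(buf + 32, (uint16_t)(w->num_channels * 2)); /* block align */
    put_u16(buf + 34, 16u);                      /* bits per sample */
    memcpy(buf + 36, "data", 4);
    put_u32(buf + 40, data_bytes);

    for (uint32_t i = 0; i < count; i++)
        put_u16(buf + 44 + 2u * i, (uint16_t)w->samples[i]);

    *out_len = total;
    return buf;
}
```

On the Objective-C side, the buffer could presumably be wrapped in an NSData and passed to AVAudioPlayer's initWithData:error: instead of going through temp.wav. As noted in comment 4, synthesis time dominates for the slower voices, so this may not buy much speed.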
