Synthesising Speech in Python

There’s a Scottish company called CereProc who do some of the best speech synthesis in the world. They excel in regional accents, especially difficult Scottish ones! I’ve been using their CereVoice Cloud SDK in some recent projects (like Speek). In this post I’m going to share a wee Python script and an Android class for using their cloud API to generate synthesised speech. To use these, you’ll need to create a (free) account over on CereProc’s developer site and then add your auth credentials to the code.

Downloading Speech in Python

Call the download() function with the message you wish to synthesise, optionally specifying which voice and file format to use and what to name the output file.
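To give a feel for the shape such a download() function might take, here's a minimal sketch. The endpoint URL, request field names and response handling below are placeholders rather than CereProc's actual API, so check their developer site for the real interface.

import requests

# Placeholder endpoint and credentials; the real CereVoice Cloud API is
# documented on CereProc's developer site
API_URL = "https://example.cereproc.com/synthesise"
ACCOUNT_ID = "your-account-id"
PASSWORD = "your-password"

def download(message, voice="Heather", audio_format="mp3", filename="speech"):
  payload = {
    "accountID": ACCOUNT_ID,
    "password": PASSWORD,
    "voice": voice,
    "audioFormat": audio_format,
    "text": message,
  }
  response = requests.post(API_URL, data=payload)
  response.raise_for_status()

  path = "%s.%s" % (filename, audio_format)
  with open(path, "wb") as f:
    # Assumes the service returns the raw audio bytes directly
    f.write(response.content)
  return path

For example, download("Hello world!", voice="Stuart") would save the spoken message as speech.mp3.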

Downloading and Playing Speech in Android

Create a CereCloudPlayer object and use its play() method to request, download, and play the message you wish to synthesise.

PyQT: QPixmap and threads

I’ve been working with PyQT lately and got stuck on a seemingly simple problem: updating the UI from another thread. Having never used PyQT before, I couldn’t see an obvious solution, and the Stack Overflow results I found gave incomplete code samples. I’m hoping this post gives pointers to anyone searching for the same things I did.

This particular example is very contrived, but it’s the only solution I could find for updating an image with QPixmap objects in a multithreaded interface, overcoming the “QPixmap: It is not safe to use pixmaps outside the GUI thread” error message. I think part of my problem was that I wasn’t using QThreads in my threaded code, and I wasn’t willing to refactor a large codebase just to improve PyQT integration.

First, another thread calls someFunctionCalledFromAnotherThread, which uses PyQT’s signal mechanism to pass events across threads. This function creates a LoadImageThread with the filename and desired size as arguments, connects the thread’s showImage signal to the showImage function, then starts the thread.

def someFunctionCalledFromAnotherThread(self):
  thread = LoadImageThread(file="test.png", w=512, h=512)
  # Old-style PyQT signal connection; the signal crosses the thread boundary
  self.connect(thread, QtCore.SIGNAL("showImage(QString, int, int)"), self.showImage)
  thread.start()

def showImage(self, filename, w, h):
  # Runs on the GUI thread, so it's safe to touch QPixmap here
  pixmap = QtGui.QPixmap(filename).scaled(w, h)
  self.image.setPixmap(pixmap)
  self.image.repaint()

LoadImageThread then does nothing other than emit a response to the showImage signal we connected above, passing the thread arguments back. This means showImage will be executed on the GUI thread, avoiding those nasty QPixmap errors. Note the __del__ function below; that prevents the thread from being garbage collected while running.

class LoadImageThread(QtCore.QThread):
  def __init__(self, file, w, h):
    QtCore.QThread.__init__(self)
    self.file = file
    self.w = w
    self.h = h

  def __del__(self):
    # Block until the thread finishes so it isn't destroyed
    # while still running
    self.wait()

  def run(self):
    # Do no real work here; just emit the arguments back so
    # showImage executes on the GUI thread
    self.emit(QtCore.SIGNAL('showImage(QString, int, int)'), self.file, self.w, self.h)
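The snippets above assume a surrounding widget class. Purely as a sketch, here's one way they might be wired into a minimal runnable PyQt4 app; note that, unlike the version above, this one keeps a reference to the thread on the widget rather than relying on __del__ to keep it alive.

import sys
from PyQt4 import QtCore, QtGui

class LoadImageThread(QtCore.QThread):
  # Same idea as above: do no work, just bounce the arguments
  # back to the GUI thread via a signal
  def __init__(self, file, w, h):
    QtCore.QThread.__init__(self)
    self.file = file
    self.w = w
    self.h = h

  def run(self):
    self.emit(QtCore.SIGNAL('showImage(QString, int, int)'), self.file, self.w, self.h)

class Window(QtGui.QWidget):
  def __init__(self):
    QtGui.QWidget.__init__(self)
    self.image = QtGui.QLabel(self)
    self.resize(512, 512)
    # Holding a reference on self stops the thread being
    # garbage collected before it finishes
    self.thread = LoadImageThread(file="test.png", w=512, h=512)
    self.connect(self.thread, QtCore.SIGNAL("showImage(QString, int, int)"), self.showImage)
    self.thread.start()

  def showImage(self, filename, w, h):
    pixmap = QtGui.QPixmap(filename).scaled(w, h)
    self.image.setPixmap(pixmap)

if __name__ == '__main__':
  app = QtGui.QApplication(sys.argv)
  win = Window()
  win.show()
  sys.exit(app.exec_())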

There we have it – a stupid and contrived solution to a stupid problem.

OpenKinect Python and OpenCV

I’ve spent the past day or so messing around with Kinect and OSX, trying to find a nice combination of libraries and drivers which works well – a more difficult task than you’d imagine! Along the way I’ve found that a lot of these libraries have poor or no documentation.

Here I’m sharing a little example of how I got OpenKinect and OpenCV working together in Python. The Python wrapper for OpenKinect gives depth data as a numpy array, which is conveniently the datatype used by the cv2 module.

import freenect
import cv2
import numpy as np

def getDepthMap():
    """Grab a depth map from the Kinect sensor and create an image from it."""
    depth, timestamp = freenect.sync_get_depth()

    # Clip to the sensor's 10-bit range, then shift down to 8 bits
    # so OpenCV can render the array as a grayscale image
    np.clip(depth, 0, 2**10 - 1, depth)
    depth >>= 2
    depth = depth.astype(np.uint8)

    return depth

while True:
    depth = getDepthMap()

    blur = cv2.GaussianBlur(depth, (5, 5), 0)

    cv2.imshow('image', blur)
    cv2.waitKey(10)

Here the getDepthMap function takes the depth map from the Kinect sensor, clips the array so that the maximum depth is 1023 (effectively removing distant objects and noise) and turns it into an 8-bit array (which OpenCV can render as grayscale). The array returned from getDepthMap can be used like a grayscale OpenCV image; to demonstrate, I apply a Gaussian blur. Finally, imshow renders the image in a window and waitKey is there to make sure image updates actually show.
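Because the result really is just an ordinary single-channel image, any other cv2 routine can be dropped into the loop above. As a further (illustrative) example, Canny edge detection works just as well on the blurred depth frame:

# Edge detection on the blurred depth image; the thresholds
# here are arbitrary starting values
edges = cv2.Canny(blur, 50, 150)
cv2.imshow('edges', edges)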

This is by no means a comprehensive guide to using freenect and OpenCV together but hopefully it’s useful to someone as a starting point!