Android 6.0 Multipart HTTP POST

This how-to shows you how to use a multipart HTTP POST request to upload a file and metadata to a web server. Android 6.0 removed the legacy Apache HTTP client, so a lot of the examples you’ll find online are outdated (or require adding the legacy library back). This solution uses the excellent OkHttp library from Square: rather than re-adding a legacy library for the old approach, add a modern one that also saves you a lot of work!

Step 1: Add OkHttp to your gradle build script

In Android Studio, open the build.gradle script for your main project module and add OkHttp to your dependencies:

dependencies {
    compile 'com.squareup.okhttp3:okhttp:3.5.0'
}

Step 2: Create and execute an HTTP request

This example shows how to upload the contents of a File object to a server, with a username and date string as metadata.

String UPLOAD_URL = "http://yoururl.com/example.php";

// Example data
String username = "test_user_123";
String datetime = "2016-12-09 10:00:00";
File image = getImage();

// Create an HTTP client to execute the request
OkHttpClient client = new OkHttpClient();

// Create a multipart request body. Add metadata and files as 'data parts'.
RequestBody requestBody = new MultipartBody.Builder()
        .setType(MultipartBody.FORM)
        .addFormDataPart("username", username)
        .addFormDataPart("datetime", datetime)
        .addFormDataPart("image", image.getName(),
                RequestBody.create(MediaType.parse("image/jpeg"), image))
        .build();

// Create a POST request to send the data to UPLOAD_URL
Request request = new Request.Builder()
        .url(UPLOAD_URL)
        .post(requestBody)
        .build();

// Execute the request and get the response from the server.
// Note: run this on a background thread; Android throws a
// NetworkOnMainThreadException for synchronous network calls on the UI thread.
Response response = null;

try {
    response = client.newCall(request).execute();
} catch (IOException e) {
    e.printStackTrace();
}

// Check the response to see if the upload succeeded
if (response == null || !response.isSuccessful()) {
    Log.w("Example", "Unable to upload to server.");
} else {
    Log.v("Example", "Upload was successful.");
}

Summary

OkHttp is awesome because it removes a lot of the heavy lifting necessary to work with HTTP requests in Android. Construct your request content using Java objects and it’ll do the rest for you. If you’re looking for a replacement for the Apache HTTP client removed in Android 6.0, I strongly recommend this one.

PyQT: QPixmap and threads

I’ve been working with PyQT lately and got stuck on a seemingly simple problem: updating the UI from another thread. Having never used PyQT before, it wasn’t obvious what the solution was, and the Stack Overflow results I found gave incomplete code samples. I’m hoping this post helps give pointers to anyone searching for the same things I did.

This particular example is very contrived but it’s the only solution I could find for updating an image with QPixmap objects in a multithreaded interface, overcoming the “QPixmap: It is not safe to use pixmaps outside the GUI thread” error message. I think part of my problem was that I wasn’t using QThreads in my threaded code and I wasn’t willing to refactor a large codebase just to improve PyQT integration.

First another thread calls someFunctionCalledFromAnotherThread, which uses PyQT’s signal mechanism to pass events across threads. This function creates a LoadImageThread with the filename and desired size as arguments, connects it to a signal to call the showImage function, then starts the thread.

def someFunctionCalledFromAnotherThread(self):
  thread = LoadImageThread(file="test.png", w=512, h=512)
  self.connect(thread, QtCore.SIGNAL("showImage(QString, int, int)"), self.showImage)
  thread.start()

def showImage(self, filename, w, h):
  pixmap = QtGui.QPixmap(filename).scaled(w, h)
  self.image.setPixmap(pixmap)
  self.image.repaint()

LoadImageThread then does nothing other than emit a response to the showImage signal we connected above, passing the thread arguments back. This means showImage will be executed on the GUI thread, avoiding those nasty QPixmap errors. Note the __del__ method below: it waits for the thread to finish before the object is destroyed, so the thread isn’t garbage collected while it’s still running.

class LoadImageThread(QtCore.QThread):
  def __init__(self, file, w, h):
    QtCore.QThread.__init__(self)
    self.file = file
    self.w = w
    self.h = h

  def __del__(self):
    self.wait()

  def run(self):
    self.emit(QtCore.SIGNAL('showImage(QString, int, int)'), self.file, self.w, self.h)

There we have it – a stupid and contrived solution to a stupid problem.

PyOpenNI and OpenCV

In my last post I gave an example of how to use OpenKinect to get a depth stream which can be manipulated with OpenCV in Python. This example uses PyOpenNI instead, which is more powerful as it exposes useful OpenNI functionality.

The gist of this example is that PyOpenNI abstracts the depth map behind a DepthMap object and for OpenCV we need to turn that into a numpy array. As DepthMap can be iterated over, the most obvious solution is to just call numpy.asarray with the depth map; however, I’ve found that to be far too slow for practical use.

A much faster way is to use the older raw depth stream functions from PyOpenNI’s DepthGenerator class. These functions – get_raw_depth_map and get_raw_depth_map_8 – return byte strings of 16 and 8 bits per pixel, respectively. Numpy can create an array from these byte strings significantly faster than just calling numpy.asarray with the DepthMap object.

from openni import *
import numpy as np
import cv2

# Initialise OpenNI
context = Context()
context.init()

# Create a depth generator to access the depth stream
depth = DepthGenerator()
depth.create(context)
depth.set_resolution_preset(RES_VGA)
depth.fps = 30

# Start Kinect
context.start_generating_all()
context.wait_any_update_all()

# Create array from the raw depth map string
frame = np.fromstring(depth.get_raw_depth_map_8(), "uint8").reshape(480, 640)

# Render in OpenCV (waitKey is needed for the window to actually draw)
cv2.imshow("image", frame)
cv2.waitKey(0)

In this example a single frame is taken from the Kinect depth stream and turned into an OpenCV-compatible numpy array. This could be useful as a starting point for a Kinect finger tracker, because it allows OpenNI’s skeleton and hand tracking functionality to be used alongside OpenCV. For example, OpenNI could be used to track the current hand position so that OpenCV knows which region of the depth map (or RGB stream; this code is easily modified to use that instead) contains a hand. Computer vision techniques could then be used to look for fingers in that region, removing the need to segment by colour when looking for hands in a normal RGB image.
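
If you need the raw 16-bit depth values rather than the 8-bit version, the get_raw_depth_map function mentioned earlier works the same way. Here’s a minimal sketch, assuming the same context and depth generator set-up as the code above:

# Raw 16-bit depth values, using the same depth generator as above
frame16 = np.fromstring(depth.get_raw_depth_map(), dtype=np.uint16).reshape(480, 640)

# Normalise down to 8 bits if you want to display it with OpenCV
preview = (frame16 / float(frame16.max()) * 255).astype(np.uint8)
cv2.imshow("image16", preview)
cv2.waitKey(0)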

OpenKinect Python and OpenCV

I’ve spent the past day or so messing around with Kinect and OSX, trying to find a nice combination of libraries and drivers which works well – a more difficult task than you’d imagine! Along the way I’ve found that a lot of these libraries have poor or no documentation.

Here I’m sharing a little example of how I got OpenKinect and OpenCV working together in Python. The Python wrapper for OpenKinect gives depth data as a numpy array which conveniently is the datatype used in the cv2 module.

import freenect
import cv2
import numpy as np

"""
Grabs a depth map from the Kinect sensor and creates an image from it.
"""
def getDepthMap():	
	depth, timestamp = freenect.sync_get_depth()

	np.clip(depth, 0, 2**10 - 1, depth)
	depth >>= 2
	depth = depth.astype(np.uint8)

	return depth

while True:
	depth = getDepthMap()

	blur = cv2.GaussianBlur(depth, (5, 5), 0)

	cv2.imshow('image', blur)
	cv2.waitKey(10)

Here the getDepthMap function takes the depth map from the Kinect sensor, clips the array so that the maximum depth is 1023 (effectively removing distant objects and noise) and turns it into an 8-bit array (which OpenCV can render as grayscale). The array returned from getDepthMap can be used like a grayscale OpenCV image; to demonstrate, I apply a Gaussian blur. Finally, imshow renders the image in a window and waitKey is there to make sure image updates actually show.
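
The RGB stream can be grabbed in much the same way. Here’s a minimal sketch, assuming the same imports as above and freenect’s sync_get_video call (OpenCV expects BGR channel ordering, so the channels are swapped before display):

def getVideoFrame():
	# Grab an RGB frame from the Kinect and convert it to BGR for OpenCV
	rgb, timestamp = freenect.sync_get_video()
	return cv2.cvtColor(rgb, cv2.COLOR_RGB2BGR)

Calling cv2.imshow('rgb', getVideoFrame()) inside the loop then renders it just like the depth image.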

This is by no means a comprehensive guide to using freenect and OpenCV together but hopefully it’s useful to someone as a starting point!

Heuristics for smarter Leap Motion tracking

Leap Motion is quite easy to develop for but I find that hand and finger tracking can be tricky – especially for more precise gesture control. I’ve been doing some development with Leap that requires precise positioning (~mm accuracy needed) with multiple fingers. In this post I run through a few strategies I use to try and improve my Leap applications.

1) Track hand and finger identifiers

Each Finger and Hand object has a unique identifier which should remain consistent between frames, so long as the object continues to be tracked. If you can make a reasonable assumption about which finger(s) the user intends to use for interaction, you can look for a specific identifier to find the finger(s) of interest, ignoring anything else in the frame. I find that when extending their index finger for interaction, most users tend to hold their thumb out unwittingly, which Leap detects, of course.

I use a simple heuristic to try to address this issue: pick the finger with the smallest z-coordinate (furthest forward) and track its identifier in future frames. When I want to track multiple fingers for interaction, I make sure those fingers are attached to the hand whose identifier I’m tracking. This rejects other unattached fingers in the field of view, e.g. the cylindrical lights in my office at university! (Seriously…)
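
Here’s a minimal sketch of the single-finger case, assuming the Leap Motion Python SDK (frame.fingers, finger.id and finger.tip_position are standard API properties; the class name is just for illustration):

class ForwardFingerTracker:
    """Tracks the forward-most finger's identifier across frames."""

    def __init__(self):
        self.finger_id = None

    def update(self, frame):
        # If we were already tracking a finger, keep using it while it exists
        if self.finger_id is not None:
            for finger in frame.fingers:
                if finger.id == self.finger_id:
                    return finger
            self.finger_id = None  # Tracking lost; pick a new finger below

        # Otherwise pick the finger with the smallest z (furthest forward)
        best = None
        for finger in frame.fingers:
            if best is None or finger.tip_position.z < best.tip_position.z:
                best = finger
        if best is not None:
            self.finger_id = best.id
        return best

Each frame update then calls tracker.update(frame) and ignores everything else in the frame.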

2) Check finger lengths

Sometimes Leap sees your knuckles in a closed fist and reports them as fingers. The looser the grip, the worse the problem. Checking that the length of each finger is above a threshold length can help you reject misclassified knuckles and clenched fingers. A threshold of 2 cm or thereabouts should be fine; any longer and you risk rejecting small fingers or fingers which are fully extended but Leap just isn’t tracking properly.
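
In code that’s a one-line filter; a minimal sketch, again assuming the Leap Python SDK (where finger.length is reported in millimetres):

MIN_FINGER_LENGTH_MM = 20  # ~2 cm; tune this for your users

def plausible_fingers(frame):
    # Reject "fingers" that are probably knuckles or clenched fingers
    return [f for f in frame.fingers if f.length > MIN_FINGER_LENGTH_MM]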

3) Look at direction vectors for unlikely values

The biggest problem I have with Leap is that it picks up fingers in a closed fist. Looking at the debug visualiser, I tend to see fingers pointing upwards through the fist, rather than an accurate representation of a finger in a clenched fist. These “phantom” fingers tend to have extreme y values in the direction vector (> 0.6). Through observation of some of my users I’ve also noticed that when using a certain hand pose for input, “inactive” fingers are left to hang down, which Leap picks up perfectly. These fingers typically point downwards, with a strongly negative y value in the direction vector (< -0.6).

A heuristic I use to reject “phantom” fingers and fingers the user probably doesn’t want to use for input is to take the absolute y value from the direction vector and ignore the finger if this value exceeds a threshold. I use a threshold around 0.4. A limitation of this heuristic is that the user has to keep their hand quite flat. To address this, you could instead measure finger direction relative to the hand orientation and then apply the threshold.
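
A minimal sketch of that check (Leap direction vectors are unit vectors, so the y component is already in the range -1 to 1; the 0.4 threshold is the one mentioned above):

MAX_ABS_DIRECTION_Y = 0.4

def intentional_fingers(frame):
    # Ignore fingers pointing sharply up (phantoms inside a fist) or
    # sharply down (inactive fingers left to hang)
    return [f for f in frame.fingers if abs(f.direction.y) <= MAX_ABS_DIRECTION_Y]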

Summary

Sometimes Leap gives crap data and we also can’t expect our users to keep a perfectly clenched fist. A few simple checks on each frame update can help us reject fingers which either don’t exist or weren’t intended to be part of the interaction.

Creating Tactons in Pure Data

What are Tactons?

Tactons are “structured tactile messages” for communicating non-visual information. Structured patterns of vibration can be used to encode information, for example a quick buzz to tell me that I have a new email or a longer vibration to let me know that I have an incoming phone call.

Vibrotactile actuators are often used in HCI research to deliver Tactons as these provide a higher fidelity of feedback than the simple rotation motors used in mobile phones and videogame controllers. Sophisticated actuators allow us to change more vibrotactile parameters, providing more potential dimensions for Tacton design. Whereas my previous example used the duration of vibration to encode information (short = email, long = phone call), further information could also be encoded using a different vibrotactile parameter. Changing the “roughness” of the feedback could be used to indicate how important an email or phone call is, for example.

How do we create Tactons?

Now that we know what Tactons are and what they could be used for, how do we actually create them? How can we drive a vibrotactile actuator to produce different tactile sensations?

Linear and voice-coil actuators can be driven by providing a voltage but, rather than dabble in electronics, the HCI community typically uses audio signals to drive the actuator. A sine wave, for example, produces a smooth and continuous-feeling sensation. For more information on how audio signal parameters can be used to create different vibrotactile sensations, see [1], [2] and [3].

Tactons can be created statically using an audio synthesiser or a sound editing program like Audacity to generate sine waves, or can be created dynamically using Pure Data. The rest of this post is a quick summary of the key Pure Data components I use when creating vibrotactile feedback in real time. With the components discussed, the following vibrotactile parameters can be manipulated: frequency, spatial location, amplitude, “roughness” (with amplitude modulation) and duration.
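
As an aside, if you’d rather generate a Tacton statically than patch it live, here’s a minimal numpy sketch of the kind of signal discussed below: a 250 Hz sine wave “roughened” by 30 Hz amplitude modulation, written out as a WAV file (the filename and values are just examples):

import numpy as np
import wave

SAMPLE_RATE = 44100
DURATION_S = 0.3      # a 300 ms pulse
CARRIER_HZ = 250      # base vibration frequency
MODULATOR_HZ = 30     # amplitude modulation for a "rough" texture

t = np.arange(int(SAMPLE_RATE * DURATION_S)) / float(SAMPLE_RATE)
carrier = np.sin(2 * np.pi * CARRIER_HZ * t)
modulator = np.sin(2 * np.pi * MODULATOR_HZ * t)
signal = carrier * modulator  # the same effect as multiplying osc~ 250 by osc~ 30

samples = (signal * 32767).astype(np.int16)
with wave.open("tacton.wav", "wb") as f:
    f.setnchannels(1)       # one channel; use more channels to drive multiple actuators
    f.setsampwidth(2)       # 16-bit samples
    f.setframerate(SAMPLE_RATE)
    f.writeframes(samples.tobytes())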

Tactons with Pure Data components

osc~ Generates a sine-wave. First inlet or argument can be used to set the frequency of the sine-wave, e.g. osc~ 250 creates a 250 Hz signal.

dac~ Audio output. First argument specifies the number of channels and each inlet is used to send an incoming signal to that channel, e.g. dac~ 4 creates a four-channel audio output. Driving different actuators with different audio channels can allow vibration to be encoded spatially.

*~ Multiply signal. Multiplies two signals to produce a single signal. Amplitude modulation (see [2] and [3] above) can be used to create different textures by multiplying two sine waves together. Multiplying osc~ 250 with osc~ 30 creates quite a “rough” feeling texture. This can also be used to change the amplitude of a signal. Multiplying by 0 silences the signal. Multiplying by 0.5 reduces amplitude by 50%. Tactons can be turned on and off by multiplying the wave by 1 and 0, respectively.

delay Sends a bang after a delay. This can be used to provide precise timings for Tacton design. To play a 300 ms vibration, for example, an incoming bang could send 1 to the right inlet of a *~ object, enabling the Tacton. Sending that same bang to delay 300 would produce a bang 300 ms later, which could then send 0 to that same inlet, ending the Tacton.

phasor~ Creates a ramping waveform. Can be used to create sawtooth waves. This tutorial explains how this component can also be used to create square waveforms.

Multimodal Android Development Part 1

This post is the first of two which gives a brief introduction to creating multimodal interactions in Android applications. I’ll briefly cover some of the SDK features available to you as an Android developer which you can use to create richer interactions in your apps. Example code will be quite concise because I assume you have at least a basic knowledge of Android development. Feel free to leave any comments suggesting how I can better explain these concepts, or to let me know if I’ve made any mistakes or omissions.

What is “multimodal” interaction?

Multimodal interaction, put simply, is interaction involving more than one modality (e.g. multiple senses). For example, an application may provide a combination of visual and haptic (touch) feedback. These types of interaction design provide a number of benefits, for example allowing those with sensory impairment to interact using other senses, or allowing interaction in contexts where one sense may be otherwise occupied.

One of the most ubiquitous examples of a multimodal interaction is the way in which mobile phones combine visual, audible and haptic feedback to inform users of a new text, phone call, etc. This combination of modalities is particularly useful when your phone is, say, in your pocket. Obviously you can’t see the phone, but you will probably feel the phone vibrate or hear your ringtone as new notifications appear.

Haptic feedback in Android

Most handheld Android devices have some sort of rotation motor in them allowing simple haptic feedback. Although not common in tablets (largely due to size constraints), all modern Android phones will have tactile feedback available. You can control the phone vibrator through the Vibrator class. Note that in order to use this, your Manifest must request the following permission: android.permission.VIBRATE

/* Request the device's vibrator service. Remember to check
 * for null return value, in case this isn't available. */
Vibrator vibrator = (Vibrator) getSystemService(Context.VIBRATOR_SERVICE);

/* Two ways to control the vibrator:
 *  1. Turn on for a specific time
 *  2. Provide a vibration pattern */

/* 1. Vibrate for 200ms */
vibrator.vibrate(200);

/* 2. Vibrate for 200ms, pause for 100ms, vibrate for 300ms. */
long[] pattern = new long[] {0, 200, 100, 300};

/* Perform this pattern once only (repeat := -1). */
vibrator.vibrate(pattern, -1);

/* Vibrate for 200ms, followed by indefinite repeat of
 * 100ms pause followed by 300ms vibrate. Setting
 * repeat := 2 tells the vibrator to repeat at offset
 * 2 into the vibration pattern. */
vibrator.vibrate(pattern, 2);

Touchscreen gestures

Using touchscreen gestures to interact with applications can be fun, efficient and useful when users may be unable to select a particular action on the screen. For example, it can be difficult to select a button on-screen when running or walking. A touch gesture, however, is a lot easier and requires less precision from the user. The disadvantage with touch gestures is that if not used sparingly, there may be too much for the user to remember!

Creating a set of gestures for your application is simple: create a gesture library on an Android Virtual Device using the Gesture Builder application (available on the AVD by default) and add a GestureOverlayView to your activity layout. In your activity, you just have to load the gesture library from your resources and implement an OnGesturePerformedListener.

private GestureLibrary mLibrary;

public void onCreate(Bundle savedInstanceState) {
  ...
  /* 1. Load gesture library from the res/raw/gestures file */
  mLibrary = GestureLibraries.fromRawResource(this, R.raw.gestures);

  if (!mLibrary.load())
    /* Error: unable to load from resources! */
    ...

  /* 2. Find reference to the gesture overlay view */
  GestureOverlayView gov = (GestureOverlayView) findViewById(R.id.gestureOverlay);

  /* 3. Register callback for gesture input */
  gov.addOnGesturePerformedListener(this);
}

The callback method for gesture performance receives a Gesture as an argument. This can be used to obtain a list of predictions: the gestures in your library that Android thinks the input might be. With these predictions, you can use the prediction score (or contextual information) to determine which gesture the user was most likely to have performed. I find it useful to define a threshold for gesture acceptance, so that you can reject erroneous or inaccurate gestures. The best way to choose this threshold value is through trial and error: see what works for you and your gestures.

private static final double ACCEPTANCE_THRESHOLD = 10.0;

public void onGesturePerformed(GestureOverlayView overlay, Gesture gesture) {
  /* 1. Get list of gesture predictions */
  ArrayList<Prediction> predictions = mLibrary.recognize(gesture);

  if (predictions.size() > 0) {
    /* 2. Find highest scoring prediction */
    Prediction bestPrediction = predictions.get(0);

    for (int i = 1; i < predictions.size(); i++) {
      Prediction p = predictions.get(i);
      if (p.score > bestPrediction.score)
        bestPrediction = p;
    }

    /* 3. Decide if we'll accept this gesture */
    if (bestPrediction.score > ACCEPTANCE_THRESHOLD)
      gestureAccepted(bestPrediction.name);
  }
}

private void gestureAccepted(String gestureName) {
  /* Respond appropriately to the gesture name */
  ...
}

Saving map images in Android

Recently I’ve been working on a little Android project and wanted to save thumbnail images of a map within the application. This post is just sharing how to do exactly that. Nothing too complicated. 

public class MyMapActivity extends MapActivity {
    private MapView mapView;

    ...

    private Bitmap getMapImage() {
        /* Position map for output */
        MapController mc = mapView.getController();
        mc.setCenter(SOME_POINT);
        mc.setZoom(16);

        /* Capture drawing cache as bitmap */
        mapView.setDrawingCacheEnabled(true);
        Bitmap bmp = Bitmap.createBitmap(mapView.getDrawingCache());
        mapView.setDrawingCacheEnabled(false);

        return bmp;
    }

    private void saveMapImage() throws IOException {
        String filename = "foo.png";
        File f = new File(getExternalFilesDir(null), filename);
        FileOutputStream out = new FileOutputStream(f);

        Bitmap bmp = getMapImage();

        bmp.compress(Bitmap.CompressFormat.PNG, 100, out);

        out.close();
    }
}

In the getMapImage method, we’re telling the map controller to move to a particular point (this may not matter to you; you may just want to capture the map as it appears) and zooming in to show a sufficient level of detail. Then a Bitmap is created from the map view’s drawing cache. The saveMapImage method is just an example of how you may want to save an image to the application’s external file directory.

Left-recursion in Parsec

Lately I’ve been using the Parsec library for Haskell to write a parser and interpreter for a university assignment. Right-recursive grammars are trivial to parse with combinator parsers; tail recursion and backtracking make this simple. However, implementing a left-recursive grammar will often result in an infinite loop, as is the case in Parsec when using the basic parsers.

Parsec does support left-recursion however. Unsatisfied with the lack of good tutorials when I googled for advice, I decided to write this. Hopefully it helps someone. If I can make this better or easier to understand, please let me know!

Left recursive parsing can be achieved in Parsec using chainl1.

chainl1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a -> a -> a) -> ParsecT s u m a

As an example of how to use chainl1, I’ll demonstrate its use in parsing basic integer addition and subtraction expressions.

First we’ll need an abstract syntax tree to represent an integer expression. This can represent a single integer constant, or an addition / subtraction operation which involves two integer expressions.

data IntExp = Const Int
            | Add IntExp IntExp
            | Sub IntExp IntExp

If addition and subtraction were to be right-associative, we’d parse the left operand as a constant, and attempt to parse the right operand as another integer expression. Upon failing, we’d backtrack and instead attempt to parse an integer constant. Reversing this approach to make the expressions left-associative would cause infinite recursion; we’d attempt to parse the left operand as an integer expression, which attempts to parse the left operand as an integer expression, which tries to… you get the point.

Instead we use chainl1 with two parsers; one to parse an integer constant, and another which parses a symbol and determines if the expression is an addition or subtraction.

parseIntExp :: Parser IntExp
parseIntExp =
  chainl1 parseConstant parseOperation

parseOperation :: Parser (IntExp -> IntExp -> IntExp)
parseOperation =
  do spaces
     symbol <- char '+' <|> char '-'
     spaces
     case symbol of
       '+' -> return Add
       '-' -> return Sub

parseConstant :: Parser IntExp
parseConstant =
  do xs <- many1 digit
     return $ Const (read xs :: Int)

Here, parseOperation returns either the Add or Sub constructor of IntExp. Using GHCi, you can confirm the type of Add as:

Add :: IntExp -> IntExp -> IntExp

So, we have a parser which will parse a constant and a parser which will parse a symbol and determine what type of operation an expression is. In parseIntExp, chainl1 is the glue which brings these together. This is what allows left-associative parsing without infinitely recursing.

A complete code sample is available here. The abstract syntax tree has been created as an instance of the Show typeclass to print in a more readable format, which shows that the grammar is indeed left-associative.

ghci>  run parseIntExp "2 + 3 - 4"
((2+3)-4)

Finding maximum zero submatrix with C#

After spending most of this evening attempting to solve this problem, it only feels right that I share my solution.

Suppose you have a binary matrix and you wish to find the largest zero submatrix, i.e. the largest rectangle of zeroes in the matrix (see below, highlighted in orange).

1 1 0 0 0 1 0
1 0 0 0 0 1 1
1 1 0 1 0 1 1
1 1 1 0 1 1 0

The brute-force approach to this problem isn’t particularly efficient, with complexity O(n²m²). It involves looking at every position in the matrix, taking that position as an arbitrary point of the submatrix (e.g. the top-left point) and then testing all possible rectangles which originate at that point (e.g. all possible bottom-right points).

This algorithm finds the largest zero submatrix by looking at each position in turn and attempting to “grow” the submatrix up, to the left, and to the right. A dynamic programming approach is used and it keeps track of where the value 1 occurs. Array d contains the previous row where a 1 was found for each column. Array d1 contains the position of the left borders, and d2 contains the position of the right borders.

A complete code snippet for this program is available here. This has code to output the results and shows how to make use of the values calculated in this function.

static void MaxSubmatrix(int[,] matrix)
{
 int n = matrix.GetLength(0); // Number of rows
 int m = matrix.GetLength(1); // Number of columns

 int maxArea = -1, tempArea = -1;

 // Top-left corner (x1, y1); bottom-right corner (x2, y2)
 int x1 = 0, y1 = 0, x2 = 0, y2 = 0;

 // Maximum row containing a 1 in this column
 int[] d = new int[m];

 // Initialize array to -1
 for (int i = 0; i < m; i++)
 {
  d[i] = -1;
 }

 // Furthest left column for rectangle
 int[] d1 = new int[m];

 // Furthest right column for rectangle
 int[] d2 = new int[m];

 Stack<int> stack = new Stack<int>();

 // Work down from top row, searching for largest rectangle
 for (int i = 0; i < n; i++)
 {
  // 1. Determine previous row to contain a '1'
  for (int j = 0; j < m; j++)
  {
   if (matrix[i,j] == 1)
   {
    d[j] = i;
   }
  }

  stack.Clear();

  // 2. Determine the left border positions
  for (int j = 0; j < m; j++)
  {
   while (stack.Count > 0 && d[stack.Peek()] <= d[j])
   {
    stack.Pop();
   }

   // If stack is empty, use -1; i.e. all the way to the left
   d1[j] = (stack.Count == 0) ? -1 : stack.Peek();

   stack.Push(j);
  }

  stack.Clear();

  // 3. Determine the right border positions
  for (int j = m - 1; j >= 0; j--)
  {
   while (stack.Count > 0 && d[stack.Peek()] <= d[j])
   {
    stack.Pop();
   }

   d2[j] = (stack.Count == 0) ?  m : stack.Peek();

   stack.Push(j);
  }

  // 4. See if we've found a new maximum submatrix
  for (int j = 0; j < m; j++)
  {
   // (i - d[j]) := current row - last row in this column to contain a 1
   // (d2[j] - d1[j] - 1) := right border - left border - 1
   tempArea = (i - d[j]) * (d2[j] - d1[j] - 1);

   if (tempArea > maxArea)
   {
    maxArea = tempArea;

    // Top left
    x1 = d1[j] + 1;
    y1 = d[j] + 1;

    // Bottom right
    x2 = d2[j] - 1;
    y2 = i;
   }
  }
 }
}