I am working on a small Android application that streams frames from the camera (as a series of JPEGs) to my computer. Without any processing, the frame buffer receives camera preview images at about 18 frames per second. When I add
YuvImage yuv = new YuvImage(data, ImageFormat.NV21, dimensions.width, dimensions.height, null);
yuv.compressToJpeg(new Rect(0, 0, dimensions.width, dimensions.height), 40, out);
the frame rate drops to about 7 frames per second. So I thought that I would write my own JPEG encoder in C and speed things up a bit. Well, I was surprised. Now I get 0.4 frames per second!
So now I need to profile and optimize my C code, but I don't know where to start. I use these GCC flags:
-Wall -std=c99 -ffast-math -O3 -funroll-loops
Is there anything that I can improve there?
Other than that, my JPEG encoder is just a straightforward implementation. It writes the header information, writes the quantization and Huffman tables, and then entropy-encodes the data. The DCT uses the AA&N (Arai, Agui, and Nakajima) method, which I believe is the fastest way to do this.
Perhaps the problem is in the JNI calls?
I allocate memory in Java using:
frame_buffer = ByteBuffer.allocate(raw_preview_buffer_size).array();
jpeg_buffer = ByteBuffer.allocate(10000000).array();
and then passing it to native code like this (forgive the spaghetti for now):
void Java_com_nechtan_limelight_activities_CameraPreview_handleFrame(JNIEnv* env, jobject this, jbyteArray nv21data, jbyteArray jpeg_buffer) {
    jboolean isCopyNV21;
    jboolean isCopyJPEG;
    int jpeg_size = 0;
    jbyte* nv21databytes = (*env)->GetByteArrayElements(env, nv21data, &isCopyNV21);
    jbyte* jpeg_buffer_bytes = (*env)->GetByteArrayElements(env, jpeg_buffer, &isCopyJPEG);
    /* Encode only when both arrays were obtained successfully. */
    if (nv21databytes != NULL && jpeg_buffer_bytes != NULL) {
        jpeg_size = compressToJpeg((UCHAR*) nv21databytes, (UCHAR*) jpeg_buffer_bytes, 640, 480);
    }
    if (jpeg_buffer_bytes != NULL) {
        /* Mode 0: copy the contents back (if a copy was made) and free the buffer. */
        (*env)->ReleaseByteArrayElements(env, jpeg_buffer, jpeg_buffer_bytes, 0);
    }
    else {
        __android_log_print(ANDROID_LOG_DEBUG, DEBUG_TAG, "JPEG data null!");
    }
    if (nv21databytes != NULL) {
        /* JNI_ABORT: the input is read-only, so free it without copying back. */
        (*env)->ReleaseByteArrayElements(env, nv21data, nv21databytes, JNI_ABORT);
    }
    else {
        __android_log_print(ANDROID_LOG_DEBUG, DEBUG_TAG, "NV21 data null!");
    }
}
Am I doing something inefficient here? What is a good way to profile JNI code?
Beyond these things, the only thing I can think of is that I will have to read up on NEON and vectorize this stuff. Ugh ...
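On the profiling question, one low-tech starting point would be to time the JNI array access and the encoder separately, to see whether the time goes into the copies or into compressToJpeg itself. Below is a minimal sketch of such a timing helper. The names now_ns, stage_timer, stage_add and stage_avg_ms are hypothetical and are not part of my encoder or of the NDK. The only library call it relies on is the POSIX clock_gettime with CLOCK_MONOTONIC.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* needed for clock_gettime under -std=c99 */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic timestamp in nanoseconds. */
static int64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t) ts.tv_sec * 1000000000LL + (int64_t) ts.tv_nsec;
}

/* Hypothetical accumulator for one stage (e.g. "get arrays" or "encode"). */
typedef struct {
    int64_t total_ns;  /* sum of all measured intervals */
    int64_t calls;     /* number of intervals recorded */
} stage_timer;

/* Record one measured interval for a stage. */
static void stage_add(stage_timer* t, int64_t start_ns, int64_t end_ns) {
    t->total_ns += end_ns - start_ns;
    t->calls += 1;
}

/* Average time per call in milliseconds (0.0 if nothing was recorded). */
static double stage_avg_ms(const stage_timer* t) {
    if (t->calls == 0) {
        return 0.0;
    }
    return (double) t->total_ns / (double) t->calls / 1e6;
}
```

In handleFrame, the idea would be to call now_ns() before GetByteArrayElements, before and after compressToJpeg, and after the release calls, feed each pair into its own stage_timer, and log the averages with __android_log_print every few dozen frames. This only gives coarse per-stage numbers, not a per-function profile, but it should be enough to tell whether the JNI boundary or the encoder is the bottleneck.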