Speed-up bitmap operations on images. #5856

@darth-vader-lg

While working on my object detection applications, I found something that could be improved.
I discovered that Microsoft.ML.ImageAnalytics uses the GetPixel and SetPixel functions to access the images' bitmap data.
It's well known that these functions are very slow compared to raw access to the image's data buffer (up to 10 times slower).
The difference is especially noticeable in object recognition, where a huge number of images and frames must be processed.

My proposal is to implement raw access to the bitmap data to speed up all of these operations.
I'm going to submit a PR ASAP with the needed changes.
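
To illustrate what raw access means here, the following is a minimal sketch and not the code of the planned PR. It assumes a System.Drawing `Bitmap` and reads the pixels through `LockBits` instead of calling `GetPixel` for each pixel. The class name `RawPixelAccess` and the method `ExtractRgb` are hypothetical names used only for this example.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

public static class RawPixelAccess
{
    // Hypothetical helper (not part of ML.NET): returns the pixels of a bitmap
    // as an interleaved R, G, B byte array, reading the buffer row by row.
    public static byte[] ExtractRgb(Bitmap bitmap)
    {
        var rect = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
        // Lock the whole image as 24bpp; in memory each pixel is stored as B, G, R.
        BitmapData data = bitmap.LockBits(rect, ImageLockMode.ReadOnly, PixelFormat.Format24bppRgb);
        try
        {
            int rowBytes = bitmap.Width * 3;
            var row = new byte[rowBytes];
            var result = new byte[rowBytes * bitmap.Height];
            for (int y = 0; y < bitmap.Height; y++)
            {
                // Stride is the distance in bytes between rows; it may include padding.
                IntPtr rowStart = IntPtr.Add(data.Scan0, y * data.Stride);
                Marshal.Copy(rowStart, row, 0, rowBytes);
                int offset = y * rowBytes;
                for (int x = 0; x < rowBytes; x += 3)
                {
                    result[offset + x] = row[x + 2];     // R
                    result[offset + x + 1] = row[x + 1]; // G
                    result[offset + x + 2] = row[x];     // B
                }
            }
            return result;
        }
        finally
        {
            // Always release the lock, even if copying fails.
            bitmap.UnlockBits(data);
        }
    }
}
```

The point of the sketch is that the bitmap is locked once and each row is copied in a single call, whereas GetPixel pays the per-call overhead for every pixel.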

I prepared a test to check whether it really speeds up the process and, yes, it did.

Here is the code of the test:

[TensorFlowFact]
public void TensorFlowTransformObjectDetectionLoopTest()
{
    // Saved model
    var modelLocation = @"D:\ObjectDetection\carp\TensorFlow\exported-model-SSD-MobileNET-v2-320x320\saved_model";
    // Create the estimators pipe
    var pipe = 
        _mlContext.Transforms.LoadImages(
            inputColumnName: "ImagePath",
            outputColumnName: "Image",
            imageFolder: "")
        .Append(_mlContext.Transforms.ResizeImages(
            inputColumnName: "Image",
            outputColumnName: "ResizedImage",
            imageWidth: 300,
            imageHeight: 300,
            resizing: ImageResizingEstimator.ResizingKind.Fill))
        .Append(_mlContext.Transforms.ExtractPixels(
            inputColumnName: "ResizedImage",
            outputColumnName: "serving_default_input_tensor:0",
            interleavePixelColors: true,
            outputAsFloatArray: false))
        .Append(_mlContext.Model.LoadTensorFlowModel(modelLocation).ScoreTensorFlowModel(
            inputColumnNames: new[] { "serving_default_input_tensor:0" },
            outputColumnNames: new[]
            {
                "StatefulPartitionedCall:1" /* detection_boxes */,
                "StatefulPartitionedCall:2" /* detection_classes */,
                "StatefulPartitionedCall:4" /* detection_scores */
            }));

    // Collect the paths of all the images in the test directory
    var imagesLocation = @"D:\ObjectDetection\carp\TensorFlow\images\test";
    var images =
        Directory.GetFiles(imagesLocation).Where(file => new[] { ".jpg", ".jfif" }
        .Any(ext => Path.GetExtension(file).ToLower() == ext))
        .Select(file => new { ImagePath = file })
        .ToArray();

    // Create the transformer (fitting on an empty data set, since nothing needs to be trained)
    var data = _mlContext.Data.LoadFromEnumerable(images.Take(0));
    var model = pipe.Fit(data);

    // Run the inference 1000 times, cycling through the collected images
    for (int i = 0, nImage = 0; i < 1000; i++, nImage = (nImage + 1) % images.Length)
        model.Transform(_mlContext.Data.LoadFromEnumerable(new[] { images[nImage] })).Preview();
}

Without optimization (current)

[Screenshot: WithoutOptimization]

With raw access optimization

[Screenshot: WithRawAccessOptimization]

So the speed-up ratio on a set of small images is about 182%.
On larger images the gain could be even greater.
