Image Categorization using Object Detection in TensorFlow.js

I have often wondered whether I could build a tool that accepts custom queries to filter photos and images based on the items present in them. Popular apps such as Google Photos and OneDrive provide this sort of functionality to varying degrees, but they don’t necessarily let the user enter custom queries to filter images.

In this article, we will look at using the pre-trained COCO-SSD object detection model in TensorFlow.js to categorize and filter images.

The best thing about using TensorFlow.js is that such apps run completely on the client side, so the user’s data never leaves their device.

1. Overview

We will build an app that lets the user browse and select a set of images and then filter them by entering a custom query: a comma-separated list of words naming the objects the images should contain.

In this specific example, since we are using the pre-trained COCO-SSD model, the app can only identify the 80 object classes that the model was trained on. If the user enters more than one word separated by commas, the app treats the filter criteria as an OR condition and matches an image if any of those objects are present in it.

The source code for this app can be found in this GitHub repo.

Object Detection demo

2. Run the demo

This simple JavaScript app is built using the npm package manager. Clone the GitHub repository to your local machine and run the following commands.

npm install - installs all the dependencies
npm run dev - invokes the Parcel bundler to build the app and open it in a browser window.

3. Implementation

This app consists of three files – index.html, index.js and style.css. Let’s look at each of these files in detail.

3.1 Build the HTML file

Create a new file named index.html. This file contains the following elements (a minimal markup sketch follows the list):

  • a file input element – for the user to browse and select a list of images
  • an input box – for the user to enter the filter criteria
  • a button – for the user to click to trigger image filtering / categorization
  • a placeholder table – to display the images selected by the user
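The element IDs used in the sketch (fileInput, filterCriteria, filterImages, imagesTable) are assumptions for illustration; they are reused in the JavaScript sketches that follow, but the actual markup in the repo may differ.

<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="style.css">
  </head>
  <body>
    <!-- file input for the user to browse and select images -->
    <input type="file" id="fileInput" multiple accept="image/*">

    <!-- input box for the comma separated filter criteria -->
    <input type="text" id="filterCriteria" placeholder="e.g. person, dog, car">

    <!-- button to trigger image filtering / categorization -->
    <button id="filterImages" disabled>Filter Images</button>

    <!-- placeholder table to display the selected images -->
    <table id="imagesTable"></table>

    <script src="index.js"></script>
  </body>
</html>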

3.2 Build the JavaScript file

Create a new file named index.js.

3.2.1 Load the required libraries

Import the TensorFlow.js and COCO-SSD packages.
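For instance, using the standard npm package names for these libraries, the imports could look like this minimal sketch:

// TensorFlow.js core library
import * as tf from "@tensorflow/tfjs";
// Pre-trained COCO-SSD object detection model
import * as cocoSsd from "@tensorflow-models/coco-ssd";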

3.2.2 Declare the variables

Declare variables to hold references to the HTML elements.
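A sketch of these declarations is shown below; the element IDs match the ones assumed in the HTML sketch above, and the filterCriteria and filterImages names line up with the snippets later in this section.

// References to the HTML elements (IDs are illustrative assumptions)
const fileInput = document.getElementById("fileInput");
const filterCriteria = document.getElementById("filterCriteria");
const filterImages = document.getElementById("filterImages");
const imagesTable = document.getElementById("imagesTable");

// Will hold the COCO-SSD model once it is loaded
let model;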

3.2.3 Load the model

Load the COCO-SSD model using the cocoSsd.load() method, which returns a promise that resolves to the model.
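For example (a minimal sketch that assigns the resolved model to the model variable declared above):

// cocoSsd.load() returns a promise that resolves to the loaded model
cocoSsd.load().then(loadedModel => {
    model = loadedModel;
    console.log("COCO-SSD model loaded");
});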

3.2.4 Add a callback for the onchange event of the file input

When a user browses and selects a list of files, we want to display a preview of those files in the browser using the table placeholder that we defined in the HTML file.

In the displayImages() function, we loop through the list of files selected by the user, create a new image element for each file, and append it to a dynamically created table cell. Once the images are displayed on the screen, we enable the filterImages button.

The clearAllRows() function is invoked prior to displayImages() to clear any images that were selected earlier.
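A sketch of this callback and the two helper functions is shown below, assuming the element references declared earlier. It gives each preview image the imgToFilter class and an img_<index> id, which the filtering code in the next section relies on; the actual implementation in the repo may differ in details.

fileInput.onchange = () => {
    //Clear any previously displayed images before showing the new selection
    clearAllRows();
    //Display a preview of the newly selected files
    displayImages(fileInput.files);
};

function clearAllRows() {
    //Remove all rows from the placeholder table
    imagesTable.innerHTML = "";
}

function displayImages(files) {
    const row = imagesTable.insertRow();
    Array.prototype.forEach.call(files, (file, index) => {
        //Create an image element per selected file
        const img = document.createElement("img");
        img.src = URL.createObjectURL(file);
        img.id = "img_" + index;
        img.classList.add("imgToFilter");
        //Append the image to a dynamically created table cell
        row.insertCell().appendChild(img);
    });
    //Enable the filterImages button once the images are displayed
    filterImages.disabled = false;
}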

3.2.5 Filter Images

Create an async function as the callback for the onclick event of the filterImages button. This is invoked when the user clicks the button after entering the filter criteria.

First, we retrieve the filter criteria entered by the user and split the string on commas to get an array of words. Then we get all the image elements that were created to display the user-selected images, using the getElementsByClassName() function.

filterImages.onclick = async () => {

    //Get the filter criteria entered by the user and split them by comma
    const filterConditionArr = filterCriteria.value.split(",").map(item => item.trim());
    console.log("Filter Condition: ", filterConditionArr);

    //Get all the image elements that were created to display the selected images 
    let imgElements = document.getElementsByClassName("imgToFilter");
 
    ...
}

Next, let’s loop through all the image elements using the map() method and invoke model.detect() on each of them to detect the objects present in the image. Since model.detect() is asynchronous, we collect the promises and wait for all of them to resolve.

filterImages.onclick = async () => {
    ...
 
    //Check if the COCO-SSD model is loaded
    if(model) {
        //Invoke model.detect on each image using an async function and collect the array of Promises returned
        const promises = Array.prototype.map.call(imgElements, async img => {
            const predictions = await model.detect(img);
            return predictions;
        });

        //Get the output array of predictions for all images after waiting for all Promises to fulfill.
        const predictionsArr = await Promise.all(promises);
        console.log("Predictions Array: ", predictionsArr);

        ...
 
    } else {
        console.log("Model is not loaded yet");
    }
}

Once we have the predictions for all the images, let’s loop through the predictions for each image and check whether any of the words entered as filter criteria match the object classes detected by the model.

Highlight those images that match the filter criteria and dim the brightness for the rest of the images.

filterImages.onclick = async () => {
    ...
 
    if(model) {
        ...

        //Loop through the array of predictions (predictions for all images)
        predictionsArr.forEach(function(predictions, index) {
            let matched = false;
            //Loop through the predictions array for a single image
            for(let prediction of predictions) {
                //If any of the filter criteria matches with a detected object class, set matched to true                
                if(filterConditionArr.includes(prediction.class)) {
                    matched = true;
                    break;
                }
            }
 
            if(matched) {
                //Highlight the image if there is a match between the filter criteria and the objects present in the image
                console.log("img_" + index + " matched");
                document.getElementById("img_" + index).classList.add("saturate");
                document.getElementById("img_" + index).classList.remove("dim");
            } else {
                //Shade out the images, if the filter criteria didn't match any of the objects present in the image
                console.log("img_" + index + " did not match");
                document.getElementById("img_" + index).classList.add("dim");
                document.getElementById("img_" + index).classList.remove("saturate");
            }
        });
    } else {
        console.log("Model is not loaded yet");
    }
}

3.3 Build the Stylesheet

The stylesheet for this app is a simple one. Create a new file named style.css and include it in the HTML file.
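The filtering code toggles the saturate and dim classes on the images, so style.css needs rules for those two classes. One possible sketch (the specific filter values are assumptions):

/* Highlight images that match the filter criteria */
.saturate {
    filter: saturate(150%);
}

/* Dim images that do not match the filter criteria */
.dim {
    filter: brightness(40%);
}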

That’s it for this app. This is just one simple application of the object detection use case. It could be utilized in multiple ways – filtering images, categorizing images, producing metadata for images, etc.

4. References

  1. GitHub repo of the COCO-SSD model.
