Image Feature Extraction - Color Moment Indexing
This blog post covers the current state of an image processing project I've been working on, with a focus on content-based image retrieval.
Project Goals:
As with most of the projects I post here, the goal is as much about building tools that perform a desired function as it is about hands-on learning. So for this project I will be employing both conventional methods, such as color moment indexing, and more sophisticated methods that use neural network architectures, such as VGG16.
The main goal of this project, in terms of desired functionality, is to build a toolkit that processes, analyzes and visualizes images in ways that allow me to label and cluster them based on their content and to understand trends in the image datasets from a temporal perspective. For example, how has the content of the images changed over time? I would like to use these tools for analyzing the many photos I take for photography-based projects, as well as for analyzing satellite imagery, since that is an area of geospatial research I've been exploring more as of late.
As of now, the tools I've made for this project consist of the following parts:
Project setup and data tracking
Image pre-processing
Image similarity analysis and clustering
Data visualization
We can begin by going over the project setup and data tracking tools. These are some simple scripts written to automate project setup and help track data. Since images will be copied, resized and shuttled to different locations, it's important to have a well-structured system for keeping track of images and any data associated with them.
setup.py
The setup.py script, as the name suggests, is used to set up the workspace by creating any folders and subfolders needed. A minimal sketch of the script follows the folder descriptions below.
Subfolders:
manifests
The manifests folder will store files that contain information about the images processed. These manifests are used to quickly recall image metadata without needing to reprocess an image, as well as the results of any statistical analysis performed on an image, which are used in the indexing, similarity measurements and clustering.
images
The images folder will contain all the reformatted images we want to process. As we will see later, images are copied from their original location, resized and saved to this folder. This is done to avoid potentially corrupting or damaging the original image should something happen during processing, and to reduce the data processing load by working with smaller images.
images_grouped
After images have been processed, clustered and labeled, they are shuttled to folders based on the label they've been assigned by the k-means clustering algorithm.
paths.json
The paths.json file contains the paths to each of the newly created folders and is used as a reference file when declaring paths in other scripts.
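Below is a minimal sketch of what a script like this can look like. The workspace root and the exact folder names are assumptions based on the descriptions above, not necessarily the script's actual values.

```python
import json
from pathlib import Path

# Assumed workspace root; adjust to your own project location.
ROOT = Path("workspace")

# Folder names mirror the subfolders described above.
paths_ = {}
for name in ("manifests", "images", "images_grouped"):
    folder = ROOT / name
    folder.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    paths_[name] = str(folder)

# Write the folder paths to paths.json so other scripts can reference them.
with open(ROOT / "paths.json", "w") as f:
    json.dump(paths_, f, indent=4)
```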
shuttleImages.py
With the shuttleImages.py script, you simply provide the directories of whatever images you want to process by adding the directory paths to the "dirs_" list.
The script will cycle through the list, navigating to each directory and reading over each file it contains.
The script will check if the file is an image. If it is, metadata such as the file's original name, size and date modified will be written to the "imageStats_ref.json" file.
The image will then be resized according to the maximum dimensions provided by the user and shuttled to the "images" folder.
For the sake of consistency, the image is renamed so that all image names share the same format and can easily be parsed later for other purposes.
The "imageStats_ref.json" file, contains both the new and original name of the image, so we can use this file whenever we need to get information about the original image. Additionally, since this reference file will be used later when we recall the date/time the original image was taken.
imageStats.py
The imageStats.py script reads over each image in the given directory and calculates the color moments of the image. Color moments are a scale- and rotation-invariant way of describing the color distribution in an image. For an RGB image, color moments are calculated for each of the three color channels. Their scale and rotation invariance means the size and rotation of an image do not matter when calculating the similarity between images; instead, the distributions of colors in the images are compared.
The color moments calculated by the imageStats.py script are as follows (their formulas are given after the list):
First order - Mean
The average color value in the image, calculated by summing all the values in a given color channel, then dividing that sum by the total number of pixels in the image.
Second order - Variance
The variance measures the spread of colors in the image and provides an idea of how much the distribution of colors deviates from the mean.
Third order - Skewness
Skewness measures the asymmetry of the distribution of values in a given color channel.
Fourth order - Kurtosis
Kurtosis measures the weight of the "tails" of the distribution. Kurtosis provides a more detailed understanding of the shape of the distribution, such as whether the distribution is more peaked or more flat.
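For a single color channel with N pixel values p_i, these moments take their standard forms:

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} p_i
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (p_i - \mu)^2
\qquad
\text{skew} = \frac{1}{N\sigma^3}\sum_{i=1}^{N} (p_i - \mu)^3
\qquad
\text{kurt} = \frac{1}{N\sigma^4}\sum_{i=1}^{N} (p_i - \mu)^4
```

(Kurtosis is sometimes reported as excess kurtosis, i.e. the last quantity minus 3.)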
Color moments are calculated using the calc_color_moments function, which takes an image as its input and returns a tuple of color moments for each color channel.
The code
The code of the imageStats.py script can be broken up into five main parts.
Part one - Define Functions
First we have the calc_color_moments function which, as mentioned before, is responsible for calculating the color moments of an input image.
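As a rough sketch of what this function can look like, assuming the image arrives as a NumPy array of shape (height, width, 3) and using scipy.stats for the higher moments:

```python
import numpy as np
from scipy.stats import skew, kurtosis

def calc_color_moments(img):
    """Return a (mean, variance, skewness, kurtosis) tuple per RGB channel."""
    moments = []
    for ch in range(3):  # R, G, B
        vals = img[:, :, ch].ravel().astype(float)
        moments.append((
            vals.mean(),
            vals.var(),
            skew(vals),
            kurtosis(vals),  # note: scipy returns excess kurtosis by default
        ))
    return tuple(moments)
```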
Next we have the calc_midpoint function, which is used when it comes time to generate a single point that will represent the image in 3D space. Since we are calculating color moments for each of the three color channels (red, green and blue), we will have three sets of data, each with three color moments. The color moment values are treated as X, Y, Z coordinates and used to generate a point in 3D space representing the color moments of their respective channel. To generate a single point that represents all three color channels, we calculate the midpoint (centroid) of the three points. This function takes the coordinates of the three points as its input and returns a new point as an (X, Y, Z) tuple.
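Since the midpoint of three points is just the average of their coordinates, a sketch of this function can be very small; here each input point is assumed to be an (x, y, z) tuple:

```python
def calc_midpoint(pt_r, pt_g, pt_b):
    """Average the three channel points into one representative 3D point."""
    return tuple((a + b + c) / 3.0 for a, b, c in zip(pt_r, pt_g, pt_b))
```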
Finally, the normalize_val function is used to normalize an input value between zero and one. Since color moment values can have significantly different ranges, we will want to normalize the data before treating the color moments as coordinates; that way we can plot them all in the same 3D space.
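This is standard min-max normalization; a sketch, with a guard added for the degenerate case where all values are equal:

```python
def normalize_val(val, v_min, v_max):
    """Min-max normalize val into the range [0, 1]."""
    if v_max == v_min:  # avoid dividing by a zero-width range
        return 0.0
    return (val - v_min) / (v_max - v_min)
```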
Part two - Calculate Color Moments
In this block of code, the script reads each image in the given image directory. Since the images in this directory have been resized and saved as new files, their created and modified dates differ from those of the originals. Because I want to keep the dates of the original image associated with the new, reduced-size copy, we can reference the "imageStats_ref.json" file to grab the dates of the original image. Next, the image is passed to the calc_color_moments function and the results are parsed. A new item with a unique image id is added to the imageStats_ dictionary, where the color moment calculations are added under their respective channels. Notice that there are additional color moment values in the newly made dictionary item with the "_norm" designation. These values are set to "None" by default, but will be replaced with the normalized value for each color moment later. In order to normalize the data, we need to know the data ranges, so we also append the color moments to their respective lists in the "poolData_" dictionary.
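The sketch below ties the earlier pieces together; the dictionary layout, key names like "r_mean" and "r_mean_norm", and the file locations are inferred from the description above rather than taken from the actual script.

```python
import json
import numpy as np
from pathlib import Path
from PIL import Image

# Reference file written by shuttleImages.py (see the earlier sketch).
with open("workspace/manifests/imageStats_ref.json") as f:
    imageStats_ref = json.load(f)

moment_names = ("mean", "var", "skew", "kurt")  # assumed key names
imageStats_ = {}
poolData_ = {f"{ch}_{m}": [] for ch in "rgb" for m in moment_names}

for file in sorted(Path("workspace/images").iterdir()):
    img = np.asarray(Image.open(file).convert("RGB"))
    moments = calc_color_moments(img)  # defined in part one
    entry = {
        "name": file.name,
        # carry over the original file's date from the reference file
        "date": imageStats_ref[file.name]["date_modified"],
    }
    for ch, chan_moments in zip("rgb", moments):
        for m, val in zip(moment_names, chan_moments):
            entry[f"{ch}_{m}"] = val
            entry[f"{ch}_{m}_norm"] = None      # filled in during part four
            poolData_[f"{ch}_{m}"].append(val)  # pooled for the range calculation
    imageStats_[file.stem] = entry              # file stem as the unique image id
```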
Part three - Calculate Data Ranges
As mentioned previously, we will need to normalize the color moment data prior to using the data as coordinates for plotting points in 3D space. In order to normalize the data between zero and one, we will need to get the data ranges for each color moment by calculating the minimum and maximum values seen across all the images processed. To do this, we can simply use the built-in min and max functions on the lists contained within the poolData_ dictionary.
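Continuing the sketch from part two, the ranges reduce to one dictionary comprehension:

```python
# Minimum and maximum of each color moment across all processed images.
ranges_ = {key: (min(vals), max(vals)) for key, vals in poolData_.items()}
```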
Part four - Normalize Data
Now that we've calculated the color moments and their respective data ranges, we can normalize the data. To do this, we iterate over the imageStats_ dictionary and pass the color moment values into the normalize_val function along with the appropriate minimum and maximum values. The function returns a float between zero and one, which then replaces the default None value of the appropriate normalized color moment attribute.
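In sketch form, continuing from the snippets above:

```python
for entry in imageStats_.values():
    for key, (lo, hi) in ranges_.items():
        # Replace the None placeholder with the normalized value.
        entry[f"{key}_norm"] = normalize_val(entry[key], lo, hi)
```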
Part five - Write Data
Finally, we can write the results to a JSON file.
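For example, assuming the manifests folder from setup.py and an output name of imageStats.json:

```python
with open("workspace/manifests/imageStats.json", "w") as f:
    json.dump(imageStats_, f, indent=4)
```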
Test Run
For this example I'll be using a collection of travel photos I've taken over recent years. The image collection contains a total of 1055 photos taken in various places, including Paris, Spain, Norway, California and Texas. I'll be running the Python scripts on these images and visualizing the results with Rhino and Grasshopper. If the scripts work as intended, I should be able to generate a visualization where similar images cluster next to one another in 3D space.
This is the Grasshopper definition used to generate the visualization. The definition is split into four parts.
01: Read the image stats JSON with the Grasshopper Python node and parse the results into the outputs (a, b, c, d, e, f, g).
02: Get the unique id of each image and display it at the centroid of where the image will be visualized in 3D space.
03: Construct 3D points from the color moments of each image (see the imageStats.py script). At each point, generate a plane to which the image will be mapped; the plane should be oriented vertically along the X/Z axis.
04: Render the images in 3D by mapping each image to the plane generated in part 03.
3D Image Cloud
Here is what the images look like once rendered in 3D space. It appears the model is working, as we can see that images with similar color distributions are plotted in similar places in the cloud of images.
Let’s select some images from different places in the cloud and compare them.
These images are part of the same cluster in the image cloud. I've selected four images: two taken at the Pompidou center in Paris and two of a full moon taken in Marfa, TX. There are obvious similarities between the images taken in the same location/of the same subject matter, but we can also identify similarities across all four images even though their subject matter differs. The color schemes of the images are similar, with oranges, reds, blacks and blues making up the dominant colors across the images. The distributions of colors are also similar, most notably because of the circular nature of their distribution. The circular tube-like walkway in the Pompidou center has similar characteristics to the circular form of the full moon. So mathematically these images would have relatively similar color moments, which is why they appear in the same cluster of the image cloud.