Automatically recognize patterns in images


Recently I downloaded some flags from the CIA World Factbook. Now I want to classify them.

  1. Get the colors
  2. Get some shapes (stars, moons etc.)

While browsing I came across the Python Imaging Library, which allows me to extract the colors (e.g. for Austria):

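#!/usr/bin/env python
from PIL import Image  # the original post used the old-style "import Image"
bild = Image.open("au-lgflag.gif").convert("RGB")
print(bild.getcolors())
# [(44748, (255, 255, 255)), (452, (236, 145, 146)), (653, (191, 147, 149)), ...)]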

What I found strange here is that the Austrian flag only has two colors in it, but the above output shows more than ten. Do you know why? My idea was to only count the top 5 colors, and as I'm not interested in every color I would "normalize" the numbers to multiples of 64 (so (236, 145, 146) becomes (192, 128, 128)).

However, at the moment I have no idea what the best way is to extract more information (is there a star in the image, or something else?). Could you give me some hints on how to do it?

Thanks in advance

7/14/2010 10:34:06 PM

Accepted Answer

The Python Imaging Library (PIL) only does basic image manipulation: opening images, applying some transforms or filters, and saving to other formats.

Pattern recognition is part of an advanced, still-evolving field of image processing, and it uses algorithms far different from those present in PIL.

There are some libraries and frameworks you can use in Python for pattern recognition (recognising stars, moons, and so on). But let me warn you up front: if you only want this to classify one hundred and a few country flags, you should do it manually rather than dive into pattern recognition.

Your comment on the number of colors suggests that you are not used to working with computer images. And pattern recognition is hardcore, even with a Python front-end. (You can't expect any current framework to know beforehand what a "moon" or a "star" is, for example.)

So, for less than 500 images, you can resort to software that allows you to tag images manually and write some code to link the tags to each flag.

As for the colors: rasterized computer images are made of pixels, and these are square. At the boundary between two colors, if a pixel is entirely one color (say white) and its neighbor is a completely different color (like red), the boundary will look jagged. This is known as "aliasing". To reduce it, software mixes the colors at hard boundaries, creating intermediate colors (anti-aliasing). That is why a PNG with only 2 apparent colors can contain several colors internally. For .JPG it is even worse: the compression is lossy, so the RGB values we specify are not even stored exactly as they are in the image.
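To see where the extra colors come from, here is a minimal pure-Python sketch (no PIL involved) that renders a slanted white/red edge by supersampling. The function names, the slope, and the image size are made up for this illustration; real drawing software uses its own anti-aliasing method, so treat this only as a model of the effect.

```python
from collections import Counter

WHITE = (255, 255, 255)
RED = (255, 0, 0)

def blend(c1, c2, t):
    """Linear mix of two RGB colors; t is the fraction of c2."""
    return tuple(round(a * (1 - t) + b * t) for a, b in zip(c1, c2))

def render_edge(size=8, samples=4, slope=0.5):
    """Render a white/red image split by the line y = slope * x.

    Each pixel is probed on a samples x samples grid. Its final color is
    white mixed with red according to the fraction of probes on the red side.
    """
    pixels = []
    for py in range(size):
        for px in range(size):
            red_hits = 0
            for sy in range(samples):
                for sx in range(samples):
                    x = px + (sx + 0.5) / samples
                    y = py + (sy + 0.5) / samples
                    if y > slope * x:
                        red_hits += 1
            pixels.append(blend(WHITE, RED, red_hits / samples ** 2))
    return pixels

# Count how often each color occurs, similar in spirit to a color count from PIL.
colors = Counter(render_edge())
print(len(colors), "distinct colors in a 'two-color' image")
```

Pixels far from the edge stay pure white or pure red, while the pixels the edge passes through get pinkish mixes, which is the kind of entry that shows up in the color list from the question.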

Unlike pattern recognition, reducing the number of colors is easy: you can keep just the most significant bits of each component. I'd say the two most significant bits would be enough. The following Python function does that, using a color count given by PIL:

def get_main_colors(col_list):
    main_colors = set()
    for index, color in col_list:
        main_colors.add(tuple(component >> 6 for component in color))
    return [tuple(component << 6 for component in color) for color in main_colors]

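Call it with "get_main_colors(bild.getcolors())", for example. Note that PIL's getcolors() returns a list of (count, color) pairs, or None if the image has more colors than its maxcolors argument allows (256 by default).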
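If you also want the "top 5" ranking mentioned in the question, one possible variation is to sum the pixel counts per reduced color instead of discarding them. This is only a sketch: top_colors and its parameters are names invented here, and the sample data is just the three entries quoted in the question (the rest of that list was elided there).

```python
from collections import Counter

def top_colors(col_list, n=5, bits=2):
    """Merge PIL-style (count, (r, g, b)) pairs into coarse color buckets.

    Keeps the `bits` most significant bits of each component, sums the
    pixel counts per bucket, and returns the n largest buckets.
    """
    shift = 8 - bits
    totals = Counter()
    for count, color in col_list:
        bucket = tuple((component >> shift) << shift for component in color)
        totals[bucket] += count
    return totals.most_common(n)

# Only the three entries shown in the question; a real list would be longer.
sample = [(44748, (255, 255, 255)), (452, (236, 145, 146)), (653, (191, 147, 149))]
result = top_colors(sample)
print(result)
# [((192, 192, 192), 44748), ((128, 128, 128), 653), ((192, 128, 128), 452)]
```

Keep in mind that dropping the low bits rounds down, so pure white (255, 255, 255) ends up in the (192, 192, 192) bucket.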

Here is another question dealing with the pattern recognition part: python image recognition

5/23/2017 12:02:26 PM

First some quick terminology, just in case:

A classifier learns a map of inputs to outputs. You train a classifier by giving it input/output pairs, for example feature vectors like color information and labels like 'czech flag'. In practice, the labels are represented as scalar numbers. In your example, you have a multi-class problem, which simply means that there are more than two possible labels (obviously, since there are more than two country flags). Training a multi-class classifier can be a little trickier than training the vanilla binary classifier, so you may want to search for terms like "multi-class classifier" or "one-vs-many classifier" to investigate the best approach for you.

On to the problem:

I think your problem might be easily solved using a simple classifier, like k-nearest neighbors, with color histograms as feature vectors. In particular, I would use HSV feature vectors as opposed to RGB feature vectors. Some great results have been reported in the literature using just this kind of simple classifier system, for example: SVMs for Histogram-Based Image Classification. In that paper, the authors use a particular classifier known as a Support Vector Machine (SVM) and HSV feature vectors. Because a normalized color histogram ignores where each pixel sits in the image, HSV histogram feature vectors also largely sidestep the issue of image scale and rotation, for example a flag that is 1024x768 vs 640x480, or a flag that is rotated in an image by 45 degrees.
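As a rough idea of what such a feature vector could look like, here is a sketch that builds a normalized HSV histogram from a plain list of (r, g, b) pixels using only the standard library. The function name hsv_histogram and the bin counts are assumptions made for this example, not something taken from the paper.

```python
import colorsys

def hsv_histogram(pixels, bins=(8, 4, 4)):
    """Normalized HSV histogram of an iterable of (r, g, b) tuples in 0..255.

    Returns a flat list of length bins[0] * bins[1] * bins[2] whose entries
    sum to 1, so images of different sizes become comparable.
    """
    h_bins, s_bins, v_bins = bins
    hist = [0.0] * (h_bins * s_bins * v_bins)
    n = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a component equal to 1.0 falls into the last bin.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1
        n += 1
    return [count / n for count in hist] if n else hist

# A tiny half-red, half-white "flag" and a 100x larger copy of it.
small = [(255, 0, 0)] * 2 + [(255, 255, 255)] * 2
large = small * 100
print(hsv_histogram(small) == hsv_histogram(large))
```

Dividing by the pixel count is what makes the vector independent of image size, and since pixel positions are never looked at, shuffling or rotating the pixels leaves it unchanged.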

The pseudocode for training the algorithm would look something like this:

# training simple kNN -- just compute feature vectors, collect labels
# (get_hsv_vector, get_label, distance and get_majority_vote are placeholders)
X = []    # list of (feature vector, label) tuples
for training_image in data:
    x = get_hsv_vector(training_image)
    y = get_label(training_image)
    X.append((x, y))

# classification -- pick k closest feature vectors
K = 3     # the 'k' in kNN -- how many similar featvecs to use
d = []    # (distance, label) tuples for scoring
x_test = get_hsv_vector(test_image)    # feature vector to be classified
for x_train, y_train in X:
    d.append((distance(x_test, x_train), y_train))

# sort distances, d, by closeness and pick top K labels for scoring
d.sort(key=lambda pair: pair[0])
output = get_majority_vote([label for _, label in d[:K]])

The kNN classifier is available in several Python packages, with good documentation. It should be pretty easy to convert to HSV colorspace as well. If you don't achieve your desired results, you can try to improve your feature vectors or your classifier.
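If you would rather not pull in a package just to try the idea, a bare-bones kNN fits in a few lines. The sketch below assumes Python 3.8+ (for math.dist) and uses Euclidean distance. The function name, the toy three-bin feature vectors, and the labels are all invented for illustration.

```python
import math
from collections import Counter

def knn_classify(x_test, training_set, k=3):
    """Return the majority label among the k nearest training examples.

    training_set is a list of (feature_vector, label) pairs; distance is
    plain Euclidean distance between feature vectors of equal length.
    """
    by_distance = sorted(training_set, key=lambda pair: math.dist(x_test, pair[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Toy data: pretend each vector is a 3-bin histogram (red, white, blue share).
training = [
    ([0.66, 0.34, 0.00], "red-white"),
    ([0.60, 0.40, 0.00], "red-white"),
    ([0.00, 0.50, 0.50], "white-blue"),
    ([0.05, 0.45, 0.50], "white-blue"),
]
prediction = knn_classify([0.62, 0.38, 0.00], training, k=3)
print(prediction)  # red-white
```

With real flags you would replace the toy vectors with histograms computed from your images and keep one or more labeled examples per country.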

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow