Citation:
Kara, Levent Burak, and Thomas F. Stahovich. "An image-based, trainable symbol recognizer for hand-drawn sketches." Computers & Graphics 29.4 (2005): 501-517.
Publication Link
Summary:
This paper presents a technique for matching hand-drawn sketches against a set of defined templates, each of which can be learned from a single prototype example. Such a recognition tool has a wide range of applications. The main contributions of the paper are the use of polar transforms for preprocessing and filtering, and the combination of multiple classifiers to achieve high recognition accuracy. The approach was evaluated in a series of tests, which found that high accuracy was achieved even with very few training examples.
Discussion:
The main takeaway from this paper is the sequence of stages involved in finding a match for a sketch.
The stages are:
- Preprocessing: The sketch is rasterized onto a 48 x 48 grid, which reduces the amount of data while retaining the characteristics of the image.
- Template matching: This is a featureless approach (it relies purely on the geometry of the figure) to characterizing similarity. It is done by computing the Hausdorff distance, the Tanimoto coefficient and Yule's coefficient.
- The Hausdorff distance is susceptible to outliers, so the modified Hausdorff distance (MHD) or the partial (kth) Hausdorff distance is often chosen instead.
- Separate Tanimoto coefficients are defined to measure the coincidence of white pixels and that of black pixels. They are combined using a constant that determines whether the comparison is biased towards black or white pixels.
- As figures always suffer from misalignments, a thresholded matching criterion is used that relaxes the condition for two pixels to be considered overlapping.
- In a distance transform, each pixel encodes its distance to the nearest black pixel (a kind of nearest-neighbour function). This allows the distances to be computed offline during preprocessing, so that matching reduces to a simple lookup operation.
- The classifiers are combined by normalizing their outputs, which eliminates the differences in their ranges, and by standardizing the outputs.
- Rotations are handled by conversion to polar coordinates, where a rotation of the figure becomes a shift along the angle axis. The weighted centroid is often chosen as the origin. A weighting function is used to reduce the influence of data near the centroid, as such data can be too sensitive to changes in the centroid. The MHD is then used to find the most similar alignments of the figure.
- The polar transform phase also functions as a filtering phase prior to recognition, as it is immune to false negatives and manages to remove 90% of the possible definitions before the rest are passed on to the recognition stage.
- The tests covered multiple definitions of symbols as well as user dependence, in order to expose possible performance constraints.
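To make the stages above more concrete, here is a minimal Python sketch of the rasterization, the distance transform, the MHD, a thresholded Tanimoto coefficient and the polar conversion. It is an illustration under stated assumptions, not the authors' implementation: all function names and the `tol` parameter are hypothetical, the rasterizer bins sample points only (it does not draw the segments between them), the distance transform is brute force, the Tanimoto coefficient covers black pixels only, and the way tolerant overlaps are counted is my own simplification.

```python
import math

GRID = 48  # grid size taken from the preprocessing step above


def rasterize(points, grid=GRID):
    """Map (x, y) sample points to a set of black (row, col) cells.

    Assumption: one uniform scale factor is used, so the aspect ratio is kept.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0  # avoid dividing by zero
    scale = (grid - 1) / span
    return {(round((y - min_y) * scale), round((x - min_x) * scale))
            for x, y in points}


def distance_transform(black, grid=GRID):
    """grid x grid table: distance from each cell to its nearest black pixel."""
    return [[min(math.hypot(r - br, c - bc) for br, bc in black)
             for c in range(grid)] for r in range(grid)]


def modified_hausdorff(a, b, grid=GRID):
    """MHD: the larger of the two mean nearest-pixel distances (A to B, B to A)."""
    dt_a, dt_b = distance_transform(a, grid), distance_transform(b, grid)
    a_to_b = sum(dt_b[r][c] for r, c in a) / len(a)  # lookups, no search
    b_to_a = sum(dt_a[r][c] for r, c in b) / len(b)
    return max(a_to_b, b_to_a)


def tanimoto(a, b, tol=0.0, grid=GRID):
    """Black-pixel Tanimoto coefficient n_ab / (n_a + n_b - n_ab).

    Assumption: a pixel counts as overlapping when the other image has a
    black pixel within `tol`; the smaller of the two counts is used as n_ab
    so the result stays within [0, 1].
    """
    dt_a, dt_b = distance_transform(a, grid), distance_transform(b, grid)
    a_near_b = sum(1 for r, c in a if dt_b[r][c] <= tol)
    b_near_a = sum(1 for r, c in b if dt_a[r][c] <= tol)
    n_ab = min(a_near_b, b_near_a)
    return n_ab / (len(a) + len(b) - n_ab)


def to_polar(black):
    """(radius, angle) of each pixel about the unweighted centroid."""
    n = len(black)
    cr = sum(r for r, _ in black) / n
    cc = sum(c for _, c in black) / n
    return [(math.hypot(r - cr, c - cc), math.atan2(r - cr, c - cc))
            for r, c in black]
```

For example, `rasterize` maps a square and a translated, uniformly scaled copy of it to the same set of cells, and raising `tol` in `tanimoto` lets two figures that are offset by one pixel count as fully overlapping. In a real system the distance transform of each template would be computed once and stored, which is what turns matching into a lookup.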
Limitations:
- While this system is rotation, translation and scale invariant, it degrades when non-uniform scaling is introduced, and it is unable to tell apart similar figures that differ only slightly in their dimensions.
- The sampling method tends to be lossy in cases where the figure has a lot of detail.