Saturday, September 19, 2015

Reading 6: Using Entropy to distinguish between text and shapes

Citation:
Bhat, Akshay, and Tracy Hammond. "Using Entropy to Distinguish Shape Versus Text in Hand-Drawn Diagrams." In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2009.

Summary:
While many systems exist that can recognize shapes or text effectively, very few can distinguish between the two in diagrams that contain both, such as UML charts. The idea behind this paper comes quite intuitively when we look at handwritten letters: their strokes and points are much more random than those of most shapes, which follow some sort of order or sequence. This randomness is characterized mathematically using entropy, which is used to train classifiers that determine thresholds separating shapes from text. One important feature is the use of confidence values, which indicate how confidently the classifier makes its decision.

Discussion:
Entropy is the expected value of the information contained in a message. Counterintuitively, the more random a message is, the more information it is said to carry.
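
To make that concrete, here is a minimal sketch of Shannon's formula, H = -sum over x of p(x) * log2 p(x), applied to a string of symbols (the example sequences are illustrative, not from the paper):

import math
from collections import Counter

def shannon_entropy(symbols):
    """Shannon entropy, in bits, of a sequence of symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

# An ordered, repetitive sequence carries little information;
# a varied, unpredictable one carries more.
print(shannon_entropy("AAAAAAAA"))  # 0.0 bits
print(shannon_entropy("ABCDABCD"))  # 2.0 bits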

The steps in this implementation are as follows (a rough code sketch of the core pipeline appears after the list):

  1. Form stroke groups using spatial and temporal proximity.
  2. Resample the strokes to smooth out angles and represent each group as a sequence of equidistant points.
  3. For every three successive equidistant points, map the angle formed at the middle point to a symbol in an "entropy alphabet". (Essentially, if the angle changes quickly, the assigned symbol keeps changing too.)
  4. Calculate the entropy of the resulting symbol string using Shannon's formula.
  5. Obtain decision thresholds from training sets.
  6. The classifier labels a stroke group as Text, Shape, or Unclassified using those thresholds.
  7. When thresholding isn't enough and a decision has to be made anyway, it can be forced by specifying a confidence value.
  8. The thresholds are obtained using a 10-fold cross-validation process.
  9. Among the reported results: accuracy decreased when the classifier was forced to place every group into one of the two categories, and performance remained good in new domains when using thresholds from other classifier methods.
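
As a rough illustration of steps 2 through 6, here is a hypothetical Python sketch (not the paper's code). The resampling spacing, the 8-symbol alphabet, the angle-to-symbol mapping, and the thresholds are all assumed for demonstration; the paper's actual values are learned from training data.

import math
from collections import Counter

def resample(points, spacing=5.0):
    """Resample a stroke (a list of (x, y) points) into equidistant points."""
    out = [points[0]]
    prev = points[0]
    accum = 0.0
    for curr in points[1:]:
        seg = math.hypot(curr[0] - prev[0], curr[1] - prev[1])
        while seg > 0 and accum + seg >= spacing:
            t = (spacing - accum) / seg
            prev = (prev[0] + t * (curr[0] - prev[0]),
                    prev[1] + t * (curr[1] - prev[1]))
            out.append(prev)
            seg = math.hypot(curr[0] - prev[0], curr[1] - prev[1])
            accum = 0.0
        accum += seg
        prev = curr
    return out

def turning_symbols(points, alphabet="ABCDEFGH"):
    """Map the turning angle at each interior point to an alphabet symbol."""
    syms = []
    for p0, p1, p2 in zip(points, points[1:], points[2:]):
        a1 = math.atan2(p1[1] - p0[1], p1[0] - p0[0])
        a2 = math.atan2(p2[1] - p1[1], p2[0] - p1[0])
        turn = (a2 - a1 + math.pi) % (2 * math.pi) - math.pi  # signed turn in [-pi, pi)
        bucket = min(int((turn + math.pi) / (2 * math.pi) * len(alphabet)),
                     len(alphabet) - 1)
        syms.append(alphabet[bucket])
    return syms

def shannon_entropy(symbols):
    counts = Counter(symbols)
    n = len(symbols)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

def classify(stroke_points, text_threshold=2.0, shape_threshold=1.2):
    """Thresholds here are placeholders; the paper learns them via 10-fold cross-validation."""
    h = shannon_entropy(turning_symbols(resample(stroke_points)))
    if h >= text_threshold:
        return "Text"
    if h <= shape_threshold:
        return "Shape"
    return "Unclassified"

The spatial/temporal grouping of step 1 and the confidence mechanism of step 7 are omitted here. The intuition still comes through: a curvy, erratic stroke produces many distinct symbols (high entropy, classified as Text), while a straight or smoothly curving stroke produces a repetitive symbol string (low entropy, classified as Shape).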

Limitations:
Dashed lines (because they are grouped as a single stroke)
Dots filled in by concentric shading (repeated circular strokes)
Resistor shapes (a high degree of randomness)

Most importantly, how does this system deal with letters such as O, V, L, and T, which will inherently produce a consistent, low-entropy sequence of entropy-alphabet symbols?

