Saturday, October 24, 2015

Reading 21 : LADDER

Citation:
Hammond, Tracy, and Randall Davis. "LADDER, a sketching language for user interface developers." Computers & Graphics 29.4 (2005): 518-532.

Summary:
LADDER is a high-level language that allows developers to write sketch definitions for complex shapes composed of primitives. The language can model the components of a shape and enforce constraints on how these components are positioned with respect to each other. It also supports editing and display commands, which determine how a shape should react when edited, and how it should be displayed upon recognition.

Discussion:
A shape definition comprises components, constraints, aliases, and editing and display behaviours. Shapes can be defined using shape groups, hierarchical shapes, and abstract shapes. An additional constraint allows recognition to be orientation-specific.

Two important mechanisms used are:
1. Jess rules - all possible combinations of shapes are checked during recognition
2. Constraint satisfaction - the system tries to satisfy all the given constraints while moving the points as little as possible
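LADDER definitions are declarative, but the geometric checks behind constraints like "perpendicular" are easy to picture in code. The sketch below is purely illustrative: the function names and the 10-degree tolerance are my own, not from the paper.

```python
import math

def angle(line):
    """Orientation of a line given as ((x1, y1), (x2, y2)), in radians."""
    (x1, y1), (x2, y2) = line
    return math.atan2(y2 - y1, x2 - x1)

def perpendicular(line_a, line_b, tol=math.radians(10)):
    """True if the two lines meet at roughly 90 degrees (within tol)."""
    diff = abs(angle(line_a) - angle(line_b)) % math.pi
    return abs(diff - math.pi / 2) < tol

# An 'arrow shaft + head edge' style check, purely illustrative:
shaft = ((0, 0), (10, 0))
edge = ((10, 0), (10, 5))
print(perpendicular(shaft, edge))  # True: the lines meet at 90 degrees
```

A recognizer would evaluate a battery of such predicates (parallel, coincident, above, etc.) over every candidate binding of strokes to components.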

 

Reading 20: Recognizing Multi-stroke Primitives

Citation:
Hammond, Tracy, and Brandon Paulson. "Recognizing sketched multistroke primitives." ACM Transactions on Interactive Intelligent Systems (TiiS) 1.1 (2011): 4
Summary:
Most research in primitive recognition assumes that basic shapes such as lines and polygons are drawn with a single stroke. If multiple strokes are allowed, the user is forced to draw under a certain set of limitations. This paper seeks to eliminate these restrictions by giving the user full freedom to complete one shape in multiple strokes, or multiple shapes in a single stroke. The main stages in this multi-stroke recognition system are graph building, identification of strongly connected components, merging of components, and removal of false positives.

Discussion:

By analyzing the sketches of a set of users over a range of shapes, it was found that there were many differences in drawing style, especially in the number of strokes per shape. Many people tend to make use of touch-up strokes and continuation strokes, even while drawing a simple figure.

1. Use the stroke information to form a graph. The endpoints of strokes form the nodes, and an edge between two nodes exists if any of the three conditions is satisfied:
- There is a stroke between the two points
- The gap between the two points is within a threshold of the average length of the strokes passing through the endpoint
- The gap is smaller than the average bounding box width of the two strokes

2. Tarjan's algorithm is used to extract the strongly connected components in this graph.

3. Nodes identified as belonging to the same component are merged by removing peaks at the endpoints, finding the points in the two strokes closest to each other, and creating a touch-up stroke between them.

4. The final step is to remove false positives.
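Step 2 above relies on Tarjan's algorithm, which can be sketched as follows (the dictionary graph encoding and integer node labels are illustrative, not from the paper):

```python
def tarjan_scc(graph):
    """Tarjan's algorithm: return the strongly connected components of a
    directed graph given as {node: [successor, ...]}."""
    index = {}        # discovery order of each node
    lowlink = {}      # smallest index reachable from the node
    on_stack = set()
    stack = []
    sccs = []
    counter = [0]

    def strongconnect(v):
        index[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                strongconnect(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:   # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            sccs.append(comp)

    for v in list(graph):
        if v not in index:
            strongconnect(v)
    return sccs

# Endpoints of strokes; edges encode 'stroke between' or 'close enough':
g = {0: [1], 1: [2], 2: [0], 3: [4], 4: [3]}
print(tarjan_scc(g))  # two components: {0, 1, 2} and {3, 4}
```

Each resulting component groups the endpoints (and hence the strokes) that plausibly belong to one shape.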


Reading 19 : Sketch Recognition using Sounds

Citation:
Li, Wenzhe, and Tracy Anne Hammond. "Recognizing text through sound alone." Twenty-Fifth AAAI Conference on Artificial Intelligence. 2011.
 
 
 
Summary:
This paper applies the classical idea of capturing the sound produced while making strokes in order to recognize the stroke. The sound analysis is performed using multiple features extracted from the input audio signal. The stages in this recognition system are noise removal, feature extraction, and template matching using dynamic time warping.
 
Discussion:
 
1. End Point Noise Removal:
Two techniques are used to remove endpoint noise. In the first, the environmental noise level is estimated as the signal energy over the first 150 ms, after which the amplitudes are sampled every 45 ms and checked against a threshold multiplied by the environmental noise, to determine the start and end points. The second method assumes the first 20 ms are environmental noise and uses them to fit a Gaussian probability model; each 10 ms segment is checked and marked as silent or valid, and finally all the valid segments are merged. The mean amplitude value is used to normalize the signal.
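The first technique can be pictured roughly as below. The frame sizes follow the description above, but the threshold multiplier `k` and all function names are my own guesses, not the paper's values:

```python
def short_time_energy(signal, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [sum(s * s for s in signal[i:i + frame_len]) / frame_len
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def trim_endpoints(signal, rate, noise_ms=150, frame_ms=45, k=3.0):
    """Drop leading/trailing frames whose energy is below k * noise energy.

    noise_ms: initial span assumed to contain only background noise.
    k: threshold multiplier (a guess; the paper tunes its own threshold).
    """
    frame_len = int(rate * frame_ms / 1000)
    noise_len = int(rate * noise_ms / 1000)
    noise_energy = sum(s * s for s in signal[:noise_len]) / max(noise_len, 1)
    energies = short_time_energy(signal, frame_len)
    active = [i for i, e in enumerate(energies) if e > k * noise_energy]
    if not active:
        return []
    start, end = active[0] * frame_len, (active[-1] + 1) * frame_len
    return signal[start:end]
```

Feeding in a quiet-loud-quiet signal returns just the loud middle span (plus frame-boundary slack).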


2. Feature Extraction: Two main features are extracted, the mean amplitude for each time frame and mel-frequency cepstral coefficients (MFCCs).

3. Template Matching: This is done using dynamic time warping (DTW). The DTW constraints defined are end-point constraints, local continuity constraints, a global path constraint, and axis orientation.
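A minimal DTW with the standard end-point and continuity constraints looks like this (the textbook formulation, not necessarily the paper's exact variant):

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences, with the
    standard end-point constraint (paths run corner to corner) and local
    continuity (steps of one cell)."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch a
                                 cost[i][j - 1],      # stretch b
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

# A template and a time-stretched version of it align perfectly:
template = [0, 1, 2, 3, 2, 1, 0]
query = [0, 0, 1, 2, 2, 3, 2, 1, 0]
print(dtw_distance(template, query))  # 0.0: the warp absorbs the stretch
```

The recognizer would compare the query's feature sequence against each letter template and pick the smallest DTW distance.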
 




 
 

Sunday, October 18, 2015

Reading 18: Geometric Basics

Citation:
Dr. Tracy Hammond

Summary:
This handout primarily covers some of the geometric basics required to have a good understanding of sketch recognition systems.

This includes fundamental trigonometric rules, linear algebra, and vector algebra.

Reading 17 : PaleoSketch

Citation:
Paulson, Brandon, and Tracy Hammond. "PaleoSketch: accurate primitive sketch recognition and beautification." Proceedings of the 13th international conference on Intelligent user interfaces. ACM, 2008.
 
Summary:
PaleoSketch is one of the landmark techniques for primitive shape recognition, i.e. after early processing of corners is completed. Many current applications use PaleoSketch to obtain individual shapes before passing them to higher-level recognizers for further grouping.
 
Discussion:
 
The pre-processing involves a few stages: computing corners (using a three-step process of selecting consecutive corners, merging similar corners in a neighborhood, and finally selecting the highest-curvature point in a neighborhood as the corner); plotting direction, curvature, and speed graphs; and using these to compute NDDE (normalized distance between direction extremes) and DCR (direction change ratio). Closed shapes, overtracing, and tails are also detected at this stage.
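NDDE can be sketched from a stroke's direction graph as follows. This is my reading of the feature, with illustrative names (requires Python 3.8+ for `math.dist`):

```python
import math

def direction_graph(points):
    """Direction (angle of travel) of each segment of a stroke."""
    return [math.atan2(y2 - y1, x2 - x1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]

def ndde(points):
    """Normalized distance between direction extremes: the stroke length
    between the max- and min-direction points, over the total stroke
    length. Near 1 for arcs and circles, lower for polylines."""
    dirs = direction_graph(points)
    seg = [math.dist(p, q) for p, q in zip(points, points[1:])]
    total = sum(seg)
    hi, lo = dirs.index(max(dirs)), dirs.index(min(dirs))
    a, b = sorted((hi, lo))
    return sum(seg[a:b + 1]) / total if total else 0.0
```

DCR is computed analogously from the same direction graph, as the maximum direction change divided by the average direction change.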
 
Next, the stroke is passed to a series of shape detectors. Each detector subjects the stroke to a set of geometric constraints and returns the shape if all the constraints pass. If the stroke passes no test, it is recursively broken down at the point of highest curvature. The detector also later tries to recombine primitive shapes by passing them to a secondary function (as dividing at the point of highest curvature may not always yield individual shapes).

The geometric tests are designed for the following shapes:
1. Line
2. Polyline
3. Circle
4. Ellipse
5. Curve
6. Spiral
7. Helix

(Refer to the paper for the exact details of these geometric constraints.)


Finally, to select the final set of shapes that defines a stroke, a hierarchical ordering is used that tries to minimize the number of lines returned (i.e. it favors arcs over lines, and complex shapes over heavily segmented polylines).

Reading 16 : Combining Corner Detectors

Citation:
Wolin, Aaron, Martin Field, and Tracy Hammond. "Combining corners from multiple segmenters." Proceedings of the Eighth Eurographics Symposium on Sketch-Based Interfaces and Modeling. ACM, 2011.
Summary:
This paper consolidates the previous attempts in the domain of corner detection, integrating those algorithms to build a more powerful corner detector whose accuracy was shown to surpass that of any individual detector. However, running this corner-finding algorithm over the entire set of stroke points (rather than over candidate corners) proved to fail drastically, as sharp spikes in the error metric prevent effective thresholding.

Discussion:
The first step is to run each of the five corner detectors (Sezgin, Yu and Cai, ShortStraw, Douglas-Peucker, and Kim and Kim) individually, and take the union of all the obtained corners, with duplicates removed.
Next, a Sequential Backward Floating Selection (SBFS) technique is used to remove corners one by one from this combined set. At every iteration, a previously removed corner can be added back if doing so decreases the error. The corner to be removed is the one whose removal causes the least error.
The error metric used is mean squared error (MSE): the mean distance between every point and the vertically closest point on the polyline formed by the remaining corners.

Corner removal stops when there is a sharp spike in the MSE (an elbow in the curve) whose slope exceeds a threshold. The remaining set of corners is returned as the final output.

To determine the threshold, a training set with labeled corners is used. We take the medians of the slopes before and after the elbow for these shapes, and fit a Gaussian distribution to each of the two sets. The point where the probabilities of the two models are equal is taken as the confusion point, i.e. the threshold.
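The confusion point of two Gaussians can be found numerically; a rough sketch (the brute-force scan between the means is my own shortcut, not the paper's method):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution at x."""
    return (math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
            / (sigma * math.sqrt(2 * math.pi)))

def confusion_point(mu1, s1, mu2, s2, steps=10000):
    """Scan between the two means for the x where the PDFs are equal."""
    lo, hi = min(mu1, mu2), max(mu1, mu2)
    best, best_gap = lo, float("inf")
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        gap = abs(gaussian_pdf(x, mu1, s1) - gaussian_pdf(x, mu2, s2))
        if gap < best_gap:
            best, best_gap = x, gap
    return best
```

With equal variances the confusion point is simply the midpoint of the two means; with unequal variances it shifts toward the tighter distribution.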






Sunday, October 11, 2015

Reading 15 : IStraw

Citation:
Xiong, Yiyan, and Joseph J. LaViola Jr. "Revisiting ShortStraw: improving corner finding in sketch-based interfaces." Proceedings of the 6th Eurographics Symposium on Sketch-Based Interfaces and Modeling. ACM, 2009.
 
Summary:
This paper first reviews the ShortStraw algorithm, then identifies its limitations and suggests improvements. It also introduces curve detection, and provides experimental evidence for the substantial improvement obtained.
 
Discussion:
We mainly need to discuss the six limitations of ShortStraw, and how iStraw overcomes them.
 
1. Initial Straws
Problem: ShortStraw does not compute straws for the first three and last three points, which can cause it to miss corners in some cases.
Solution: iStraw introduces formulae to compute the straws for these points too.

2. Timing information
Problem: ShortStraw makes no use of stroke speed, which can actually be helpful in finding corners.
Solution: Use speed minima (the pen slows down at corners) to identify corners.

3. Consecutive False Corners Avoidance
Problem: ShortStraw tends to delete some true corners, after which the collinearity test fails drastically.
Solution: Use a two-pass collinearity test, with the second pass having more relaxed thresholds.

4. Dynamic Threshold
Problem: ShortStraw does not take the length of line segments into account while determining corners, which is an important distinguishing factor in some cases.
Solution: In the second pass of the collinearity test, a dynamic threshold based on the length of the segment is introduced.

5. Impact of Sharp noise

6. iStraw introduces curve detection by treating every prospective corner as a point where two rays converge, both originating a fixed shift distance away from the point in question. For a polyline corner, the angle formed at this point remains fairly constant as the shift varies, while it changes drastically for a curve.
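A rough sketch of that angle test (the shift values, tolerance, and function names are illustrative guesses, not the paper's constants):

```python
import math

def angle_at(points, idx, shift):
    """Angle at points[idx] formed by rays to points `shift` samples away."""
    a = points[max(idx - shift, 0)]
    c = points[idx]
    b = points[min(idx + shift, len(points) - 1)]
    v1 = (a[0] - c[0], a[1] - c[1])
    v2 = (b[0] - c[0], b[1] - c[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    n1 = math.hypot(*v1) or 1e-9
    n2 = math.hypot(*v2) or 1e-9
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / (n1 * n2)))))

def is_curve(points, idx, shifts=(3, 6, 9), tol=10.0):
    """Candidate corner is on a curve if the angle drifts as the shift grows."""
    angles = [angle_at(points, idx, s) for s in shifts]
    return max(angles) - min(angles) > tol
```

For a right-angle polyline the angle stays at 90 degrees for every shift; on a circular arc it drifts steadily as the rays reach farther along the curve.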

Reading 14 : ShortStraw for Corner Detection

Citation:
Wolin, Aaron, Brian Eoff, and Tracy Hammond. "ShortStraw: A Simple and Effective Corner Finder for Polylines." SBM. 2008.

Publication Link: http://www.researchgate.net/profile/Tracy_Hammond/publication/220772398_ShortStraw_A_Simple_and_Effective_Corner_Finder_for_Polylines/links/0deec529f76e58d523000000.pdf


Summary:
This paper presents an alternative algorithm for corner detection, following the earlier work of Sezgin and of Yu and Cai. The main phases of this algorithm are resampling, a bottom-up phase, and a top-down phase. The overall result was a massive improvement over the previous two techniques, in both corner accuracy and all-or-nothing accuracy. The biggest drawback of this algorithm, however, is its failure to detect obtuse corners.

Discussion:
The main takeaways here are the three phases of the ShortStraw algorithm:

1. Resampling
The first step is to obtain a resampled representation of the stroke, using a fixed interspacing distance between points.

2. Bottom-up

A straw for a point is computed as the distance between the endpoints of a window containing the point (the window size is generally 3 on either side of the point). Points with short straw lengths are marked as corners, using a threshold of (median straw length × 0.95).
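The bottom-up phase can be sketched as follows (resampled points are assumed; the local-minimum check, which picks one corner per dip, is my own addition, and requires Python 3.8+ for `math.dist`):

```python
import math

def straws(points, window=3):
    """Straw of each point: distance between the endpoints of a window
    around it, clamped at the stroke ends."""
    return [math.dist(points[max(i - window, 0)],
                      points[min(i + window, len(points) - 1)])
            for i in range(len(points))]

def corner_indices(points, window=3, factor=0.95):
    """Indices whose straw is a local minimum below factor * median straw."""
    s = straws(points, window)
    srt = sorted(s)
    threshold = factor * srt[len(srt) // 2]
    corners = []
    for i in range(window, len(points) - window):
        if s[i] < threshold and s[i] == min(s[i - window:i + window + 1]):
            corners.append(i)
    return corners
```

On a resampled L-shaped stroke, the straws shrink sharply around the bend, and only the bend itself survives both the threshold and the local-minimum test.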

3. Top-Down
(i) First, it checks whether every segment between two consecutive corners passes a line test (chord distance / path distance).

(ii) Removing false negatives: if any two corners fail the line test, it is assumed that there must be a missing corner approximately between them, and the threshold is relaxed to find it.

(iii) Removing false positives: checking for collinearity among triplets of consecutive corners.
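The line test in step (i) might look like this (the 0.95 threshold is an illustrative guess; requires Python 3.8+ for `math.dist`):

```python
import math

def is_line(points, i, j, threshold=0.95):
    """Line test from the top-down phase: chord distance over path distance
    between corners i and j must be near 1 for a straight segment."""
    chord = math.dist(points[i], points[j])
    path = sum(math.dist(p, q)
               for p, q in zip(points[i:j], points[i + 1:j + 1]))
    return path > 0 and chord / path >= threshold
```

A straight run scores exactly 1.0; a segment hiding a missed corner has a noticeably longer path than chord, so the ratio drops below the threshold.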

Reading 13 : Domain Independent Sketch Recognition

Citation:
Yu, Bo, and Shijie Cai. "A domain-independent system for sketch recognition." Proceedings of the 1st international conference on Computer graphics and interactive techniques in Australasia and South East Asia. ACM, 2003.
 
Publication Link : http://dl.acm.org/citation.cfm?id=604499
 
Summary:
The main approach in this paper is to combine primitive shape recognition with early processing (as opposed to having a separate pre-processing stage, as in Sezgin's work); vertex detection is an implicit part of stroke recognition. The paper then explains its line segment approximation and curve approximation algorithms, along with some post-processing features. The concept used for determining the fit of strokes against primitives is called 'feature area verification'.
 
Discussion:
The main takeaways from this paper are the approaches to detect lines, circles, arcs and curves.

Line Segment Approximation:
Try to fit a best-fit line and compute the feature area against this candidate line. The stroke area is taken to be the length of the stroke multiplied by its thickness. If the ratio of the feature area to the stroke area is less than 1, the stroke is recorded as a line.
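One way to picture this feature-area line test (this is my reading of the idea: the candidate line is simplified to the endpoint chord, and the area between stroke and chord is computed with the shoelace formula; requires Python 3.8+ for `math.dist`):

```python
import math

def polygon_area(points):
    """Unsigned shoelace area of the polygon formed by closing the
    stroke with the chord between its endpoints."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

def looks_like_line(points, thickness=1.0):
    """Feature-area style line test: area between the stroke and its
    endpoint chord, compared with the stroke area (length * thickness)."""
    feature = polygon_area(points)
    length = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    return feature / (length * thickness) < 1.0
```

A slightly wobbly horizontal stroke encloses almost no area against its chord and passes; a semicircular stroke encloses a large area and fails.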

Circle Approximation: 
Check whether the direction graph of the stroke can be approximated by a line (the direction of a circular stroke changes linearly). If so, take a candidate circle centered at the center of the stroke's bounding box, with radius the mean distance between this center and the stroke points. Feature area verification is then applied to try to fit the circle. If the direction graph extends beyond 2π, the stroke is checked for overtracing and helical strokes.

Arc Approximation:
1. Check if the direction graph can be approximated by a line.
2. Find the perpendicular bisector of the arc's chord, and find the circle on which the two endpoints and the bisector's intersection with the stroke lie.
3. Follow the same procedure as for circles.
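Step 2's circle through the two endpoints and the bisector's intersection point is the classic circumcircle computation (the function name is my own):

```python
import math

def circle_through(p1, p2, p3):
    """Center and radius of the circle through three non-collinear points
    (the circumcircle, via the standard determinant formulas)."""
    ax, ay = p1
    bx, by = p2
    cx, cy = p3
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax * ax + ay * ay) * (by - cy) + (bx * bx + by * by) * (cy - ay)
          + (cx * cx + cy * cy) * (ay - by)) / d
    uy = ((ax * ax + ay * ay) * (cx - bx) + (bx * bx + by * by) * (ax - cx)
          + (cx * cx + cy * cy) * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)

# Arc endpoints plus the bisector's intersection with the stroke
# pin down the candidate circle:
center, radius = circle_through((1, 0), (0, 1), (-1, 0))
print(center, radius)  # (0.0, 0.0) 1.0
```

The same feature-area check as for circles is then run against this candidate, restricted to the arc's angular span.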


Handling Self-Intersecting Strokes:
Two heuristics are used and their results compared:
(i) Simpler is better
(ii) More specific is better

Finally, the paper also covers some of the post-processing routines, including:
- Elimination of false and redundant elements introduced by noise
- Adjustment of the layout of elements
- Basic object recognition