Saturday, December 12, 2015

Reading 35: Enhancing KNN Search in Spatial Indexes

Citation:
Cheung, King Lum, and Ada Wai-Chee Fu. "Enhanced nearest neighbour search on the R-tree." ACM SIGMOD Record 27.3 (1998): 16-21.
Summary:
This paper gives an improves set of heuristics for the R* Tree, that makes indexing and retrieving neighbours of any spatial object, much more efficient.

Discussion:
We perform relative indexing primarily by extracting the nearest neighbours of a drawn feature, and checking how much the relative position of the two features match. However, this approach fails drastically when we have insufficient features indexed. In such a case, we have to resort to conventional algorithms that are translation independant.

Reading 34 : 2D Curve Similarity Algorithms

Citation:
Deriche, Rachid, and Olivier Faugeras. "2-D curve matching using high curvature points: application to stereo vision." Pattern Recognition, 1990. Proceedings., 10th International Conference on. Vol. 1. IEEE, 1990.

Summary:
This paper deals with the high curvature points of curves, and uses derivatives to calculate curve maxima.
Discussion:
Our application requires us to be able to gauge the similarity between two freely drawn curves, and we expect a very optimistic approximation of correctness between a template and candidate curve in the case of rivers.

Although the curve detection methods presented here are for a computer vision domain, they can be easily translated to the sketch domain.

Saturday, November 28, 2015

Reading 33 : Using shape context for sketch recognition

Citation:
Belongie, Serge, Jitendra Malik, and Jan Puzicha. "Shape matching and object recognition using shape contexts." Pattern Analysis and Machine Intelligence, IEEE Transactions on 24.4 (2002): 509-522.
Publication Link

Summary:
This paper is very similar to the $p algorithm, in the fact that it tries to match every point in the candidate to some point in the template. We take the set of vectors originating from a point to all other sample points on a shape. These vectors express the configuration of the entire shape relative to the reference point.

Discussion:
The following library has Javascript implementations of Haudroff Distance and Shape Context. This was the motivation behind looking unto Shape Context as an alternative means of template matching.

https://github.com/kjkjava/Sketchy.js/tree/master

 The shape descriptor of a point is represented as a relative distribution over positions, in histogram bins of log-polar space. The cost of this matching for a point is defined using the chi-square statistic:
 

We choose the matching that minimizes the total cost of this matching.



Reading 32 : $P Recognizer

Citation:
Vatavu, Radu-Daniel, Lisa Anthony, and Jacob O. Wobbrock. "Gestures as point clouds: a $ P recognizer for user interface prototypes." Proceedings of the 14th ACM international conference on Multimodal interaction. ACM, 2012.
 
Summary:
The $P algorithm is a gesture matching algorithm that follows in the family of $1 and $N algorithms. It is able to compare a candidate gesture with a predefined template, by treating the gestures as point clouds, while removing the timing information.  This ensures that the algorithm is invariant to multi-stroke data.

Discussion:
The key approach in the algorithm is to match each point in the candidate to some point in the template. This matching is done by using a matching function.  The goodness of the matching is defined using the follow equation:
 

The Hungarian algorithm is used to find the best matching, by treating this candidate-template point matching as an assignment problem. 

The authors feel that this algorithm cannot be classified a $ family algorithm as it is fairly complex and requires us to solve an optimization problem. Some of the heuristics adopted by the authors are: 

(1) Find all the Euclidean distances between points in C and T, and then sort thme, and choose the first n points that can yield a valid match.

(2) Use Greedy heuristics

Reading 31: R Trees for Sketch Representations

Citation:
Guttman, Antonin. R-trees: a dynamic index structure for spatial searching. Vol. 14. No. 2. ACM, 1984.
Summary:
The primary motivation behind this paper in the context of this project is to implement relative spatial indexing for sketches and shapes which have a well defined bounding box.
R-Trees are used to spatially index objects that can be represented in an n-dimensional space using intervals in each dimension. In the simplest form,  each object is represented by a rectangle, i.e bounding box.

Discussion
For most of our implementations, our sketches are well defined within a bounding box and can hence be spatially indexed to build a tree that gives us the relative position of a feature with respect to all the other feature objects that are present.
We first construct an RTree using a standard dataset of geographical features. The parameters to the RTree restrict the maximum number of nodes and hence the split size and depth of the tree.
To query the RTree, we query the RTree to find out which leaf node the given bounding box is in. If the candidate and template bounding box occur the same node of the RTree, then we say that their relative position is the same.


Saturday, November 21, 2015

Reading 30 : Course of Action maps

Citation:
Hammond, Tracy Anne, et al. "A Sketch Recognition System for Recognizing Free-Hand Course of Action Diagrams." IAAI. 2010.
Summary: 
Course of Action diagrams are used by military commanders to plan field operations, with various symbols used to represent troops, supplies, obstacles and movement patterns. An important aspect is the use of domain specific information to tune the recognition algorithm.

Discussion:
Initially, the grouping algorithm detemines which recognizer to send the strokes, PaleoSketch for geometric primitives, and a HWR for text and decision graphics. The segmentation is done by breaking every stroke into a series of lines, and then using Paleosketch.Primitive recognition is done using an extended version of PaleoSketch, which can be combined with LADDER to form more complex shapes. The system also recognizes dashed primitives, decision points (10-line stars), and handwriting. It also uses an improved version of LADDER, CALVIN which uses heuristics to account for confidence values of shape decisions. Some of the other recognised shapes are phase/boundary line, arrows , obstacles and decision graphics.

Reading 29: Hand-Drawn Concept Maps

Citation:
Jiang, Yingying, et al. "Structuring and manipulating hand-drawn concept maps." Proceedings of the 14th international conference on Intelligent user interfaces. ACM, 2009.
 
Summary:
Concept maps are extensively used in visualization. Also, anything drawn by a person on a map is a mental map of some sort. This paper presents a way of recognizing these concept maps on a hand drawn sketch. We believe some of these recognition concepts can also be used in drawings on maps.

Discussion:
The stages in this approach are preprocessing, partitioning the graph into subgraphs, extracting the block of each subgraph using dynamic programming, and finally generating the concept map's structure.The authors also created a set of intelligent gestures that could be used to directly modify the extracted figures. The paper looks similar to the multi-stroke paper in its graph-based approach of merging strokes.