Thursday, December 16, 2010

624 #28 Dixon,Dasarp,Hammond - iCanDraw?

Introduction
This paper presents an assistive feedback system for hand drawing human faces. It is meant to help teach students how to draw a human face and help them see where to make corrections to make their drawing more accurate. The system does this automatically by constructing a drawing template from the reference image that the user has chosen to draw from. The user's drawing strokes are then compared against the template to see if they need correction and feedback.

Discussion
iCanDraw is a nice, different way to apply sketch recognition ideas. I personally would like to use some of the design principles from this paper to create other such systems, for example one for teaching users one- and two-point perspective drawing.

Monday, December 13, 2010

624 #27 Davis, Colwell, Landay - K-Sketch

Introduction
K-Sketch is an animated sketching system, a tool that aims directly at one of the main uses of modern pen-based interfaces: drawing and animation. The focus of the work is not only to build a tool that makes animation simpler and easier, but also to develop and refine interaction techniques for sketch workspaces. In a user study comparing K-Sketch and PowerPoint on an animation task, users learned K-Sketch much faster and generally felt that it required much less cognitive load.

Discussion
Frankly, it is a little surprising that they compared creating animations in K-Sketch and PowerPoint. While PowerPoint can do animations, it is well known that Microsoft products usually require a high cognitive load and are not particularly easy to use. Additionally, an actual animation program, such as Flash or something like it, seems like a better comparison. Still, I expect the results would have been very similar, because the controls of K-Sketch seem well thought out and do not overly crowd the interface.

In other news, the manipulator tool used in K-Sketch looks very similar to the manipulator tool in Prezi (or vice versa, rather).




624 #26 Gabe Johnson - Picturephone

Introduction
Picturephone, as discussed earlier on this blog, is a game developed to build sketch recognition datasets. The game asks multiple players to simultaneously draw a picture based on a text description. This results in the "same" drawing but visually the results are completely different. Generating sketch testing data with this technique allows for a much broader set of images, but at the same time they still depict the same concept.

Discussion

624 #25 Eitz, Hildebrand, Boubekeur, Alexa - Image retrieval based on sketched feature lines

Introduction
This paper describes an image descriptor that allows retrieval of images from a database based on a sketched drawing as input. The descriptor, a tensor descriptor, is proposed for use instead of an edge histogram descriptor. The tensor descriptor works by finding the direction of the image gradient in subsections of the image. This descriptor is calculated for every image in the database; it is then calculated for the sketch submitted as a query, and the descriptors in the database are compared against it. The tensor descriptor performed better than the edge histogram, and subjectively the users of the system preferred the results it returned.
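To make the idea concrete for myself, here is a rough sketch of a gradient-based descriptor of this kind. It is not the authors' implementation: the 2x2 grid, the central-difference gradient, the Frobenius-norm normalization and distance, and the names `cell_tensors`, `distance` and `rank` are all my own assumptions for illustration.

```python
import math

def cell_tensors(img, cells=2):
    """Return one normalized structure tensor (a, b, c) per grid cell.

    img is a list of rows of grayscale floats. (a, b, c) stands for the
    symmetric matrix [[a, b], [b, c]], the sum of g * g^T over the
    gradients g that fall inside the cell. The grid size is an assumption.
    """
    h, w = len(img), len(img[0])
    tensors = [[0.0, 0.0, 0.0] for _ in range(cells * cells)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central-difference gradient at pixel (x, y).
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            cy = min(y * cells // h, cells - 1)
            cx = min(x * cells // w, cells - 1)
            t = tensors[cy * cells + cx]
            t[0] += gx * gx
            t[1] += gx * gy
            t[2] += gy * gy
    out = []
    for a, b, c in tensors:
        norm = math.sqrt(a * a + 2 * b * b + c * c)  # Frobenius norm
        out.append((a / norm, b / norm, c / norm) if norm > 0 else (0.0, 0.0, 0.0))
    return out

def distance(d1, d2):
    """Sum of per-cell Frobenius distances between two descriptors."""
    total = 0.0
    for (a1, b1, c1), (a2, b2, c2) in zip(d1, d2):
        total += math.sqrt((a1 - a2) ** 2 + 2 * (b1 - b2) ** 2 + (c1 - c2) ** 2)
    return total

def rank(query, database):
    """Return database image names ordered from best to worst match."""
    q = cell_tensors(query)
    return sorted(database, key=lambda name: distance(q, cell_tensors(database[name])))
```

Because each gradient enters as g * g^T, its sign cancels out, so in this sketch a thin drawn line and a dark-to-light edge with the same orientation produce the same descriptor. A real system would precompute the database descriptors once instead of recomputing them for every query as `rank` does here.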

Discussion
Where can I find one of these? It would be a great way for artists to search for material, for people to be able to look up places or things that they remember visually. It would be interesting to try to visualize your own dreams by sketching out thoughts and seeing what is returned by the search. I hope that Flickr or some other large scale repository picks up on this as a search technique.


624 #24 Gabe Johnson, Ellen Yi-Luen Do - Games for sketch data collection

Introduction
This presentation introduces two games that can be used to collect sketch samples as well as associated metadata or textual descriptions for context. The first is Picturephone, which is very similar to the party game of the same name, except that it is played on a website and players can submit new descriptions for drawing, submit new drawings from concepts, or rate drawings. The second game is Stellasketch, which asks a user to draw an image based on a prompt; other players then see the drawing and use it as a clue to figure out the current theme of the image prompts. Both games would provide traditional sketch samples, that is, stroke and timestamp information, as well as ratings and descriptions, all of which are useful data for training sketch recognition systems.

Discussion
Systems such as this are some of my favorite topics in computer science. The use of people as "Mechanical Turks" while at the same time entertaining them and collecting solid data is a very appealing idea. If you can give people motivation to help solve problems that are inherently human, while keeping the task straightforward and engaging, you will have no lack of useful data. Imagine if solving puzzles in your favorite game actually solved real-world problems at the same time: wouldn't it be ten times more addictive and interesting to play? (I am looking at you, PopCap!)

624 #23 Hinckley etc. - InkSeine

Introduction
InkSeine is a sketch input overlay interface that is focused on providing search functionality to support active note taking tasks. The system lets the user mingle ink notes, search queries and documents in one space, acting as a sketch workspace for research, design or creative activities.

Discussion
It is interesting that the main focus of InkSeine is in-situ searching, though it makes sense when their primary user task is note taking and analysis of their personal document collections. I would think personally that users would not want handwritten notes all over their computer workspace, but the idea of having that information available for later access in its original context seems compelling.

624 #22 Mori, Igarashi - Plushie

Introduction
Plushie builds on Teddy and provides an interface not just to create 3D models but to construct a simulated plush toy. From this simulation the system is able to generate patterns and instructions for assembling the plush toy in real life. Editing can be performed either with Teddy's techniques on the 3D representation or directly on the 2D construction pattern. While editing, the 3D representation also runs a "plushie" simulation that gives an accurate representation of what the final shape will look like.

Discussion
I did not think I would see anything more brilliant than Teddy, but this is an amazing combination of natural interaction and physical modeling working together to simplify an inherently difficult problem. Even professional balloon designers, who were interviewed as part of the user study process for Plushie, felt that the software could help them decrease design time for new balloons.

624 #21 Igarashi, Matsuoka, Tanaka - Teddy

Introduction
Teddy is a system that turns 2D freeform strokes into 3D objects by extracting 2D silhouettes from the sketches. The system is easy to use, and it typically took novice users only 10 minutes before they were able to start making 3D shapes. Teddy supports several 3D editing commands, such as creating new objects, painting on the surface, extrusion, cutting, smoothing and transformation.

Discussion
Teddy is quite amazing, as it truly takes advantage of sketching as a natural input form to make a task like 3D modeling, which typically has a long learning curve, simple and quick to learn. The only issue with Teddy is that it requires explicit editing modes, but I suppose that is necessary given the complexity of the domain.

Sunday, December 12, 2010

624 #18 Shilman, Viola - Spatial Recognition and grouping of text and graphics

Introduction
This paper discusses a spatial approach for recognition that is quick and efficient. The strokes of the sketch are connected in a proximity graph, then a classifier determines if the strokes compose part of an already classified shape. The classifier only uses a small subset of the features (image, curvature and endpoints) to increase recognition speed.
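As a rough illustration of the proximity graph step, here is a small sketch. The distance threshold, the brute-force point-to-point distance, and the names `proximity_graph` and `candidate_groups` are my own assumptions and are not taken from the paper; the classifier that would score each candidate group is left out entirely.

```python
import math
from itertools import combinations

def min_distance(s1, s2):
    """Smallest distance between any point of stroke s1 and any point of s2."""
    return min(math.dist(p, q) for p in s1 for q in s2)

def proximity_graph(strokes, threshold):
    """Connect two strokes when they come within `threshold` of each other.

    strokes is a list of strokes, each a list of (x, y) points.
    Returns an adjacency dict keyed by stroke index.
    """
    graph = {i: set() for i in range(len(strokes))}
    for i, j in combinations(range(len(strokes)), 2):
        if min_distance(strokes[i], strokes[j]) <= threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph

def candidate_groups(graph, max_size):
    """Enumerate connected sets of stroke indices with at most max_size members."""
    found = set()
    frontier = {frozenset([v]) for v in graph}
    while frontier:
        found |= frontier
        grown = set()
        for group in frontier:
            if len(group) < max_size:
                # Grow the group by one neighboring stroke at a time.
                for v in group:
                    for n in graph[v] - group:
                        grown.add(group | {n})
        frontier = grown - found
    return sorted(found, key=lambda g: (len(g), sorted(g)))
```

Each group returned by `candidate_groups` is a set of strokes that a classifier could then accept or reject as a symbol. Capping the group size keeps the number of candidates manageable, which is presumably part of why the approach stays fast.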

Discussion
While this approach is very efficient at recognizing the given set of shapes, it is not as flexible as a geometrically based recognizer. The first issue is that each gesture is limited to 6 strokes, which is fine for gesture recognition but constraining in an actual sketch environment that is supposed to be free form. Secondly, the classifier is based on non-deformable templates, so objects like arrows must be drawn as the template specifies and cannot be shaped differently.

624 #17 Bishop, Svensen - Distinguishing text from graphics in on-line handwritten ink

Introduction
This paper presents three methods for distinguishing text from shapes in sketches: a multilayer perceptron neural network (MLP), a hidden Markov model (HMM) and a bi-partite HMM. These methods can be layered on top of each other to get a more complete picture of the stroke type. The MLP is the lowest level: it constructs a vector of 9 features for each stroke and runs it through the network to identify the stroke's type. The uni-partite HMM combines the individual stroke knowledge of the MLP with information about the temporal context of the stroke, that is, the strokes that came before it. The intuition with this HMM is that text strokes will follow text strokes and graphical strokes will follow graphical ones. Finally, the bi-partite HMM adds information about the spatial context of strokes: how close they are to preceding strokes. In experiments, they found that adding the temporal context improved recognition rates over the MLP alone, but the spatial context did not always help.
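The temporal intuition can be sketched with a tiny two-state HMM decoded with the Viterbi algorithm. This is only my own illustration, not the paper's model: the `stay` probability, the uniform starting prior, and the shortcut of using the per-stroke classifier output directly as the emission score are all assumptions, as is the function name `viterbi`.

```python
import math

LABELS = ("text", "graphics")

def viterbi(stroke_probs, stay=0.9):
    """Most likely label sequence for a list of per-stroke P(text) values.

    stroke_probs: P(text) for each stroke, e.g. from a per-stroke
        classifier; every value must lie strictly between 0 and 1.
    stay: assumed probability that a stroke keeps the previous label.
    """
    def emit(label, p):
        return math.log(p if label == "text" else 1.0 - p)

    def trans(a, b):
        return math.log(stay if a == b else 1.0 - stay)

    # Uniform prior over the label of the first stroke (an assumption).
    scores = {l: math.log(0.5) + emit(l, stroke_probs[0]) for l in LABELS}
    back = []
    for p in stroke_probs[1:]:
        new_scores, pointers = {}, {}
        for l in LABELS:
            prev = max(LABELS, key=lambda k: scores[k] + trans(k, l))
            new_scores[l] = scores[prev] + trans(prev, l) + emit(l, p)
            pointers[l] = prev
        scores = new_scores
        back.append(pointers)
    # Trace the best path backwards from the best final label.
    label = max(LABELS, key=lambda l: scores[l])
    path = [label]
    for pointers in reversed(back):
        label = pointers[label]
        path.append(label)
    return path[::-1]
```

With `stay=0.9`, a single weakly "graphical" stroke in the middle of a run of text strokes gets relabeled as text, which is exactly the smoothing effect the temporal context is meant to provide. Setting `stay=0.5` removes the temporal context and falls back to labeling each stroke on its own.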

Discussion
It is interesting that they compared several layers of recognition in this paper, rather than entirely different techniques. Also of interest is the set of features that they chose for their feature vector, though it is not clear why they chose that particular set (perhaps from their own previous work). The combination of static classifiers and on-line classifiers was particularly interesting as it showed how dependent this method is on training data.

624 #16 Segzin - An Efficient Graph based recognizer

Introduction
This paper introduces a graph-based system for recognizing symbols. Graph structures are used to represent the primitives that make up a symbol and how they are connected, geometrically and topologically. The recognizer is trained by creating these graph structures for each example sketch, then building average graphs that represent the individual symbol classes. Four graph matching algorithms were then compared for use in recognition: stochastic search, error-driven matching, greedy search and geometric sort matching. Stochastic, error-driven and greedy search all achieved similar top-1 recognition rates, around 93%, with relatively close running times, stochastic being the slowest and greedy the quickest. Geometric sort matching was much faster, 2ms compared to 12ms for greedy or 68ms for stochastic; however, its top-1 recognition rate was lower, at 78%, and even that was aided by drawing consistency between users in the study.

Discussion
I like the search methods presented in this paper, and the fact that they can be applied directly to a symbol recognition system. Such a system lends itself to future improvements in search speed, and would seem ideally suited to other optimizations like parallelization. This is in contrast with other recognition systems we have been introduced to, which have not appeared trivially parallelizable.