Tuesday, September 25, 2007

008 - Kim and Kim Curvature Estimation

Summary:

This paper is all about curves and making them as smooth as possible. It begins with some definitions about their equations and some basic mathematical procedures. The next section deals with basic curvature estimation based mostly on direction change. It then goes in depth on how local convexity can be used to add "weight" to a curve and make the estimate much less error-prone, and further adds local monotonicity to the mix. Local monotonicity refines the convexity check: a neighboring point only contributes its convexity if its value is lower than the current minimum, so the weight stops accumulating once the trend breaks. Ample examples are given to explain this. The paper then shows results of a user study that adds each of the elements one at a time, showing how the system gets more accurate and smooth with each new feature.
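To convince myself I understood the convexity and monotonicity tests, I sketched my reading of them in Python. This is NOT the paper's code: the window size k, the angle-wrapping helper, and the exact accumulation rule are my own guesses at how the weighting works.

```python
import math

def wrap(a):
    """Wrap an angle difference into [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi

def curvature(points, k=3):
    """Direction change at each interior point, weighted by neighbors
    that pass the convexity and monotonicity tests."""
    angles = [math.atan2(y2 - y1, x2 - x1)
              for (x1, y1), (x2, y2) in zip(points, points[1:])]
    change = [wrap(b - a) for a, b in zip(angles, angles[1:])]
    curv = []
    for i in range(len(change)):
        total = change[i]
        for step in (-1, 1):              # walk outward on each side
            lo = abs(change[i])           # running minimum magnitude
            for j in range(1, k + 1):
                n = i + step * j
                if not 0 <= n < len(change):
                    break
                if change[n] * change[i] <= 0:
                    break                 # convexity: same turning direction
                if abs(change[n]) >= lo:
                    break                 # monotonicity: must keep shrinking
                lo = abs(change[n])
                total += change[n]        # neighbor adds its "weight"
        curv.append(total)
    return curv
```

The nice part is that the monotonicity test is just one extra comparison on top of the convexity test, which is why I say below that it isn't hard to implement.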

Discussion:

This paper was slightly confusing the first time through, but it did make sense eventually. The concept of monotonicity really is not a simple one, but it is extremely useful. Usually I don't like non-simple solutions, but once you understand it, it becomes incredibly clear. I think this is probably one of the best curvature estimations I have seen so far, and it really isn't hard to implement, making it my favorite.

007 - Domain-Independent System by Yu

Summary:

This paper deals with Yu et al. basically building on Sezgin's model. They begin with a list of attributes that they believe are essential in any sketch recognition software and explain why their system qualifies. The next section explains their concept of "feature area", which is somewhat similar to an error measure except that it is defined against the various primitives. They then explain how their system does vertex detection, line and curve approximation, and self-intersection handling. Vertices are approximated by fitting basic primitives instead of using speed data. The line and curve approximations are fairly elementary, with basic examples given via the direction graphs of various shapes and how they correspond. Self-intersections are handled by checking whether one is possible and cutting the stroke if it is not. The paper then explains some of the cleanup and how the primitives work with the sketches to test for errors. It concludes with a user study that achieved decent success on most simple shapes and met with user satisfaction.
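To pin down what "feature area" means for a line fit, here's a rough Python sketch: the area trapped between the stroke and the candidate line, summed up piecewise. The quadrilateral decomposition and the projection helper are my own construction, so take it as an illustration rather than the paper's exact method.

```python
def shoelace(poly):
    """Unsigned area of a polygon given as a list of (x, y) vertices."""
    area = 0.0
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

def project(p, a, b):
    """Project point p onto the infinite line through a and b (a != b)."""
    (ax, ay), (bx, by), (px, py) = a, b, p
    dx, dy = bx - ax, by - ay
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    return (ax + t * dx, ay + t * dy)

def line_feature_area(points, a, b):
    """Total area trapped between the stroke and the candidate line a-b,
    summed as small quadrilaterals between consecutive stroke points
    and their projections onto the line."""
    total = 0.0
    for p1, p2 in zip(points, points[1:]):
        q1, q2 = project(p1, a, b), project(p2, a, b)
        total += shoelace([p1, p2, q2, q1])
    return total
```

If I read the paper right, the actual error metric then normalizes this area by the size of the primitive, so longer strokes aren't penalized just for being long.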

Discussion:

To be honest, I loved this paper. It really brings everything down to earth and has simple solutions. Granted, this system has a LOT of errors once you start dealing with more distinct shapes, but for simple tools like AutoCAD (which is mentioned at the end of the paper) this system would cut user time in half, which is nothing to scoff at. I think this system is a step in the right direction, even though it mostly furthers the work of Sezgin. I believe approximation is a much more beneficial direction than chasing perfect recognition.

Saturday, September 15, 2007

006 - Early Processing for Sketch Understanding

Summary:

This paper describes the "Sezgin system" and how it works. It is a system that lets the user draw a shape any way they want, then iteratively adds vertices at specific spots until the fit gets "close enough" to the drawn shape. It also deals with finding and drawing curves accurately. The vertex finding has a complicated description but is a simple concept. The paper describes how the system uses the direction data (from the stroke features) to spot spikes in direction change. This alone, however, doesn't produce correct corners, so Sezgin also includes the speed data from point to point. The system takes the average of each signal, collects all points above average (in the case of direction change) and below average (in the case of speed), and intersects those two sets; the resulting intersection becomes the corner candidates. However, if the fitted shape is not "close enough" to the drawn shape, the system tries adding both a direction point and a speed point and tries again (the points are added separately, giving two different candidate fits). Whichever is closer is chosen, and the process repeats until the error is under the threshold. Sezgin then goes on to describe the use of Bézier curves to draw correct curves rather than only "circular" arcs. It describes how "beautification" works by adjusting the slopes of lines to be slightly more pleasing to the eye (less jaggedness, or aliasing). The paper goes on to describe a user study on the system with a set of drawings, which resulted in mostly positive feedback.
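The averaging-and-intersection step is easy to picture in code. Here's a minimal Python sketch of just that initial candidate filter; the hybrid-fit loop that adds points afterwards is not shown, and the helper names are mine.

```python
import math

def speeds(points, times):
    """Pen speed along each segment between timestamped points."""
    return [math.dist(p, q) / max(t2 - t1, 1e-6)
            for p, q, t1, t2 in zip(points, points[1:], times, times[1:])]

def direction_changes(points):
    """Absolute direction change at each interior point."""
    angles = [math.atan2(y2 - y1, x2 - x1)
              for (x1, y1), (x2, y2) in zip(points, points[1:])]
    return [abs((b - a + math.pi) % (2 * math.pi) - math.pi)
            for a, b in zip(angles, angles[1:])]

def corner_candidates(points, times):
    """Interior points with above-average curvature AND below-average
    speed: the intersection described in the summary above."""
    curv = direction_changes(points)   # entry j describes point j + 1
    spd = speeds(points, times)        # entry j describes segment j
    mean_c = sum(curv) / len(curv)
    mean_s = sum(spd) / len(spd)
    return [i for i in range(1, len(points) - 1)
            if curv[i - 1] > mean_c and spd[i - 1] < mean_s]
```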

Discussion:

I really like this method. It's pretty "common sense"-like because it is an approximation, which is what people do in real life. You see a slightly oblong circle and you'll just remember it as a circle, unless it has something special about it. I think this method's BIGGEST drawback is also its biggest asset, and that's the "threshold" issue. People can design different programs with different thresholds for different uses and sets of people, such as allowing more jerkiness for arthritis patients and less for skilled draftsmen. But not specifying a default value for this threshold (since Sezgin didn't in his paper) could lead to a lot of problems.

An interesting thought I had is that he assumes all corners are drawn at a slower speed. Studies have been done showing this is generally true, but those studies were likely all done on a flat horizontal surface. What isn't taken into account is people who use boards on the wall or the new "SmartBoards" (they're new enough to me, all right!?). When writing on a vertical surface, gravity can change drawing style. For example, when drawing the top half of a circle on a piece of paper you will commonly start slow, speed up through the arc, and end slightly slower. On a board, gravity might slow your hand near the top of the arc, or rather SPEED it up on the descent from the top, resulting in a large change of speed that might be recognized as a corner.

Wednesday, September 12, 2007

005 - MARQS Presentation

Summary:

Paulson's presentation of MARQS included an overview of the program and the possible advantages and disadvantages of its use. MARQS, unlike most previous gesture recognizers, uses only 5 features and needs only a single training example. This is accomplished through separate single-example and multiple-example classifiers that compare the features of the drawn gesture against the gesture base and choose the best match. As more data is accrued for a symbol the system gradually gets more accurate, but only one starting example is necessary. MARQS is also stroke-number independent as well as domain independent, allowing users to draw something in many different ways and still have it recognized correctly by the system. Recognition is also independent of orientation and location, allowing even further variation by the user.
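Here's a toy Python version of the single-example mode as I understood the talk. The three features are stand-ins I made up (the paper's actual five are different), but the linear search over the gesture base is the idea.

```python
import math

def features(stroke):
    """Three toy global features; placeholders for MARQS's actual five."""
    xs = [x for x, y in stroke]
    ys = [y for x, y in stroke]
    w = (max(xs) - min(xs)) or 1e-6
    h = (max(ys) - min(ys)) or 1e-6
    length = sum(math.dist(p, q) for p, q in zip(stroke, stroke[1:]))
    return [w / h,                       # bounding-box aspect ratio
            length / math.hypot(w, h),   # ink length vs. box diagonal
            float(len(stroke))]          # number of sample points

def classify(stroke, gesture_base):
    """Single-example mode: linear search for the stored example whose
    feature vector is closest to the query's (squared distance)."""
    q = features(stroke)
    def dist(entry):
        label, example = entry
        return sum((a - b) ** 2 for a, b in zip(q, features(example)))
    return min(gesture_base, key=dist)[0]   # gesture_base: (label, stroke)
```

Once more examples of a symbol accumulate, the multiple-example classifier takes over, which is where the "gets more accurate over time" part comes from.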

Discussion:

As Brian discussed, while decently effective for such a small amount of training data, some program features are traded away for user tolerance. While the system may recognize a stick man with his arms up and a stick man with his arms down both as a stick man, it also recognizes a happy face and a sad face as just faces. As discussed, though, this could be overcome with multiple layers of classification. I think this is a GREAT idea. I love the tolerance in this program, and the ability to draw gestures and then classify them is a great designer tool. Sadly, I think the notebook idea is somewhat overkill, because most "notebooks" are not consistent files. The example of drawing some waves of water to jump to the "water" notes is nice, but rarely will all the notes about water be in one place. Usually notes are broken up by date, which is easier to recognize and find with keystrokes, and if they ARE separate the user has already sorted the notes. The only practical use of the system that I could see is if someone were to take an exorbitant amount of notes that were classified into different subjects.

Tuesday, September 11, 2007

004 - Visual Similarity of Pen Gestures

Summary:

This paper seems to be the premise behind creating Quill, the tool in the aforementioned Long paper. It explains why all the research was done, mainly sticking to the similarities of gestures that humans may not be able to distinguish, and acts as an overview of the research that led to the Quill program. It starts with examining different interfaces to design with, different formulas and psychology examples of similarity, and the number of dimensions to use in a multi-dimensional scaling (MDS) technique. It then lists the different experimental trials run and what each outcome was. The first trial tested the different features to see which were the best predictors of gesture similarity, which turned out to be most of the 11 listed features of Rubine (minus bounding box and total length) plus several features that Long et al. designed (I think, I could be wrong), and what dimensions they were in. The next trial used a specified set of gestures to see how people would perceive them (absolute angle and aspect, length and area, and rotation-related features). Trial 1 ended in a seemingly unexplained split in the group, but adding the gesture "sets" in Trial 2 brought the group together more.
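As I understand it, the underlying model is that perceived similarity falls off with a weighted distance in feature space. Here's a tiny Python sketch of that shape; the two features and the default weights are placeholders I invented, not Long's fitted values.

```python
import math

def gesture_features(stroke):
    """Two toy geometric features, stand-ins for Long's predictive set."""
    xs = [x for x, y in stroke]
    ys = [y for x, y in stroke]
    aspect = (max(ys) - min(ys) + 1e-6) / (max(xs) - min(xs) + 1e-6)
    length = sum(math.dist(p, q) for p, q in zip(stroke, stroke[1:]))
    return [aspect, length]

def predicted_similarity(g1, g2, weights=(1.0, 0.01)):
    """Perceived similarity modeled as falling off with a weighted
    distance in feature space; the weights here are invented."""
    d = math.sqrt(sum(w * (a - b) ** 2
                      for w, a, b in zip(weights,
                                         gesture_features(g1),
                                         gesture_features(g2))))
    return 1.0 / (1.0 + d)   # map distance to a 0..1 similarity score
```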

Discussion:

I gotta say that Long's results didn't SOUND promising. Even with decent results, I dislike that the users split into groups in trial 1, and trial 2 showed that even without the outliers the overall agreement didn't improve much. I applaud Long for trying to REMOVE features, because if you had a program with a million features it would be able to tell exactly how different every gesture was, but no one would want to use it. I do question why they took out the features in Rubine that dealt with time and pretty much removed time from the equation. I can only guess that they wanted to remove the "I draw faster than him" part of similarity, since a circle drawn in a second can look the same as one drawn in a minute, though I think it could still be useful in some cases.

Sunday, September 9, 2007

003 - “Those Look Similar!”

“Those Look Similar!” Issues in Automating Gesture Design Advice

Summary:

Long et al. give a brief overview of gesture recognition in this paper before diving into their new design tool, "Quill". Quill, interestingly, advises the designer when gestures are similar and warns that users may perceive them as the same thing. The paper explains how Quill goes through all the gestures and gives warnings to the designer based on the similarity data the tool computes. It discusses the problems that arose from this "advice" feature, and explains how often, and for what reasons, advice is given and how they finalized those attributes. It covers several different options for delivering advice and which ones made it into the final product.
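Conceptually, the advice pass is just a pairwise loop over the gesture set. Here's my guess at its skeleton in Python, reusing the predicted_similarity sketch from the 004 entry above; the threshold is an invented number, not Quill's.

```python
from itertools import combinations

SIMILARITY_THRESHOLD = 0.8   # invented cutoff, not Quill's actual value

def advise(gesture_set):
    """Compare every pair of gestures in the set and collect warnings
    when the predicted similarity is suspiciously high."""
    warnings = []
    for (name1, g1), (name2, g2) in combinations(gesture_set, 2):
        s = predicted_similarity(g1, g2)   # sketch from the 004 entry
        if s > SIMILARITY_THRESHOLD:
            warnings.append(f"{name1!r} and {name2!r} may look the same "
                            f"to users (predicted similarity {s:.2f})")
    return warnings
```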

Discussion:

I think this is a great optional tool for gestural language designers. It seems like a good feature; just as programming languages have tools that highlight errors and broken code, this highlights possible mistakes in gestural languages. I can't help but think we should stress the optional part, though, being reminded of Clippy from Microsoft Word. Even though Long and the others went out of their way to be as unobtrusive to the designer as possible, Quill still has a lot of error in its advice giving.

Personally, I believe gestural languages should be used for specific purposes. For example, typing will almost always be faster than a pencil/writing tablet for entering lots of words, but throw in a fraction or a big equation and sketching is totally the way to go. Basically, I'm saying that gestural languages shouldn't be as broad as some applications try to make them, because there are only so many shapes you can make before you either get repetitive or just plain confusing.

002 - Rubine's "Specifying Gestures by Example"

Summary:

This paper covers the GRANDMA system, or Gesture Recognizers Automated in a Novel Direct Manipulation Architecture, which seems to be a very simple gesture-recognition tool for adding any number of gestures to a program, as used with the GDP drawing program. It starts with an example of the system performing some of its main gestures in a series, including copying, moving, drawing a line, and deleting. Explaining the hierarchy of GDP, it shows how gestures are entered and handled. Using a number of examples, it argues that almost any gesture can be expressed this way regardless of size and style. It contrasts this with the multi-stroke techniques in use at the time, explaining how this improves both techniques and adds flexibility. It then delves into the features used by the underlying algorithm, and how each is calculated and used. It explains the training process and why it works, as well as the rejection and evaluation processes. It further explains some extensions that could improve overall recognition. This paper is FULL of equations that will be very useful for any recognition system.
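The decision rule at the heart of it is compact enough to write out: each class c has a weight vector, and a gesture goes to whichever class maximizes v_c = w_c0 + sum_i(w_ci * f_i). Here's a minimal Python sketch of the rule plus one of the thirteen features; training the weights (which Rubine derives from the examples' feature means and covariances) is the part I'm leaving out.

```python
import math

def initial_angle_cos(stroke):
    """Rubine's first feature: cosine of the stroke's initial angle,
    measured from the first point to the third."""
    (x0, y0), (x2, y2) = stroke[0], stroke[2]
    return (x2 - x0) / math.hypot(x2 - x0, y2 - y0)

def classify(features, weights):
    """Rubine's linear decision rule: score each class c as
    v_c = w[c][0] + sum_i(w[c][i + 1] * features[i]), take the max."""
    best_class, best_score = None, float("-inf")
    for c, w in weights.items():
        score = w[0] + sum(wi * fi for wi, fi in zip(w[1:], features))
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```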

Discussion:

I just can't help thinking there's some easier way to do this. Obviously much research was done in this area and this is a VERY nice list of features that has been proven to work, but I somehow feel like it is overkill. Maybe there aren't easier ways with fewer features out there, but I'd like to think there are.

I think this paper is a really great source of information. The list of equations is extensive and laid out pretty clearly. I liked the idea of Eager Recognition more than waiting for the system to recognize a finished gesture, though Rubine doesn't cover it in much depth.

I commend Rubine for all this research, but I also personally think there's a threshold past which we really should stop caring. Everyone's style is different, so I'm not sure there will ever be a 100% recognition mark for the world's population. Heck, I don't think there'll ever be a 100% recognition mark for our class's population. It's nice to know all this data, but I think, as Rubine said, as long as programs have decent undo functions we shouldn't overwork ourselves to get that one extra percent.

001.1 - Personal Info

Ya, I must have missed this part on the homework, so I'm doing it AFTER the first blog entry.

My name is Michael Eskridge, and I'm an Undergraduate Computer Science major graduating in December (hopefully!). I'm mostly interested in "futuristic" research in CS, which I take to be anything that will be useful in a decade or so. I think sketch recognition is right up there in possibilities, though currently I think the field is limited by the constraints we have in the here and now. I believe that technology should be peripheral-free in the future, with no pens/tablets/wires of any kind, but obviously this is still quite a ways away. But I'm hopeful!

Since I'm only an undergraduate, I don't have HUGE experience in all types of programming, but I'm usually a fast learner. I have experience with most of the big languages like C++, Java, and C#, but I haven't done much with them in a few years. Most of my recent programming is centered on website design with CSS, XHTML, and PHP, which is what I've used for classes for a few years now. Sadly, I have NO experience in Sketch Recognition and very little in AI, just a few algorithms I'd have to look up in a book to fully remember.

I got interested in this class from a question posed by my mother-in-law, who teaches 7th grade math. She wanted to know if there was a quick way to put fractions and different mathematical symbols in faster than looking them up in some "insert" tab. I contacted Dr. Hammond about it and somehow wound up in this class. Although I'm not all that interested in the history of the field, I'm convinced that this field along with Computer/Human Interaction will really have a huge impact in the next decade or so, so I'd like to familiarize myself with the coming future by learning what we have available today.

I'm very interested in Computer Graphics and anything to do with them, such as website development, logo design, interface engineering and the like. I'm starting my family soon, as my wife is expecting on Christmas of this year, so in 5 years I hope to have a beautiful daughter and be making lots of money. I don't have any set goals as far as what I do in the CS field, but I'd like to do something that leans towards interface design and development. In ten years I hope to be president of the US and be the first person on Mars. And if you actually read this far, then I commend you and hope you found the joke funny.

I'm an aspiring cartoonist on the side and I play way too many video games. Along with this class I'm taking a 3D modeling course, a storyboarding/comic course, and an introductory course in drawing on the computer (that I probably should be teaching, but I'm not complaining). I graduate in December, like I said, and I hope NEVER TO BE IN SCHOOL AGAIN. As much as some of these classes are fun, school is just not for me. I'd rather be in the workforce and learning than in a classroom any day.

Hopefully this covers what is ME. You'll notice that I spell my name "Miqe" on pretty much everything. This is because in high school someone suggested I be different and spell my name "Mique" and I agreed. Shortly thereafter I typed it in wrong on a message board or game or something and I realized that "Miqe" was much less feminine than "Mique" so I switched it. And I've been Miqe ever since.

So that wraps it up. Hope you enjoyed the read, though you really should have been reading your papers instead....

Miqe