Wednesday, December 12, 2007

22 - What are intelligence? And why?

Summary:

Davis begins this rather informative paper by explaining the title (the deliberate use of a plural verb) and how there are many different views of what "are" intelligence. He explains the different views of reasoning, including mathematical, psychological, biological, statistical, and economic. He then goes on to explain different views of how and why intelligence evolved, drawing on theories about early humans' hunting skills, socializing skills, etc. The next section discusses less "human" variations on intelligence, describing animals that have proven to be much more intelligent than most of their kind (or at least than the human perception of their kind).

Discussion:

While this paper has little to do with sketch recognition, it is a very interesting read. Davis provides a great deal of information while keeping a neutral stance. When questioned in an interview, he specifically tried not to single any idea out as the best, worst, or even most in need of further consideration.

What I like about this paper is that it explains that if humans ever expect to develop a "strong AI" machine, we cannot continue to focus on things like rational, statistical logic. Intelligence is a product of evolution, as shown by the lack of symmetry in the human brain, and evolution is a slow and steady process at that. If human scientists intend to create a human-like computer brain, they are going to have to cover thousands and thousands of years' worth of history, not to mention what little we know of the time before history was written.

Personally, I didn't believe that a true "strong AI" unit would ever be produced. We will have computers that think 1000 times faster and more efficiently than humans, that can make decisions in practically any situation, and that are perhaps even superior to humans in almost every way, but they will never be able to emulate the random quirks that exist in every human. What I was intrigued and somewhat baffled by, however, is Davis' inclusion of "biology" in the reasoning list. I think this will be the defining step in AI, and the major hurdle and seemingly unscalable wall that separates "weak AI" from the real "strong AI". This will be the small flickers of random and illogical thought that happen to humans, the unexplainable phenomena that separate humans and machines at present.

21 - SketchREAD

Summary:

Alvarado's SketchREAD is a system based on Bayesian networks that cuts down on the time-consuming task of checking every line against every template. The system uses both a top-down and a bottom-up approach. The top-down approach starts from unfinished shapes and looks for the missing pieces of possible template matches. It will do this continuously if necessary. The bottom-up approach is similar to LADDER's, where each stroke is recognized as a primitive and the drawing builds upon itself. The system will also "prune" interpretations generated by the top-down approach, finally selecting the best fit after removing all the unlikely ones. Results show that SketchREAD improves upon baseline numbers, substantially for trees and significantly for circuit diagrams as well.

Discussion:

I wish I could test this system out. This system seems like a very nice alternative to LADDER and definitely in the running. I believe the different methods used within SketchREAD should definitely be looked at more closely, as using both top-down and bottom-up recognition techniques seems to be a great asset. I think integrating these techniques into existing systems would almost always increase recognition, but without knowing all the domains that exist, it's not certain. Still, I'd like to see more domains tested with these innovative ideas.

20 - $1 Recognizer

Summary:

Wobbrock et al. present a new recognizer that can be implemented with little mathematical background. It's given in four easy-to-follow steps: resampling, rotation, scaling, and classification. It begins by resampling the gesture to N points, where N is chosen by the system designer, then rotates the gesture so that the line from its centroid to its start point is horizontal. Then the gesture is scaled to fit in a specified bounding box and translated to the origin. The gesture is then compared against each of the stored templates, and the closest match is returned. Results are very high (around 98%) with simple shapes and a decent number of existing templates.
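To get a feel for the four steps, here is a minimal Python sketch of how I understand them. It is not the authors' code: the function names, the 64-point resampling, and the 250-unit square are my own choices, the rotation uses the angle from the centroid to the first point, and I compare against each template at a single angle only rather than searching over rotations for the best alignment.

```python
import math

def path_length(pts):
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

def centroid(pts):
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

def resample(pts, n=64):
    """Step 1: resample the stroke into n roughly equidistant points."""
    interval = path_length(pts) / (n - 1)
    pts = list(pts)
    out = [pts[0]]
    acc = 0.0
    i = 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if d > 0 and acc + d >= interval:
            t = (interval - acc) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)  # q starts the next segment
            acc = 0.0
        else:
            acc += d
        i += 1
    while len(out) < n:  # rounding can leave the last point out
        out.append(pts[-1])
    return out[:n]

def rotate_to_zero(pts):
    """Step 2: rotate so the centroid-to-first-point angle is zero."""
    cx, cy = centroid(pts)
    theta = math.atan2(pts[0][1] - cy, pts[0][0] - cx)
    cos_t, sin_t = math.cos(-theta), math.sin(-theta)
    return [((x - cx) * cos_t - (y - cy) * sin_t + cx,
             (x - cx) * sin_t + (y - cy) * cos_t + cy) for x, y in pts]

def scale_and_translate(pts, size=250.0):
    """Step 3: scale to a size x size box, then move the centroid to the origin."""
    xs, ys = [x for x, _ in pts], [y for _, y in pts]
    w = (max(xs) - min(xs)) or 1.0
    h = (max(ys) - min(ys)) or 1.0
    scaled = [(x * size / w, y * size / h) for x, y in pts]
    cx, cy = centroid(scaled)
    return [(x - cx, y - cy) for x, y in scaled]

def normalize(pts, n=64):
    return scale_and_translate(rotate_to_zero(resample(pts, n)))

def path_distance(a, b):
    """Average distance between corresponding points."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def recognize(pts, templates, n=64):
    """Step 4: return (name, distance) of the closest template."""
    candidate = normalize(pts, n)
    scored = ((path_distance(candidate, normalize(t, n)), name)
              for name, t in templates.items())
    dist, name = min(scored)
    return name, dist

# Hypothetical templates, one example stroke each.
templates = {"vee": [(0, 0), (50, 100), (100, 0)],
             "zigzag": [(0, 0), (100, 0), (0, 100), (100, 100)]}
name, dist = recognize([(10, 10), (36, 60), (60, 12)], templates)
print(name, round(dist, 2))
```

One caveat with this sketch: because the scaling is non-uniform, nearly straight strokes get badly distorted, so the toy templates above are deliberately two-dimensional shapes.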

Discussion:

While this recognizer seems to have no significance in furthering the field, I believe it is necessary for the advancement into the next evolution of sketch recognition. The real benefit of this system is not in how much it advances the field, but more in how it draws new people into the field. This system is meant as a "first step" into sketch recognition, a role that until now was filled by Rubine's feature system. As a beginner in the field myself, I see this paper as a much prettier and more "fun" way of entering sketch recognition, as Rubine's method is much more intimidating than this system. I believe that this system will be useful in drawing fresh ideas from less mathematically inclined designers.

19 - Multiscale Models of Temporal Patterns

Summary:

Sezgin et al. write about how temporal patterns in the user's drawing process can be used to aid recognition. The paper discusses the different equations that can be used to find patterns, as well as how Hidden Markov Models can be used together with these equations to recognize what is being drawn. Using both gives a decently accurate estimate of what the user intends to draw, except for the 'transistor' sketch in a circuit diagram. This is usually caused by segmentation, with a wire being drawn after the main part of the transistor and before the "rebounded" part of the transistor. Sezgin et al. compensate for that by looking further in the drawing queue for similar examples, basically moving the wire to after the "rebounded" part. Results are quite impressive, usually in the 80-90% range for correct recognition.
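To make the HMM idea a bit more concrete, here is a toy Python sketch of scoring a sequence of stroke types against two models with the forward algorithm and picking the more likely one. This is a plain discrete HMM, not the multiscale model from the paper, and every name and probability in it (the "wire" and "resistor" models, the "line" and "zigzag" symbols) is made up purely for illustration.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM, computed with the forward algorithm."""
    states = range(len(start))
    alpha = [start[s] * emit[s].get(obs[0], 0.0) for s in states]
    for symbol in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][s] for p in states) * emit[s].get(symbol, 0.0)
                 for s in states]
    return sum(alpha)

# Hypothetical models: (start probabilities, transition matrix, emissions per state).
LEAD = {"line": 0.9, "zigzag": 0.1}   # straight lead of a resistor
BODY = {"line": 0.1, "zigzag": 0.9}   # zigzag body of a resistor
MODELS = {
    "wire": ([1.0], [[1.0]], [{"line": 0.95, "zigzag": 0.05}]),
    "resistor": ([1.0, 0.0, 0.0],
                 [[0.2, 0.8, 0.0],    # lead-in -> body
                  [0.0, 0.3, 0.7],    # body -> lead-out
                  [0.0, 0.0, 1.0]],
                 [LEAD, BODY, LEAD]),
}

def classify(obs):
    """Return the name of the model that gives obs the highest likelihood."""
    return max(MODELS, key=lambda name: forward_likelihood(obs, *MODELS[name]))

print(classify(["line", "zigzag", "line"]))
print(classify(["line", "line"]))
```

The left-to-right transition matrix is what encodes drawing order here, which is also why a stroke drawn "out of order" (like the wire in the transistor case) lowers the likelihood of the right model.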

Discussion:

While I believe there is a future in this way of thinking, I'm not sure about the exact direction. The paper seems to rely heavily on time data, which is not a TERRIBLE idea but can lead to a dependent relationship between the computer and specific users. The idea that people draw things in a specific order is great in theory, but sketches are usually used in a design process, and when designers design something "new" they don't always think linearly. For example, an architect usually draws the outer structure of a house before putting in the interior walls, to make sure everything fits. But when he gets free rein to build something from scratch, rooms will be continuously added to an existing building, changing the outer wall again and again. I don't think these problems would make the system completely useless, but more work would be needed to keep a domain-independent status.