EidolonSpeak.com ~ Artificial Intelligence


AI & Artificial Cognitive Systems

Voices from the AI/CSE Community:

 

“Consider homogeneous coordinate transformations (in computer graphics), where the linear nature of the primitive operations (scaling, rotation, and translation) allows any sequence of them to be "compiled" into a single matrix multiplication. The field of AI has not, to date, produced any compiling methods which can rival this speedup, because most interesting AI problems are nonlinear and most interesting AI representations are not numeric. The point is that given suitable representations, efficient non-linear mapping engines could generate significant speed improvements for inferential processing…Here is a conundrum for theories of human and machine learning: Which came first, the mental procedure or the mental representation? Minsky and Papert claimed that the representational egg must come before the procedural chicken, while Fodor and Pylyshyn claimed to intimately know the egg and, by extension, the exclusive class of fertile chickens. The flip side, of course, is that this perfect egg may only be layable by an impossible chicken: A formal representational theory, specified without consideration of its own genesis, may not be learnable by any mechanism in principle. This…points to a biologically certified way out of the dilemma: Co-Evolution. The representations and their associated procedures develop slowly, responding to each other’s constraints through a changing environment…given a sufficiently powerful form of learning, a machine can learn to efficiently perform a task by example, rather than by design. Taken together, these ideas suggest that, given a task, specified by example, which requires embedded representations, a network might be able to develop these representations itself…Currently, symbolic systems use information-free "atoms" which physically combine (through bit or pointer concatenation) in a completely unrestricted fashion. Thus, for any domain, a syntax is required to restrict those "molecules" after the fact, to the set of semantically interpretable ones. With further work, recursive distributed representations might undergo a metamorphosis into symbols which contain their own meanings and physically combine only in a systematic fashion. After all, real atoms and molecules do so all the time.” –J.B. Pollack, [13]
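Pollack's graphics analogy can be made concrete. The following is a minimal sketch in Python/NumPy (illustrative only; the function names and numeric values are not from the quoted paper), showing how a sequence of homogeneous-coordinate primitives is "compiled" into one matrix, after which applying the whole pipeline to a point costs a single matrix multiplication:

    import numpy as np

    def scaling(sx, sy):
        # 3x3 homogeneous scaling matrix
        return np.diag([sx, sy, 1.0])

    def rotation(theta):
        # 3x3 homogeneous rotation matrix (counter-clockwise by theta radians)
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s, 0.0],
                         [s,  c, 0.0],
                         [0.0, 0.0, 1.0]])

    def translation(tx, ty):
        # 3x3 homogeneous translation matrix
        return np.array([[1.0, 0.0, tx],
                         [0.0, 1.0, ty],
                         [0.0, 0.0, 1.0]])

    # "Compile" scale-then-rotate-then-translate into a single matrix.
    compiled = translation(2.0, 1.0) @ rotation(np.pi / 2) @ scaling(3.0, 3.0)

    p = np.array([1.0, 0.0, 1.0])   # the point (1, 0) in homogeneous coordinates
    step_by_step = translation(2.0, 1.0) @ (rotation(np.pi / 2) @ (scaling(3.0, 3.0) @ p))
    one_shot = compiled @ p          # same result, one multiplication per point
    assert np.allclose(step_by_step, one_shot)

Nothing comparably cheap exists for composing nonlinear inference steps, which is the gap the quote is pointing at.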

 

“A major problem for any cognitive system is the capacity for, and the induction of, the potentially infinite structures implicated in faculties such as human language and memory. Classical cognitive architectures handle this problem through finite but recursive sets of rules, such as formal grammars (Chomsky, 1957). Connectionist architectures, while yielding intriguing insights into fault-tolerance and machine learning, have, thus far, not handled such productive systems in an adequate fashion.

So, it is not surprising that one of the main attacks on connectionism, especially on its application to language processing models, has been on the adequacy of such systems to deal with apparently rule-based behaviors and systematicity. I had earlier discussed precisely these challenges for connectionism, calling them the generative capacity problem for language, and the representational adequacy problem for data structures. These problems are actually intimately related, as the capacity to recognize or generate novel language relies on the ability to represent the underlying concept. A fixed-width representation of variable-sized symbolic trees leads immediately to the implication that simple forms of neural-network associative memories may be able to perform inferences of a type that are thought to require complex machinery such as variable binding and unification. But when we take seriously the infinite part of the representational adequacy problem, we are led into a strange intellectual area. It is my working hypothesis that alternative activation functions (i.e. other than the usual sigmoidal or threshold), based on fractal or chaotic mathematics, are the critical missing link between neural networks and infinite capacity systems. The bifurcation between structure and form which leads to the near universality of discrete symbolic structures with ascribed meanings has led to a yawning gap between cognitive and perceptual subareas of AI. This gulf can be seen between such fields as speech recognition and language comprehension, early versus late vision, and robotics versus planning. The low-level tasks require numeric, sensory representations, while the high-level ones require compositional symbolic representations. (It is no surprise, then, that neural networks are much more successful at the former tasks.) The idea of infinitely regressing symbolic representations which bottom-out at perception has been an unimplementable folk idea ("Turtles all the way down") in AI for quite some time. The reason for its lack of luster is that the amount of information in such a structure is considered combinatorially explosive. Unless, of course, one considers self-similarity to be an information-limiting construction. The implication for inductive inference is that while, formally, push-down automata and Turing machines are necessary for recognizing harder classes of languages, such as context-free or context-sensitive, respectively, the idiosyncratic state-table and external memory of such devices make them impossible to induce. On the other hand, chaotic dynamical systems look much like automata, and should be about as hard to induce. The infinite memory is internal to the state vector, and the finite-state-control is built into a more regular, but non-linear, function.” –J.B. Pollack, [14]
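Pollack's hypothesis about fractal or chaotic activation functions is easier to grasp with a toy iteration. The sketch below (illustrative Python, not Pollack's construction; the update rules and constants are arbitrary choices) contrasts a contracting sigmoidal update, which settles into a fixed point and thereafter carries no new information, with the logistic map at r = 4, whose state keeps visiting new values indefinitely, a crude hint of how an "infinite memory internal to the state vector" might behave:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def logistic_map(x, r=4.0):
        # A standard chaotic map; for r = 4 generic trajectories never settle down.
        return r * x * (1.0 - x)

    x_sig, x_chaos = 0.3, 0.3
    trajectory_sig, trajectory_chaos = [], []
    for t in range(50):
        x_sig = sigmoid(2.0 * x_sig - 1.0)   # contracting update: converges to a fixed point
        x_chaos = logistic_map(x_chaos)      # chaotic update: keeps producing new states
        trajectory_sig.append(round(x_sig, 4))
        trajectory_chaos.append(round(x_chaos, 4))

    print("last 5 sigmoid states: ", trajectory_sig[-5:])    # identical values
    print("last 5 logistic states:", trajectory_chaos[-5:])  # still changing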

 

“The proponents of symbolism picture the human brain as a digital processor of symbolic information, and argue that computational models of the brain should be based on algorithmic programs manipulating symbols. This is the traditional school of thought, antagonistic to connectionism, which denies the validity of connectionist models altogether and doesn't credit them with any explanatory potential. A halfway position between these two opposite views is that of implementational connectionists (e.g. Fodor, Pinker and Pylyshyn), who admit the utility of ANNs in modeling cognitive processes, but hold that they should be employed ultimately to implement symbolic processing (“the mind is a neural net; but it is also a symbolic processor at a higher and more abstract level of description”). According to them, research on models should be conducted at the symbolic (psychological) level, whereas ANNs are the tools through which these models are implemented in practice.

Fodor raised a well-known argument against the adequacy of connectionism as a model of the mind, based on a characteristic of human intelligence, which he called systematicity. Neural networks, he said, are good at capturing associations, but they alone cannot account for higher cognitive abilities required, for instance, for human language. Still another main criticism against connectionist models of language is based on the compositionality of language (the meaning of a complex statement can be decomposed in terms of the individual meanings of its simpler constituents). As if to contest this challenge launched against connectionism about the recursive nature of language, Pollack devised a neural network architecture that was well-suited to represent recursive data structures, such as trees and lists: the recursive auto-associative memory (RAAM). Due to their ability to represent recursive data structures, RAAM networks are useful for working with syntactic and semantic representations in NLP applications.” –J. Poveda and A. Vellido [15]
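A minimal structural sketch of the RAAM idea, in Python/NumPy, may help here. The weights below are random and untrained (a real RAAM trains the encoder and decoder jointly, as an auto-associator, so that decoding reproduces the children that were encoded), but the sketch shows the property Pollack exploited: a tree of any shape is folded bottom-up into a single vector of fixed width, and the same machinery can in principle unfold it again:

    import numpy as np

    rng = np.random.default_rng(0)
    d = 8                                   # fixed width of every representation

    # Untrained encoder/decoder weights, shown only for shape; training would make
    # decode(encode(left, right)) approximately reproduce (left, right).
    W_enc = rng.normal(scale=0.1, size=(d, 2 * d))
    W_dec = rng.normal(scale=0.1, size=(2 * d, d))

    def encode(left, right):
        # Compress two d-dimensional children into one d-dimensional parent.
        return np.tanh(W_enc @ np.concatenate([left, right]))

    def decode(parent):
        # Expand a parent vector back into (approximate) left and right children.
        out = np.tanh(W_dec @ parent)
        return out[:d], out[d:]

    # Terminal symbols get arbitrary fixed d-dimensional codes.
    A, B, C = (rng.normal(size=d) for _ in range(3))

    # Fold the binary tree ((A B) C) bottom-up; however large the tree, the result
    # is always a single d-dimensional vector.
    tree = encode(encode(A, B), C)
    left, right = decode(tree)              # once trained, this would recover (A B) and C
    print(tree.shape, left.shape, right.shape)   # (8,) (8,) (8,)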

 

“There has been much debate about whether our perceptual abilities should be attributed to a few million generations of blind evolution or to a few hundred million seconds of visual experience. Evolutionary search suffers from an information bottleneck because fitness is a scalar, so my bet is that the main contribution of evolution was to endow us with a learning algorithm that could make use of high-dimensional gradient vectors. These vectors provide millions of bits of information every second, thus allowing us to perform a much larger search in one lifetime than evolution could perform in our entire evolutionary history.

So what is this magic learning algorithm? I have been involved in attempts to answer this question using undirected graphical models [Boltzmann machines], directed graphical models [Deep belief networks], or no graphical model at all. These attempts have failed as scientific theories of how the brain learns because they simply do not work well enough.” –G.E. Hinton, What kind of a graphical model is the brain? [17]
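Hinton's contrast between a scalar fitness signal and a high-dimensional gradient can be illustrated with a toy comparison (Python/NumPy; the problem, step sizes, and evaluation budget are arbitrary assumptions, not from the cited paper). Both searches get the same number of evaluations of a simple quadratic loss in 1,000 dimensions; the mutate-and-select loop learns roughly one accept/reject bit per evaluation, while gradient descent extracts a 1,000-number gradient each time and converges almost immediately:

    import numpy as np

    rng = np.random.default_rng(0)
    d = 1000
    target = rng.normal(size=d)

    def loss(w):
        # Quadratic objective; lower is better.
        return float(np.sum((w - target) ** 2))

    evals = 200

    # Evolution-style search: each evaluation yields only a scalar fitness value.
    w_evo = np.zeros(d)
    for _ in range(evals):
        candidate = w_evo + 0.05 * rng.normal(size=d)
        if loss(candidate) < loss(w_evo):    # keep the mutation only if it helps
            w_evo = candidate

    # Gradient descent: each evaluation yields a full d-dimensional gradient.
    w_gd = np.zeros(d)
    for _ in range(evals):
        grad = 2.0 * (w_gd - target)
        w_gd = w_gd - 0.1 * grad

    print("initial loss:          ", round(loss(np.zeros(d)), 1))
    print("mutate-and-select loss:", round(loss(w_evo), 1))
    print("gradient-descent loss: ", round(loss(w_gd), 6))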

 

“There are several reasons for believing that our visual systems contain multilayer generative models in which top-down connections can be used to generate low-level features of images from high-level representations, and bottom-up connections can be used to infer the high-level representations that would have generated an observed set of low-level features. Single cell recordings and the reciprocal connectivity between cortical areas both suggest a hierarchy of progressively more complex features in which each layer can influence the layer below. Vivid visual imagery, dreaming, and the disambiguating effect of context on the interpretation of local image regions also suggest that the visual system can perform top-down generation.” –G.E. Hinton [17]
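A minimal sketch of the kind of architecture Hinton describes, written in Python/NumPy with untrained random weights (the layer sizes, weight scales, and sampling choices are illustrative assumptions): one set of top-down generative connections that produces low-level features from high-level causes, and a separate set of bottom-up recognition connections that infers those causes from observed features. A learning procedure such as wake-sleep or contrastive divergence would be needed to make the two directions consistent with each other:

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    n_hidden, n_visible = 20, 100

    # Separate top-down (generative) and bottom-up (recognition) weight matrices.
    W_gen = rng.normal(scale=0.5, size=(n_visible, n_hidden))
    W_rec = rng.normal(scale=0.5, size=(n_hidden, n_visible))

    def generate():
        # Top-down pass: sample high-level causes, then the low-level features
        # (a binary "image") that those causes would produce.
        h = (rng.random(n_hidden) < 0.5).astype(float)
        v = (rng.random(n_visible) < sigmoid(W_gen @ h)).astype(float)
        return h, v

    def infer(v):
        # Bottom-up pass: estimate which high-level causes generated the
        # observed low-level features.
        return sigmoid(W_rec @ v)

    h_true, v = generate()
    h_guess = infer(v)
    print(v.shape, h_guess.shape)   # (100,) (20,)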

 

“One long term goal of Machine Learning (at least for some of us in the ML community) is to devise sufficiently general and powerful learning methods that can learn an entire recognition task from end to end with a minimal amount of labeled samples. We hinted at the fact that currently popular ‘shallow’ models, such as kernel methods, fall short. If we want to scale the applicability of learning methods to AI-like tasks, we must concentrate our effort on solving the still-unsolved ‘deep learning problem’.” –Y. LeCun et al., [18]

 

“Lower level abstractions are more directly tied to particular percepts, whereas higher level ones are what we call “more abstract” because their connection to actual percepts is more remote, and through other, intermediate-level abstractions…The focus of deep architecture learning is to automatically discover such abstractions, from the lowest level features to the highest level concepts. Ideally, we would like learning algorithms that enable this discovery with as little human effort as possible.” –Y. Bengio [19]

 

“The neural network of the brain is a system that works under several constraints. The cost-minimizing principle, by which the second boom of neural network research was triggered, is one such constraint, but there must be many other important constraints. We have now come to the stage of uncovering new principles that control the neural networks of the brain. It is a very attractive research area, one that leads not only to an explication of the mechanism of the brain, but also to the invention of design principles for intelligent information processing systems of the next generation…The relationship between modeling neural networks and neurophysiology resembles that between theoretical physics and experimental physics.” –K. Fukushima [20]

 

References and Endnotes:

 

[13] “Recursive Distributed Representations,” J.B. Pollack, Artificial Intelligence, vol. 46 (1), p. 77, Nov. 1990.

 

[14] “Implications of Recursive Distributed Representations,” J.B. Pollack, Advances in Neural Information Processing Systems I, 1989.

 

[15] “Neural Network Models for Language Acquisition: A Brief Survey,” J. Poveda and A. Vellido, Lecture Notes in Computer Science, vol. 4224, p. 1346, 2006.

 

[16] “Learning Multiple Layers of Representation,” G.E. Hinton, Trends in Cognitive Sciences, vol. 11, p. 428, 2007.

 

[17] “What Kind of a Graphical Model is the Brain?” G.E. Hinton, International Joint Conference on Artificial Intelligence, 2005.

 

[18] “Energy-Based Models in Document Recognition and Computer Vision,” Y. LeCun et al., Ninth Intl. Conf. on Document Analysis and Recognition, 2007.

 

[19] “Learning Deep Architectures for AI,” Y. Bengio, Foundations and Trends in Machine Learning, vol. 2 (1), 2009.

 

[20] “Modeling Neural Networks for Artificial Vision,” K. Fukushima, Neural Information Processing, vol. 10, 2006.
