EidolonSpeak.com ~ Artificial Intelligence


AI & Artificial Cognitive Systems


The State of AI, Part 4: Intelligent Agents, Robotics and Brain-Machine Interfaces


 

The state of artificial intelligence, cognitive systems and consumer AI. Part IV

 

Author: Susanne Lomatch

 

An intelligent agent (IA) is defined as an autonomous system that exhibits goal-directed intelligence. It is also defined as a rational agent or actor, in the sense of a rational decision maker specified by the von Neumann-Morgenstern expected utility theorem. Applications for IAs are significant, and include robotics and autonomous system control, data mining, trading, and automated virtual assistants.
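
To make the rational-agent notion concrete, the sketch below (Python) chooses the action that maximizes expected utility over a small discrete set of world states. The delivery-robot scenario, state probabilities and utility values are hypothetical and purely illustrative, not drawn from any system discussed here.

# Minimal sketch of a von Neumann-Morgenstern style rational agent: choose the
# action whose expected utility over possible world states is highest.
# The actions, state probabilities and utilities below are invented for illustration.

def expected_utility(action, state_probs, utility):
    """Sum of P(state) * U(action, state) over a discrete set of states."""
    return sum(p * utility[(action, state)] for state, p in state_probs.items())

def rational_choice(actions, state_probs, utility):
    """Return the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(a, state_probs, utility))

# Hypothetical example: a delivery robot deciding between a short and a long route.
state_probs = {"clear": 0.7, "blocked": 0.3}             # agent's beliefs about the world
utility = {("short", "clear"): 10, ("short", "blocked"): -5,
           ("long", "clear"): 6, ("long", "blocked"): 6}
print(rational_choice(["short", "long"], state_probs, utility))   # -> "long" (safer under these beliefs)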

 

IAs are based on agent models (single or multi-agent) that by themselves may not exhibit autonomous, goal-directed intelligent behavior or action; such behavior is an emergent property of an IA system architecture that includes, at a minimum, knowledge representation and acquisition, machine learning and reasoning, control, and planning and decision making, and, for many applications, sensory input and active output at a level defined by the specific goal(s) of the system.

 

Cognitive agents and architectures are sometimes classified as a subset of IAs and agent architectures, but this hierarchy ought to be reversed, or the two should be viewed separately. For example, human cognitive function is not that of a strict rational agent [1], though the two may intersect: cognitive agents can at times act as strict rational agents. Most IAs and their applications do not require affective states or models, so it may even make sense to impose a hierarchical ordering here: IAs – affective IAs – cognitive systems. (An interesting paper discussing artificial affective agents can be found here: [2].)

 

Robotics focuses on motor skills tied to external sensory and motor systems, and though intelligence is required in the “brain” or controls of such a system, that intelligence need not encompass the full range found in a mammalian neocortical system. Robots need not recognize language or speech, communicate, solve difficult problems, or think; machine learning is very important, but natural language processing and higher-level reasoning are not. Android or “humanoid” robots may incorporate some of these capabilities as technologies and system engineering capabilities improve. A timeline (as of 2011) of humanoid-type robotic developments can be found HERE.

 

One of the limitations surrounding traditional robotics is that it scales very poorly – motor and battery technologies have matured slowly, without major breakthroughs. Designers must start to look at novel, challenging approaches, such as artificial muscles (e.g., electroactive polymers or carbon nanotubes), and efficient, low-power control processing, such as that offered by neuromorphic ASICs or FPGAs (see my comments on this in Part 3).

 

As an insight, general neocortical-inspired algorithmic frameworks such as hierarchical temporal memory (HTM) presented in the last review (Part 3) may be quite useful in developing complex robotics systems that integrate multiple sensors, motor controls, speech and language processing, and autonomous reasoning (recall HTM is a deep machine learning approach).

 

Brain-machine interfaces (BMIs), once a science fiction concept, are finally becoming a reality. On the noninvasive scale, relatively primitive “thought reading” or “mind reading” of conscious motion control, and the reconstruction of static and dynamic visual imagery in early and later stage sensory areas, have been demonstrated using EEG, fMRI, NIRS and sophisticated signal processing techniques. On the invasive scale, after years of groundbreaking research on monkey and rat cortical BMI implants, it appears quite possible to engineer and implant an invasive BMI to enable conscious (robotic) motor control for human para- or quadriplegics or patients with locked-in syndrome. Another initial (and recently demonstrated) application for invasive BMIs in humans includes monitoring brain seizures (e.g. epileptic), using advanced electrocorticographic (ECoG) devices. As seizure patterns from these devices are uncovered and understood with respect to the neocortical architecture, a similar BMI can be used to treat patients, much like a pacemaker. Invasive techniques have also been used on epileptics to reconstruct speech from the auditory cortex.

 

As an imaging tool, BMIs using advanced single neuron or local field potential devices may help us to better understand the neocortical-old brain architecture in detail (neural-synaptic firing patterns and locations, improved classification of neural cells, synapses, axons and dendrites, and most importantly, functional and anatomical connectivity and memory).

 

It is not difficult to imagine a BMI that would serve multiple purposes, such as providing increased memory and processing capacity, or a restorative memory/processing center for vision, hearing, speech, motor control, cross-modal cognitive function, or even novel prosthetic sensors, such as sonar, infrared, or enhanced smell. Training the brain to interface to an implantable machine or robotics is key. Advances in BMI technology require the span of multiple fields of research: biomedicine, neurology and cognitive neuroscience to bioengineering and neuromorphic engineering to materials science to electrical and mechanical engineering to robotics. As I argued in the initial essay of this series, Part 1, cognitive prosthetic BMIs are the most speculative area of neuroprosthetics – and share a great deal of potential coexistence with cognitive systems and architectures in AI.

 

Links to specific reviews in Part 4, located in this document:

      Intelligent Agents (including various autonomous systems and implementations)

      Honda’s ASIMO

      Brain-Machine or Brain-Robotic Interfaces

      References and Endnotes

 

(Disclaimer: The reviews in this article are meant to inform and entertain, and contain a healthy dose of critical appraisal. I encourage readers who find factual errors, or who have alternative intelligent appraisals and opinions, to contact me (contact link below). I will include any substantial feedback on the dialogue site area dedicated to AI (Click HERE for link to the AI dialogue area). I also welcome constructive and friendly comments, suggestions and dialogue.)

 

Intelligent Agents

 

Classification: Intelligent Agents and Robotics (Autonomous)

 

To illustrate the range of application of intelligent agents (IAs), I review three extremes: the driverless, autonomous vehicle system that won the 2007 DARPA Grand Challenge (a single agent IA), self-reconfiguring modular robots and self-replicating machines (single or multi-agent IA), and swarm robotics including nanorobots (multi-agent IA).

 

The DARPA Grand Challenge was a multiyear prize competition, held from 2004 to 2007, that fielded sets of challengers to design and build fully driverless, autonomous ground vehicle systems able to navigate and complete a substantial off-road course within a limited time. In the last event, the 2007 Urban Challenge, the course covered 96 km (60 mi) of urban area and had to be completed in less than 6 hours. Rules included obeying all traffic regulations while negotiating with other traffic and obstacles and merging into traffic.

 

The winning team, Tartan Racing (CMU, GM, Cat, ContiAG) with vehicle “Boss,” published a rough schematic of the system and their multi-modal approach [3]. The IA aspects of their system include: mission planning (determines an efficient route through an urban network of roads), behavior generation (executes the route through the environment, adapting to local traffic and exceptional situations as necessary), motion planning (safeguards the robot by considering the feasible trajectories available along the route, and selecting the best option), and perception/data fusion/detection (combines data from lidar, radar and vision systems to estimate the location of other vehicles, static obstacles and the shape of the road). The system also included “mechatronic” alterations that allowed for vehicle automation, auxiliary power, computing and sensor mounting.

 

Probably the most impressive part of the Tartan/Boss system is the behavior generation, which is integrally interfaced to the perception/data fusion/detection system. The situational assessment of the latter included estimating the “intention” of the tracked object (an obstacle or another vehicle). The planning and behavior modules also included a mechanism for predicting the potential positions of other vehicles 10 seconds into the future.
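
The published report does not give the predictor in code form; as a rough illustration, the idea of projecting a tracked object's position into the future can be sketched with a constant-velocity model (a hypothetical stand-in for Boss's actual, more sophisticated prediction machinery):

# Minimal sketch of projecting a tracked vehicle's position into the future, in the
# spirit of Boss's 10-second lookahead. A constant-velocity model is used as a simple
# stand-in; the actual Tartan Racing predictor is more sophisticated.

import numpy as np

def predict_positions(position, velocity, horizon_s=10.0, dt=0.5):
    """Predicted (x, y) positions at each dt step out to horizon_s, assuming the
    tracked object holds its current velocity."""
    position = np.asarray(position, dtype=float)
    velocity = np.asarray(velocity, dtype=float)
    times = np.arange(dt, horizon_s + dt, dt)
    return position + np.outer(times, velocity)        # shape: (n_steps, 2)

# Hypothetical tracked vehicle: at (50 m, 10 m), moving 8 m/s along x.
future = predict_positions([50.0, 10.0], [8.0, 0.0])
print(future[-1])                                      # position ~10 s ahead: [130., 10.]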

 

Tartan/Boss completed the course in 4 h 10 min at an average speed of 14 mph. The second-place challenger, Stanford/Junior, finished in roughly 4 h 30 min at 13.7 mph. Of the 11 competing teams, five did not finish the course, and only four finished within the specified 6 hours. I recommend readers give [3] a read to get a flavor of what a successful autonomous vehicle system with IA elements entails. One can imagine that a more generalized system would require greater capacity for learning and memory of encountered “situations,” and would need to control behavior or action with planning and decisions based on situational assessments (estimation, prediction).

 

Self-reconfiguring modular robots (SRMRs) are autonomous kinematic machines with variable morphology, capable of changing their overall structural shape using their own system of control – via actuators, or by chemical, biological or stochastic means. The IA aspect enters in how such systems interact with the environment to deliberately change their own shape by rearranging the connectivity of their modular parts, in order to adapt to new circumstances, perform new tasks, or recover from damage. Applications for such a system might include situational awareness, surveillance or exploration in variable and harsh environments, where an adaptable morphology could be an advantage. Architectures for such systems can include different structural geometries (lattice, chain, hybrid; distributed, parallel, serial), different structural reconfiguration control (deterministic, nondeterministic or stochastic), and different modular design (homogeneous, heterogeneous). An interesting implementation is stochastic cellular robotics, in which units self-assemble or self-reconfigure via passive, stochastic self-organization [4]. Such solid-state cellular units exploit ‘Brownian motion’ in their environment and require no local power or locomotion ability. Another implementation demonstrates modular self-disassembly. (Think of this novel application: a large shipment that self-disassembles into deliverables that then automatically find their way to an end destination, and may even disassemble further along the way.) An implementation with consumer commercial potential is “Roombots,” modular robotics for adaptive and self-organizing furniture (click on link to see some examples), though I’m not sure how intelligent these things really are.

 

In Part 3 I reviewed a novel reconfigurable computing technology, memristive nanodevices, which allow for synaptic plasticity of a computational network implementation. Though this approach has limitations, such as a viable integrated encoding and processing neuronal unit that is also reconfigurable, solutions that address those limitations may allow for fully reconfigurable networks.

 

Self-replicating machines (SRMs) are defined as capable of autonomously manufacturing a copy of themselves using raw materials taken from their environment. A classic example is that of von Neumann’s universal constructor, a self-replicating machine that would operate in a cellular automata environment. More recent examples can be found in [5], though these are ostensibly non-biological systems. The caveat to these systems is that they are not inherently “intelligent.” The IA factor must be added somehow to their “DNA,” which defines their anatomical and functional structure and evolutionary capability (based on learning and memory). The same issue applies to SRMRs.

 

Swarm intelligence (SI) is defined as the collective behavior of decentralized, self-organized systems, natural or artificial. Swarm robots (SRs) employ SI, and may even be SRMRs (e.g., the stochastic cellular robotic system described above) or SRMs. A popular example of late is nanorobotics: systems of nanoscale robots that interact to perform microscopic, mesoscopic or even macroscopic tasks [5].
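
To illustrate the decentralized flavor of SI, the sketch below has each agent update its position using only the positions of nearby neighbors, with no central controller; the aggregation rule and its parameters are invented for illustration and do not model any specific system mentioned above.

# Minimal sketch of decentralized swarm behavior: each agent moves toward the average
# position of its nearby neighbors, using only local information (no global controller).
# The rule and parameters are illustrative, not a model of a particular swarm system.

import numpy as np

def step(positions, radius=3.0, gain=0.1):
    """One update: every agent nudges itself toward the centroid of neighbors within 'radius'."""
    new_positions = positions.copy()
    for i, p in enumerate(positions):
        dists = np.linalg.norm(positions - p, axis=1)
        neighbors = positions[(dists < radius) & (dists > 0)]
        if len(neighbors):
            new_positions[i] = p + gain * (neighbors.mean(axis=0) - p)
    return new_positions

rng = np.random.default_rng(0)
swarm = rng.uniform(0, 10, size=(30, 2))       # 30 agents scattered over a 10 x 10 area
for _ in range(200):
    swarm = step(swarm)
print(swarm.std(axis=0))                       # spread shrinks as agents form clusters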

 

The properties of autonomy, self-replication, self-reconfiguration, and reliability and redundancy (fault tolerance), coupled with goal-directed intelligence, are all important features for constructing a more generalized IA (single or multi-agent). Realizing this mix with a non-biological system is indeed a challenge.

 

One can see the conceptual and applied value of SRMRs, SRMs and SRs, from prosthetics (such as neuroprosthetics that may require neuroplastic features) to mega-structures in space (that become damaged by space junk or debris and must repair themselves for proper continual operation).

 

Honda’s ASIMO

 

Classification: Robotics (Autonomous) and Androids

 

Honda’s ASIMO probably represents the “state of the art” in commercial humanoid robotics, with its second generation sporting semi-autonomous capability.

 

According to Honda, ASIMO exhibits: “1) high-level postural balancing capability which enables the robot to maintain its posture by putting out its leg in an instant; 2) external recognition capability which enables the robot to integrate information, such as movements of people around it, from multiple sensors and estimate the changes that are taking place; 3) the capability to generate autonomous behavior which enables the robot to make predictions from gathered information and autonomously determine the next behavior without being controlled by an operator; 4) [an intelligent control system center that] comprehensively evaluates inputs from multiple sensors that are equivalent to the visual, auditory, and tactile senses of a human being, then estimates the situation of the surrounding environment and determines the corresponding behavior of the robot; 5) coordination between visual and auditory sensors enables ASIMO to simultaneously recognize a face and voice, enabling ASIMO to recognize the voices of multiple people who are speaking simultaneously; 6) a highly functional compact multi-fingered hand, which has a tactile sensor and a force sensor imbedded on the palm and in each finger, respectively, and which acts to control each finger independently. Combined with the object recognition technology based on visual and tactile senses, this multi-fingered hand enables the all-new ASIMO to perform tasks with dexterity, such as picking up a glass bottle and twisting off the cap, or holding a soft paper cup to pour a liquid without squishing it. Moreover, ASIMO is now capable of making sign language expressions which require the complex movement of fingers.”

 

Some of the technical specs for ASIMO are found HERE, with the latest model sporting 57 degrees of joint freedom (humans have > 690 joint DOF, 230 joints @ 3 DOF each).

 

However impressive ASIMO is, it is still very limited in at least two key capacities: autonomous learning and self-maintenance (recharging itself). Humans spend years learning at various levels, both assisted and autonomous – and this learning time is necessary for skill and personality development. I am optimistic that robotic systems can be built that accelerate this learning process, and the ability to replicate what one machine has learned across other machines is an advantage. Not everyone shares this optimism:

 

“Robots will kind of do one thing well, but we never will see a robot that makes a cup of coffee, never. I don't believe we will ever see it. Think of the steps that a human being has to do to make a cup of coffee and you have covered basically 10, 20 years of your lifetime just to learn it. So for a computer to do it the same way, it has to go through the same learning, walking [through] a house using some kind of optical with a vision system, stepping around and opening the door properly, going down the wrong way, going back, finding the kitchen, detecting what might be a coffee machine. You can't program these things, you have to learn it, and you have to watch how other people make coffee. This is a kind of logic that the human brain does just to make a cup of coffee. We will never ever have artificial intelligence. Your pet, for example, your pet is smarter than any computer.” –Steve Wozniak, co-founder of Apple Computer, Interview, August 2007.

 

No doubt advances in Intelligent Agent systems and architectures are required, as are advances in artificial muscles and low power control systems. Eyes on the prize, Woz.

 

Brain-Machine or Brain-Robotic Interfaces

 

Classification: AI Enablers

 

There are two general types of BMIs, noninvasive (nBMIs) and invasive (iBMIs). I review each separately.

 

Noninvasive BMIs (nBMIs)

 

NBMIs are sensor systems (arrays, scanning imagers) that measure neural signals external to the skull. Typical signals measured include electroencephalography (EEG) and all its waveforms and modal variants (slow cortical potential and P300), magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI).

 

EEG nBMIs are the most common and have been commercialized for applications ranging from the P300 Speller and BCI2000 from Cortech Solutions, which measure P300 signals to enable conscious cursor control of a screen-based keyboard for communication by ALS or locked-in patients, to computer gaming interfaces (I reviewed several of these nBMI gaming systems in Part 2). These are head-mounted systems with arrays of either dry or wet-contact EEG biosensors coupled with application-specific signal processing software.
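
The signal-processing core of a P300-style nBMI can be sketched very simply: EEG epochs time-locked to attended (“target”) stimuli are averaged so that the event-related potential near 300 ms stands out from noise. The synthetic single-channel data below are illustrative only and are not the Cortech/BCI2000 implementation.

# Minimal sketch of the idea behind P300 detection: average many EEG epochs time-locked
# to a stimulus so the event-related potential emerges from noise. Synthetic single-channel
# data; not the BCI2000/P300 Speller implementation.

import numpy as np

fs = 250                                    # sampling rate (Hz)
t = np.arange(0, 0.8, 1.0 / fs)             # 800 ms epoch
p300 = 5e-6 * np.exp(-((t - 0.3) ** 2) / (2 * 0.05 ** 2))    # ~5 uV bump near 300 ms

rng = np.random.default_rng(1)
target_epochs = p300 + 20e-6 * rng.standard_normal((100, t.size))    # attended stimuli
nontarget_epochs = 20e-6 * rng.standard_normal((100, t.size))        # ignored stimuli

avg_target = target_epochs.mean(axis=0)
avg_nontarget = nontarget_epochs.mean(axis=0)

# Crude detector: compare mean amplitude in the 250-400 ms window.
window = (t > 0.25) & (t < 0.40)
print(avg_target[window].mean(), avg_nontarget[window].mean())   # target window is clearly larger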

 

EEG-based nBMIs have many drawbacks, well cited in [6]: “The systems all rely on EEG measurements of the activities of millions of nerve cells – thereby making these approaches rather imprecise. The process could be compared to trying to hear the conversation between two individuals sitting in a packed sports stadium with the use of a directional microphone located in the parking lot. Wouldn’t it be much more practical to listen in to the conversations among nerve cells from a closer range?”

 

MEG arrays and fMRI scanning imagers offer more precise, spatially resolved noninvasive neurofeedback; however, these systems are bulky and expensive, making them unsuitable candidates for practical BMIs, though they remain useful tools in research and clinical diagnostic settings.

 

MEG (and EEG) signals derive from the net effect of ionic currents flowing in the dendrites of neurons during synaptic transmission: MEG is thought to derive from intracellular ionic currents, and EEG from extracellular ionic currents. However, MEG signals are very weak (10-1000 fT – research indicates that approximately 50,000 active neurons are needed to generate a detectable MEG signal), and therefore the detection technology is based on arrays of sensitive superconducting quantum interference devices (SQUIDs). MEG and EEG have comparable temporal resolution, <=1 millisecond (vs. 1000s of ms for fMRI). Since the MEG signal is less distorted by the skull and scalp than EEG, it offers better spatial resolution, albeit over more limited spatial/functional regions of the brain, because MEG detects only the tangential components of the underlying currents, whereas EEG detects both radial and tangential components. The power of MEG appears to be in its combined use with EEG and fMRI to form a more complete image of neural activity. Clinical use for epileptic surgery is one immediate application, as MEG offers a noninvasive route to localizing epileptogenic tissue and tumors.

 

FMRI scanning imagers measure changes in blood flow as a result of neural activity and dominate brain mapping research in identifying functional neural regions linked to sensing, speaking, moving, learning, memory and higher cognitive functions such as planning, decision making and language. Unlike EEG/MEG, which are biased toward the cortical surface, fMRI can record functional events from most/all spatial regions. The primary implementation of fMRI, blood-oxygen level dependent (BOLD) contrast, measures hemodynamic response (HDR) from neuronal events to millimeter spatial resolution and 1-2 second temporal resolution. Most fMRI experiments study brain processes lasting a few seconds, with the study conducted over some tens of minutes.
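
A common way this relationship is modeled can be sketched by convolving a stimulus time course with a canonical double-gamma hemodynamic response function; the parameter values below are widely used defaults and are my assumption rather than anything specified in this article.

# Minimal sketch of the standard BOLD model: convolve a stimulus time course with a
# canonical double-gamma hemodynamic response function (HRF). The parameter values are
# commonly quoted defaults and are an assumption here, not taken from the article.

import numpy as np
from math import gamma

def hrf(t, a1=6.0, a2=16.0, b=1.0, c=1.0 / 6.0):
    """Canonical double-gamma HRF sampled at times t (seconds)."""
    g = lambda tt, a: (tt ** (a - 1) * np.exp(-tt / b)) / (b ** a * gamma(a))
    return g(t, a1) - c * g(t, a2)

tr = 2.0                                   # repetition time (s), typical for fMRI
t = np.arange(0, 32, tr)                   # 32 s of HRF sampled at the TR
stimulus = np.zeros(60)                    # 2-minute run, one brief event every 20 s
stimulus[::10] = 1.0

predicted_bold = np.convolve(stimulus, hrf(t))[:stimulus.size]
print(predicted_bold.round(3))             # slow responses peaking a few seconds after each event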

 

FMRIs have been used with sophisticated signal detection algorithms to reconstruct (“decode”) cortical representations of static and dynamic visual imagery, along with other sensory stimuli.

 

A group led by the ATR Computational Neuroscience Labs was able to reconstruct “constraint-free” static visual images in the brain by decoding BOLD contrasts associated with correlated fMRI activity, suggesting that information representation in the brain can be discerned from multivoxel fMRI patterns [7]. This work differed from previous fMRI studies, which predicted a subjective perceptual state from fMRI activity patterns through statistical classification-decoding, relying either on learning the mapping between a brain activity pattern and a stimulus category from a training data set, or on classifying brain activity into prespecified categories. Such a simple classification approach is insufficient to capture the complexity of more constraint-free visual imagery as it is perceived. This work focused on decoding fMRI signals from the “early” visual cortex (primarily V1, V2, V3), motivated by the idea that subjective states elicited without sensory stimulation, such as visual imagery, illusions, and dreams, occur in the early visual cortex, consistent with the retinotopy map. The results suggested that the early visual cortex, particularly V1, represents the visual field not just by its ordered retinotopic mapping, but also by correlated activity patterns. To achieve higher accuracy, multiscale reconstruction was performed (a visual image is thought to be represented at multiple spatial scales in the visual cortex, which may serve to retain visual sensitivity to fine-to-coarse patterns at a single visual field location).

 

A group at Berkeley followed up this work by measuring fMRI signals from early and anterior visual cortex to reconstruct more complex natural images, using a Bayesian decoder that produced accurate reconstructions by combining a structural encoding model that characterizes responses in early visual areas, a semantic encoding model that characterizes responses in anterior visual areas, and prior information about the structure and semantic content of natural images [8]. As stated by the authors, the Bayesian framework makes it possible to disentangle the contributions of functionally distinct visual areas and of prior information to reconstructing the structural and semantic content of natural images. The results showed that much of the variance in the responses of anterior visual areas (specifically areas anterior to the intermediate visual areas V3A/V3B/V4/posterior lateral occipital, i.e., the anterior lateral occipital and the anterior occipital cortex) to complex natural images is explained by the semantic category of the image alone. (This perhaps adds evidence that the anterior areas of the visual cortex process “higher level” or “higher order” information in a hierarchical paradigm.)
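
The logic of this kind of Bayesian decoder can be caricatured as scoring a set of candidate images by the likelihood of the measured voxel responses under an encoding model, combined with a prior over images. The linear encoding weights, Gaussian noise model and candidate set below are toy stand-ins, not the published structural/semantic models.

# Toy sketch of the Bayesian decoding logic in [8]: rank candidate images by
# posterior = likelihood of the measured voxel responses under an encoding model
# x prior over images. The linear encoding weights, Gaussian noise model and the
# candidate image set below are stand-ins, not the published model.

import numpy as np

rng = np.random.default_rng(2)
n_voxels, n_features = 50, 20
W = rng.standard_normal((n_voxels, n_features))       # assumed linear encoding model
sigma = 0.5                                           # assumed voxel noise std

def log_likelihood(voxel_responses, image_features):
    predicted = W @ image_features
    return -0.5 * np.sum((voxel_responses - predicted) ** 2) / sigma ** 2

candidates = rng.standard_normal((200, n_features))   # prior support: candidate image features
log_prior = np.full(len(candidates), -np.log(len(candidates)))   # uniform prior

true_image = candidates[17]
measured = W @ true_image + sigma * rng.standard_normal(n_voxels)

log_posterior = np.array([log_likelihood(measured, c) for c in candidates]) + log_prior
print(int(np.argmax(log_posterior)))                  # -> 17 (the true candidate, typically)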

 

More recently, the Berkeley group resolved/decoded dynamic visual imagery using a model that describes fast visual information and slow hemodynamics (standard BOLD fMRI signals) as separate components, revealing how information from dynamic imagery (movies) is represented at multiple scales [9]. Their results were improved by using Bayesian inference techniques to make predictions about the perceived imagery.

 

However impressive it is to use fMRI and encoding/decoding models to predict information about neural activity and percept stimuli, and to learn how information is represented in the brain (see [8] for a complete review of these fMRI encoding/decoding techniques from multiple groups), the technique is limited by the technical issues surrounding fMRI data, namely noise and spatiotemporal resolution. FMRI is not able to accurately resolve activity within cortical layers; we know from invasive research that there is hierarchical processing of information between layers and between functional areas. Resolving this complexity to “decode” the what, where and how of information represented (“encoded”) in the brain at various levels of processing is key to more accurate reconstruction of sensed information and of “thoughts” of such information. FMRI at best gives us a relatively qualitative view of functional network reconstruction, compared to what might be accomplished with other more precise and/or invasive techniques.

 

On that note, I have found one study [27] that integrated a large volume of fMRI data from multiple studies, and through “hypothesis-free” data analysis the authors concluded that there exists a global subdivision of the cortical topography into extrinsic and intrinsic processing systems. This study followed an earlier study constrained by pre-determined hypotheses, in which the authors proposed that the human cerebral cortex could be envisioned as a hierarchy of neuroanatomical subdivisions, starting from large networks of areas at the most global scale and ending in columnar subdivisions within individual areas. In the earlier study, the authors found that the widespread cortical activity elicited during free viewing of a movie subdivided the posterior cortical mantle into two major networks. The first network, labeled extrinsic, included major parts of sensory–motor cortex and was robustly activated by the natural movie stimuli. Embedded among the externally activated areas of the extrinsic network, the authors found “islands” that formed a different, complementary network. While activity in these regions was not driven by external stimulation, it showed a high degree of interregional correlation, which suggested a common function. Since this activity was seemingly internally driven, the authors labeled this network the “intrinsic” system; it overlaps substantially with the “default-mode” network and is likely associated with inner-oriented processing. The major components of the intrinsic system included medial prefrontal areas, the posterior cingulate and the precuneus, lateral inferior parietal cortex and the anterior aspect of infero-temporal cortex.
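
For intuition, this kind of data-driven subdivision can be caricatured as clustering voxel time courses by their correlation structure. The sketch below uses synthetic “extrinsic” (stimulus-driven) and “intrinsic” (internally fluctuating) signals and a plain two-cluster k-means, both of which are illustrative stand-ins for the study's actual data and methods.

# Toy sketch of "hypothesis-free", data-driven clustering of voxel time courses into
# two systems, in the spirit of [27]; the synthetic signals and simple two-cluster
# k-means are illustrative stand-ins for the actual analysis.

import numpy as np

rng = np.random.default_rng(7)
n_timepoints = 300
extrinsic_signal = np.sin(np.linspace(0, 20, n_timepoints))             # stimulus-driven
intrinsic_signal = np.cumsum(rng.standard_normal(n_timepoints)) * 0.1   # internally driven

# 200 voxels: half follow the extrinsic signal, half the intrinsic fluctuation.
voxels = np.vstack([extrinsic_signal + 0.5 * rng.standard_normal((100, n_timepoints)),
                    intrinsic_signal + 0.5 * rng.standard_normal((100, n_timepoints))])
x = (voxels - voxels.mean(1, keepdims=True)) / voxels.std(1, keepdims=True)

# Two-cluster k-means with a simple farthest-point initialization.
centers = np.stack([x[0], x[((x - x[0]) ** 2).sum(1).argmax()]])
for _ in range(20):
    labels = np.argmin([((x - c) ** 2).sum(1) for c in centers], axis=0)
    centers = np.array([x[labels == k].mean(0) for k in (0, 1)])

print(labels[:100].mean(), labels[100:].mean())   # the two voxel populations separate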

 

FMRI primarily resolves the functional structure of brain activity, but it is also useful to know the anatomical structure and to correlate the two. Diffusion MRI (dMRI) maps the diffusion process of molecules (such as water) in brain tissue, and diffusion tensor imaging (DTI) images the directional diffusion in structurally aligned tissue. DMRI and DTI have been used to reveal abnormalities in white matter fiber structure and to provide maps of brain connectivity. The ability to visualize anatomical-structural connections between different parts of the brain using dMRI and DTI can be combined with fMRI and diffusional fMRI (DfMRI) to map the human connectome on a regional-functional (macro-scale) level. This grand task will allow the mapping of data to models that relate anatomical-structural connectivity to functional connectivity: the complete picture is the end goal.

 

In clinical use, fMRI can help identify the effects of tumors, stroke, head and brain injury, or diseases such as Alzheimer's on normal functional regions. Clinical use is complex, as tumors and lesions can change blood flow in ways not related to neural activity, and drugs can affect HDR. These complications, along with limitations on spatiotemporal resolution, motivate the combined use of fMRI with other imaging modalities such as EEG/MEG, near-infrared spectroscopy NIRS and transcranial magnetic stimulation (TMS) for both clinical and research use.

 

EEG can be used simultaneously with fMRI for higher spatiotemporal resolution, but there are numerous technical difficulties, including normalizing each signal to the other so that they can be correlated with similar neural activity, and removing any cross-modal interference effects in the data. EEG can be used simultaneously with NIRS without major technical difficulties: there is no influence of these modalities on each other, and a combined measurement can give useful information about electrical activity as well as local hemodynamics. NIRS measures the changes in blood hemoglobin concentrations associated with neural activity, but primarily at the cortical level; fMRI measures activation throughout the brain. NIRS offers much greater portability, and the possibility of wireless operation, compared to bulky, expensive fMRI technology.

 

Honda and ATR have led a team to develop nBMI technology enabling control of a robot by human thought [10] using EEG and NIRS. Statistical signal processing is used to identify unique brain activity related to thoughts of physical motion (likely using fused signal detection algorithms). Accuracy of over 90% was achieved with this system in 2009.
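
Details of the Honda/ATR system are not public at the algorithm level, but the general idea of fusing EEG and NIRS features to classify motion-related thoughts can be sketched as below; the synthetic features, class structure and nearest-centroid classifier are hypothetical stand-ins, not the published method.

# Toy sketch of fusing EEG and NIRS features for a binary "imagined motion" classifier,
# loosely inspired by the Honda/ATR system [10]; the features, data and the simple
# nearest-centroid classifier are illustrative stand-ins, not the published method.

import numpy as np

rng = np.random.default_rng(6)
n_trials = 200
eeg_power = rng.standard_normal((n_trials, 8))        # e.g., band power at 8 electrodes
nirs_hb = rng.standard_normal((n_trials, 4))          # e.g., hemoglobin change at 4 optodes
labels = rng.integers(0, 2, n_trials)                 # 0 = rest, 1 = imagined hand motion

# Inject a class-dependent shift so each modality carries some information.
eeg_power[labels == 1, 0] += 1.0
nirs_hb[labels == 1, 0] += 1.0

features = np.hstack([eeg_power, nirs_hb])            # simple feature-level fusion
train, test = slice(0, 150), slice(150, None)

centroids = [features[train][labels[train] == c].mean(axis=0) for c in (0, 1)]
dists = np.stack([np.linalg.norm(features[test] - c, axis=1) for c in centroids])
pred = dists.argmin(axis=0)
print((pred == labels[test]).mean())                  # fused-feature classification accuracy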

 

While noninvasive imaging techniques resolve the functional dynamics of ensembles of thousands to millions of neurons and neuronal-synaptic activity at the macro-scale, spatiotemporal resolution of cellular and micro-scale neuronal-synaptic events will likely require invasive technologies that offer a more direct interface. Likewise, EEG/NIRS is limited in its use for limb prostheses, as the measured signals represent the average electrical and hemodynamic activity of broad populations of neurons, whereas extraction of finer variations in motor signaling is required to control precise arm and hand movements.

 

Invasive BMIs (iBMIs)

 

Most early research using invasive implants/iBMIs has been done on non-human mammals, in particular rats, cats and monkeys (owl and macaque), listed in order of increasing brain size and complexity. The macaque offers the most promising research subject, given the relative complexity and deeply furrowed, convoluted structure of its neocortex-old brain (resembling that of humans).

 

Reconstruction of visual imagery in cats using direct implants in the cat thalamus (targeting 177 brain cells in the lateral geniculate nucleus area, which decodes signals from the retina) preceded [11] the much later noninvasive work on humans using fMRI discussed above, which concentrated on activity in the human visual cortex.

 

Nicolelis’ work on training rats and monkeys to control levers and robotic arms with measured motor cortex signaling from neural ensembles using microwire arrays, combined with modeling of how that signaling would control a lever or arm, is a highly worthy read [12] and contains a review of prior work. The revelation was that the activity of large neural ensembles could be used to predict arm position. This same review showed a sketch of how a human motor prosthetic iBMI system might be constructed based on the macaque research and using advances in neuromorphic chips. More recent work focused on functional iBMIs that exhibit closed-loop feedback, resolve signals from smaller neuronal ensembles, enable finer motor control using cross-modal visual tracking, and that measure premovement activity based on anticipation. See also a 2006 review by Lebedev and Nicolelis [13].
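
The central computational idea here, predicting arm position from the activity of a neural ensemble, can be sketched with a simple linear decoder fit to binned firing rates. The synthetic spike counts and ordinary least-squares fit below are stand-ins for the real recordings and the Wiener-filter-style decoders actually used in this line of work.

# Minimal sketch of the central idea in [12]: learn a linear mapping from binned
# ensemble firing rates to arm position. Synthetic data and ordinary least squares
# are stand-ins for the actual recordings and decoders (e.g., Wiener filters).

import numpy as np

rng = np.random.default_rng(3)
n_bins, n_neurons = 2000, 60
true_weights = rng.standard_normal((n_neurons, 2))      # hidden rate-to-position tuning

rates = rng.poisson(5.0, size=(n_bins, n_neurons)).astype(float)   # spike counts per bin
arm_xy = rates @ true_weights + rng.standard_normal((n_bins, 2))   # "recorded" positions

# Fit the decoder on the first half of the session, evaluate on the second half.
train, test = slice(0, 1000), slice(1000, 2000)
W, *_ = np.linalg.lstsq(rates[train], arm_xy[train], rcond=None)
predicted = rates[test] @ W

corr_x = np.corrcoef(predicted[:, 0], arm_xy[test, 0])[0, 1]
print(round(corr_x, 3))    # high correlation: position is readable from the ensemble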

 

Nicolelis’ groundbreaking work has focused on measuring, modeling and decoding neural ensembles, specifically in real-time for practical BMI application. It is useful to compare the iBMI approaches to real-time decoding with the measurement, modeling and decoding experiments using nBMIs, particularly fMRI and cross-modal techniques, as discussed earlier. Invasive techniques have a clear advantage in terms of getting closer to the real-time signal sources. Berger’s work (below) on spike train neural decoding is also a key area of iBMI research.

 

Since Nicolelis’ work and that of others in the mid-2000s, there have been several noteworthy animal research efforts on iBMIs for motor prosthetic control. Velliste et al. [14] demonstrated the use of cortical signals to control a multi-jointed prosthetic device for direct real-time interaction with the physical environment (‘embodiment’), specifically applied to macaque monkeys using their motor cortical activity to control a mechanized arm replica in a self-feeding task. This was a first demonstration of multi-degree-of-freedom embodied prosthetic control in which physical object interaction was included (and as the authors pointed out, cannot be fully simulated). Wang et al. [15] demonstrated that both hand translation and hand rotation can be decoded simultaneously from a population of motor cortical neurons in the proximal arm area of primary motor cortex (M1). Moritz et al. [16] were the first to demonstrate that direct artificial connections between cortical cells and muscles can compensate for interrupted physiological pathways and restore volitional control of movement to paralysed limbs in macaque monkeys.

 

Perhaps the most interesting animal iBMI research to date is that of Berger et al., on a cortical prosthesis for restoring and enhancing memory in rats [17] (i.e., an artificial hippocampus). After using pharmacological agents to disable the rats’ neural circuitry between two subregions of the hippocampus, CA1 and CA3, which interact to create long-term memory, the team demonstrated that a prosthesis could duplicate the pattern of CA3-CA1 interactions by monitoring the neural spikes in these cells with an electrode array, and then playing back the same pattern on the same array, thereby restoring long-term memory (in this case the memory to pull a particular lever for reward). Specifically, their system monitored the input pattern to the hippocampus during the information encoding phase of the task, accurately predicted the associated hippocampal output pattern and the degree of success of such encoding using a “multi-input, multi-output” (MIMO) nonlinear model, and delivered stimulation with electrical pulses during the same phase of the task in a pattern that conforms to the normal firing of the hippocampal output region on successful trials. Their system and model were also used to detect and enhance weak signals in mnemonic encoding processing in normally functioning rats (without pharmacological disruption). The team plans to port this work to monkey subjects, which have a more complex hippocampus and likely more complex signaling patterns for encoding memory. The question is how generalizable encoding models will be for increasingly complicated neural architectures and tasks. The challenge does indeed seem to lie with having “sufficient information about the neural coding of memories.”
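
Berger's MIMO model is a nonlinear, Volterra-type system well beyond a short snippet, but the basic input-output framing, predicting each CA1 output channel from a recent window of CA3 input spiking, can be sketched with per-channel logistic regression on synthetic spike counts; everything below (data, lags, learning rule) is a simplified, hypothetical stand-in.

# Highly simplified sketch of the MIMO idea in [17]: predict each CA1 output channel
# from a short history window of CA3 input spike counts. Logistic regression is a crude
# linear-nonlinear stand-in for the published Volterra-type MIMO model; the spike data
# here are synthetic.

import numpy as np

rng = np.random.default_rng(4)
n_bins, n_in, n_out, lag = 5000, 16, 8, 5

ca3 = rng.poisson(0.3, size=(n_bins, n_in))                  # binned CA3 input spikes

# Lagged design matrix: each row holds the last `lag` bins of all input channels.
X = np.hstack([np.roll(ca3, k, axis=0) for k in range(lag)])[lag:]
true_W = rng.standard_normal((X.shape[1], n_out)) * 0.3
ca1 = (rng.random((X.shape[0], n_out)) < 1 / (1 + np.exp(-(X @ true_W - 2)))).astype(float)

# Fit one logistic model per output channel by batch gradient descent.
W = np.zeros((X.shape[1], n_out))
for _ in range(300):
    p = 1 / (1 + np.exp(-(X @ W)))
    W += 1e-4 * X.T @ (ca1 - p)

pred = 1 / (1 + np.exp(-(X @ W))) > 0.5
print((pred == ca1).mean())      # fraction of output bins predicted correctly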

 

IBMI research on humans has been slow and localized to either epileptic patients or patients suffering from stroke, paralysis, locked-in syndrome or complete vision and hearing loss. (Research on healthy subjects does indeed present ethical considerations, though I wonder whether there are some out there freely willing to sacrifice for the advance of the cause.)

 

In the last decade or so, iBMI human implants have demonstrated very limited motor control (via electrocorticographic (ECoG) devices implanted near functional-specific areas of the motor cortex, see e.g., [18-20]) and more trials are planned (using both ECoG and single neuron devices, see e.g., [21]).

 

Impediments to iBMI advances for neuroprosthetics, or even for real-time imaging of neuronal-synaptic activity, are largely twofold. The first impediment is designing an implant that provides a high-quality signal while remaining robust against scar-tissue buildup; this is especially true for single neuron or local field potential devices that are inserted between layers of the cortex (intraparenchymal), but it also applies to electrocorticographic (ECoG) devices implanted epidurally or subdurally [22]. This is a bioengineering, materials science, and electrical and mechanical engineering issue.

 

The second impediment is knowing precisely where to implant the iBMI and what to measure (and then stimulate, depending on whether the application is “neuroplastic” rehabilitation [23]). This is a neuroscience issue, and one that I believe is the grand challenge, as we still have very limited understanding of the where and how of sensory and motor representations in the brain, let alone speech and language representations. There is also the outstanding question of the hierarchical distribution of these representations, between merely sensed information or primitive motor commands and higher-order “invariant” representations, with short- and long-term memory linked into the overall picture. A complete picture requires understanding how information is processed and represented (“encoded”) between cortical layers and between functional areas of the cortex, with other key regions included, such as the thalamus and hippocampus. This is a puzzle of distributed, and likely nested, hierarchies. An iBMI that reads out actual higher-level thoughts, as opposed to lower-level sensed information and motor and speech commands, may need to sit at different loci in the brain. (This goes back to some of the comments and quotes in the first essay, Part 1.) Likewise, for neuroplastic rehabilitative purposes, knowing where and how to stimulate to restore or enhance function has its own technical considerations [23].

 

Research within just the last year or two on epileptic patients using ECoG devices, and on in vivo/live tissue using single neuron/local field devices, has yielded advances against the first impediment.

 

A diverse team of engineering, materials science and neuroscience/neurosurgery researchers led by NYU Polytech and UPenn recently developed a high-density, flexible electrode array that integrates 750 ultrathin (260 nm), flexible silicon nanomembrane transistors acting as amplified and multiplexed sensors to measure ECoG signal patterns of activity before and during an epileptic seizure at a very fine scale, with broad coverage of the brain [24]. The array conforms to the brain’s complex shape, even reaching into grooves that are inaccessible to conventional arrays. With further engineering, the array could be rolled into a tube and inserted onto the brain through a small hole, rather than through an opening in the skull. The flexible array has been tested on cats in a variety of clinical conditions (viewing objects, sleeping, anesthetized, drug-induced seizure states). Researchers measured recurrent spiral waves that propagate in the neocortex in correlation with seizures; such waves are similar to those that ripple through the cardiac muscle during ventricular fibrillation, opening the possibility that the same iBMI device might be used to both monitor and treat epileptic seizures like a pacemaker, an advantage for patients who do not respond completely to drugs for epileptic seizure control.

 

A group from Harvard investigating single neuron devices developed silicon nanowire field-effect transistor (NWFET) arrays to map neural circuits in acute brain slices [25], obtaining higher spatiotemporal resolution (sub-ms, <=30 µm) than previous microcircuit devices and other imaging techniques. The high input impedance of NWFETs circumvents a common problem of implanted microelectrodes, where post-implant increases in impedance (e.g., from absorption of proteins) lead to degraded signal quality and higher noise levels. The small device feature size (<=0.25 µm) allows multiplexed detectors to be integrated onto ultrasmall probes for minimal tissue damage; bottom-up fabrication makes it possible to choose biocompatible materials as substrates to reduce mechanical mismatch and to minimize reactive tissue response; and the nanoscale morphology could promote better attachment of active neurons, leading to better signal quality than planar designs. These advantages make this technology suitable for high signal yield, chronic in vivo recordings (interfacing with live tissue/subjects). The subcellular (<=10 µm) resolution observed suggests that the nanoscale, nanostructured detectors, which are comparable in size to synaptic connections and can promote tissue-device interactions, provide the direct capability for high-resolution signal mapping and stimulation, modalities required both to map neural-synaptic architecture and to allow for direct communication interfaces.

 

As expected, research on the second impediment to iBMIs, knowing where to implant and what to measure (or stimulate), is proceeding at a much slower pace. Research has concentrated on specific functional areas of the cortex, measuring “early stage” sensory information (what I termed lower-level sensory information, following a hierarchical paradigm), but also, more recently, “later stage” information based on activity in intermediate and anterior areas (and perhaps higher-level representations).

 

One very recent piece of work that got quite a bit of media attention involved invasive measurements on 15 epileptic patients, measuring signals from the nonprimary auditory cortex in the superior temporal gyrus (STG) to determine what acoustic information in speech sounds can be reconstructed from population neural activity [26]. The posterior STG (pSTG) is thought to play a critical role in transforming acoustic information into phonetic and pre-lexical representations, part of the stream of intermediate or higher-order processing. Prior work on early stage processing, from the cochlea to the primary auditory cortex (A1), yielded accurate representation of the spectro-temporal properties of the sound waveform, including those acoustic cues relevant for speech perception, such as formants, formant transitions, and syllable rate. In this work on the intermediate and later stage processing, measured signals were in the form of cortical surface potentials, or ECoG signals, recorded from non-penetrating multi-electrode arrays placed over the lateral temporal cortex, including the pSTG. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations, revealing neural encoding mechanisms of speech acoustic parameters in the STG.
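
The stimulus-reconstruction framing used in work like [26], learning a mapping from multi-electrode cortical activity back to the speech spectrogram, can be sketched with ridge regression on synthetic data; the feature definitions, dimensions and regularization here are illustrative assumptions rather than the published model.

# Minimal sketch of linear stimulus reconstruction as used in speech-decoding work like
# [26]: learn a mapping from multi-electrode ECoG features back to the speech spectrogram.
# Synthetic data and ridge regression are stand-ins for the real recordings and the
# published reconstruction models.

import numpy as np

rng = np.random.default_rng(5)
n_frames, n_electrodes, n_freq_bands = 3000, 64, 32

spectrogram = np.abs(rng.standard_normal((n_frames, n_freq_bands)))        # "speech" features
mixing = rng.standard_normal((n_freq_bands, n_electrodes)) * 0.2
ecog = spectrogram @ mixing + 0.5 * rng.standard_normal((n_frames, n_electrodes))

# Ridge regression from ECoG back to each spectrogram band (train/test split).
train, test = slice(0, 2000), slice(2000, None)
lam = 10.0
A = ecog[train].T @ ecog[train] + lam * np.eye(n_electrodes)
B = ecog[train].T @ spectrogram[train]
W = np.linalg.solve(A, B)

reconstruction = ecog[test] @ W
corr = np.corrcoef(reconstruction[:, 0], spectrogram[test, 0])[0, 1]
print(round(corr, 3))      # correlation between reconstructed and actual band energy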

 

It is interesting to note that when this research was published, the mainstream media immediately popularized it as “mind reading” or “thought reading,” when in fact this is really decoding of early and intermediate stage (auditory) information that is sensed/processed in the brain. As one of the research authors stated in an interview: “This research is not mind-reading or thought-reading — we can decode the sounds people are actually listening to but we cannot tell what they are thinking to themselves. Some evidence suggests that the same brain regions activate when we listen to sounds and when we imagine sounds, but we don't yet have a good understanding of how similar those two situations really are.”

 

Indeed, we have miles to go to understand the complete but complex puzzle, but this research is representative among the first fine steps.

 

These advances in BMI technologies represent revolutionary beginnings for those of us who only dreamed of such devices a few short years ago. BMIs that replace lost neural sensory/motor function (vision, hearing, mobility, etc.), add novel prosthetic sensors or bionic/artificial limbs with direct neural control, or augment human memory and processing capability are finally entering the realm of possibility.

 

“To ‘measure’ thoughts, it is not enough to simply record which nerve cells are firing together and which electrochemical processes accompany this event. One also has to know what these cells represent for that individual – say whether the neural impulses in the hippocampus stand for a pleasant or unpleasant past experience with cappuccino. This memory-signal recognition requires registering the activities of millions of individual nerve cells, and such imaging is not yet possible even with the most advanced visual technologies and invasive means of measuring brain activity.” [6]

 

An example list of BMI research groups (by no means complete; it will be revised periodically):

 

BMI research groups: Nicolelis/Duke, Berger/USC, Schwartz/UPittCNBC/CMU, Viventi/NYU-Poly (see also a good list of pubs and talks HERE), Litt/UPenn, Lucas/PennMed, ATR, Gallant/Berkeley, CNEL/UFla, (a generic international list is located HERE)

 

References and Endnotes:

 

[1] “Thinking, Fast and Slow,” Daniel Kahneman, Farrar, Straus and Giroux, 2011. (Humans are not rational agents, in particular when solving elementary problems that involve uncertain reasoning; see the list of cognitive biases for several examples.)

[2] “Architectural Roles of Affect and How to Evaluate Them in Artificial Agents,” M. Scheutz, Tufts, 2011.

[3] “Tartan Racing: A Multi-Modal Approach to the DARPA Urban Challenge,” C. Urmson et al., Carnegie-Mellon, 2007.

[4] “Stochastic Self-Reconfigurable Cellular Robotics,” P.J. White et al., IEEE International Conference on Robotics and Automation, 2004.

[5] “Kinematic Self-Replicating Machines,” Robert A. Freitas Jr. and Ralph C. Merkle, Landes Bioscience, 2004. See also “Self-Replication and Nanotechnology,” R. Merkle/Zyvex. R. Merkle’s nanotechnology page at Zyvex also contains quite a few interesting links and tutorials on the subject matter.

[6] “Thinking Out Loud,” N. Neumann and N. Birbaumer, Sci. Am. Mind, vol 14 (5), Dec. 2004. (Also reprinted in “The Best of the Brain,” ed. F. Bloom, Sci Am. Books, 2007.)

[7] “Visual Image Reconstruction from Human Brain Activity using a Combination of Multiscale Local Image Decoders,” Yoichi Miyawaki et al., Neuron, vol 60, p.915, Dec. 2008.

[8] “Bayesian Reconstruction of Natural Images from Human Brain Activity,” T. Naselaris et al., Neuron vol. 63, p.902, Sept. 2009. Readers may also appreciate a review from this group on “Encoding and Decoding in fMRI,” available from the same link.

[9] “Reconstructing Visual Experiences from Brain Activity Evoked by Natural Movies,” S. Nishimoto et al., Current Biology, vol. 21, p.1, Oct. 2011.

[10] “Honda, ATR and Shimadzu Jointly Develop Brain-Machine Interface Technology Enabling Control of a Robot by Human Thought Alone,” Honda Press Release, Mar. 2009.

[11] “Reconstruction of natural scenes from ensemble responses in the lateral geniculate nucleus,” G.B. Stanley et al., Journal of Neuroscience, vol. 19 (18) p.8036, Sept. 1999.

[12] “Controlling Robots with the Mind,” M. Nicolelis and J.K. Chapin, Sci. Am., vol. 287 (4), Oct. 2002. (Also reprinted in “The Best of the Brain,” ed. F. Bloom, Sci Am. Books, 2007.)

[13] “Brain-machine interfaces: past, present and future,” M.A. Lebedev and M.A. Nicolelis, Trends in Neurosciences, vol. 29 (9), p.536, Sept. 2006.

[14] “Cortical control of a prosthetic arm for self-feeding,” M. Velliste et al., Nature vol. 453, p.1098, Jun. 2008.

[15] “Motor Cortical Representation of Hand Translation and Rotation during Reaching,” W. Wang et al., Journal of Neuroscience, vol. 30(3), p.958, Jan. 2010.

[16] “Direct control of paralysed muscles by cortical neurons,” Moritz et al., Nature, vol. 456, p.639, Dec. 2008.

[17] “A cortical neural prosthesis for restoring and enhancing memory,” T.W. Berger et al., J. Neural Eng., vol. 8, Jun. 2011.

[18] “Electrocorticographically controlled brain–computer interfaces using motor and sensory imagery in patients with temporary subdural electrode implants: Report of four cases,” E.A. Felton et al., J. Neurosurg, vol. 106, p.495, 2007.

[19] “Two-dimensional movement control using electrocorticographic signals in humans,” G. Schalk et al., J. Neural Eng., vol. 5, p.75, 2008.

[20] “Cortical activity during motor execution, motor imagery, and imagery-based online feedback,” K.J. Miller et al., PNAS, vol. 107, p.4430, Mar. 2010.

[21] “Human Trials Planned For Brain Computer Interface,” M. Peck, IEEE Spectrum, Apr. 2011. Note this is the same group that conducted the research in [14].

[22] “Evolution of brain-computer interfaces: going beyond classic motor physiology,” E.C. Leuthardt et al., Neurosurg Focus 27 (1), 2009.

[23] “Neural Interface Technology for Rehabilitation: Exploiting and Promoting Neuroplasticity,” Wang et al., Phys. Med. Rehabil. Clin. N. Am, vol. 21, p. 157, 2010.

[24] “Flexible, foldable, actively multiplexed, high-density electrode array for mapping brain activity in vivo,” J. Viventi et al., Nature Neuroscience, vol. 14, p.1599, Nov. 2011. See also “Ultrathin, flexible brain implant offers better look at seizures.”

[25] “Nanowire transistor arrays for mapping neural circuits in acute brain slices,” Q. Qing et al., PNAS, vol. 107(5), p.1882, Feb. 2010.

[26] “Reconstructing Speech from Human Auditory Cortex,” B. Pasley et al., PLoS Biology, vol. 10 (1), p.1, Jan. 2012.

[27] “Data-driven clustering reveals a fundamental subdivision of the human cortex into two global systems,” Y. Golland et al., Neuropsychologia vol. 46, p.540, 2008.
