Neurocognitive Graph Theory
Joshua Vogelstein
Johns Hopkins University
March 3, 2009
Joshua Vogelstein (JHU) Neurocognitive Graph Theory March 3, 2009 1 / 16
Outline
1 Introduction
2 Theory
3 Applications
Introduction
Motivation
Brains are well represented by graphs
If we believe in “mind-brain supervenience,” then a person’s brain must contain all the information stored in that person’s memory
My previous work has been devoted to inferring graphs from neural activity
Thus, given such a graph for a particular human, we could, in principle, decode it to learn what she knows
This obviates the need to query people; instead, we could simply query the graph
This technology could even be employed posthumously
Preliminary Plans
Build a universally consistent classifier for neurocognitive graphs
Classify behaviorally distinct populations of Drosophila
Identify humans with highly developed skills in certain areas (like musical ability)
Theory
What is a graph?
Basic terminology
Nodes — neurons, voxels, etc.
Edges — synapses, axonal tracts, etc.
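A minimal sketch of this terminology in code: a graph stored as an edge list and its adjacency-matrix form (node and edge names here are purely illustrative, not from any real connectome):

```python
import numpy as np

nodes = ["n1", "n2", "n3"]            # e.g., neurons or voxels
edges = [("n1", "n2"), ("n2", "n3")]  # e.g., synapses or axonal tracts

idx = {v: i for i, v in enumerate(nodes)}
A = np.zeros((len(nodes), len(nodes)), dtype=int)
for u, v in edges:
    A[idx[u], idx[v]] = 1             # directed edge u -> v

print(A.tolist())  # [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
```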
A neurocognitive classifier
Formal definitions
The space of graphs is defined by {G = (V, E) : G ∈ G}
The space of actions is defined by {a : a ∈ A}
The space of stimuli is defined by {s : s ∈ S}
The decision policy of the agent is defined by δ : S × G → A
Our goal is to find a classifier, gs(·) = δ(s, ·), s.t. we can predict the action of an agent in response to any particular stimulus, s
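These definitions can be sketched in code (a toy model; the policy and all names are hypothetical): a graph is an adjacency matrix, a stimulus indexes a node, and δ picks the action corresponding to the most strongly connected target:

```python
import numpy as np

def delta(s, G):
    """Toy decision policy delta : S x G -> A.
    s : stimulus, here an integer indexing a node of G
    G : adjacency matrix (|V| x |V|)
    Returns the action/node most strongly driven by s."""
    return int(np.argmax(G[s]))

G = np.array([[0.0, 0.9, 0.1],
              [0.2, 0.0, 0.8],
              [0.5, 0.5, 0.0]])

# The classifier g_s(.) = delta(s, .) for a fixed stimulus s = 0
g_s = lambda G_: delta(0, G_)
print(g_s(G))  # stimulus 0 most strongly drives node 1
```

The point of the formalism is that, with s fixed, predicting the agent's action reduces to classifying its graph.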
Building a universally consistent classifier
The probability of an error is: L(g) = P(g(X) ≠ Y)
A Bayes optimal classifier satisfies: g∗ = argmin_g L(g)
A universally consistent classifier converges to g∗, given infinite data, i.e., g → g∗ as n → ∞
The k-nearest neighbor (kNN) classifier is universally consistent
but X is assumed to be in R^d
our X is a graph, and therefore in G
Thus, we must prove the existence of a universally consistent classifier on graphs
Note that this classifier could potentially be a very general tool for anybody classifying graphs
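The error probability L(g) is easy to estimate empirically; a minimal sketch with toy data and hypothetical classifiers:

```python
import numpy as np

def L(g, X, Y):
    """Empirical estimate of the error probability L(g) = P(g(X) != Y)."""
    return float(np.mean([g(x) != y for x, y in zip(X, Y)]))

# Toy data: the label is 1 iff x > 0, so the threshold rule is Bayes optimal here
X = np.array([-2.0, -1.0, 0.5, 1.5])
Y = np.array([0, 0, 1, 1])

g_bayes = lambda x: int(x > 0)
g_const = lambda x: 0            # constant classifier, for comparison

print(L(g_bayes, X, Y), L(g_const, X, Y))  # 0.0 0.5
```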
Universally consistent classifiers
k-nearest neighbor classifier
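The standard kNN construction can be sketched in a few lines (the data here is illustrative): find the k training points nearest the query in R^d and take a majority vote over their labels:

```python
import numpy as np
from collections import Counter

def knn_classify(x, X_train, y_train, k=3):
    """Classic kNN in R^d: majority vote among the k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(y_train[nearest].tolist()).most_common(1)[0][0]

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(knn_classify(np.array([0.05, 0.1]), X_train, y_train, k=3))  # 0
```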
Building a universally consistent classifier on graphs
Three potential ways forward
Define a generative model for graphs (randomly grown graphs)
Define the distribution of naïve (initial) graphs, P(G0)
Define the transition distribution for our graphs, P(Gt | Gt−1)
From this Markov process, we can compute the likelihood of transitioning from any one graph to any other
Define a mapping from graphs to strings, and compute the Hamming distance between the two strings
Uses the stationary distribution, π = Pπ, where P is the |V| × |V| matrix that defines the graph
This is very general, and could in theory apply to any graph
Project the graph onto a lower-dimensional subspace, count the number of edges in each dimension, and then use the typical kNN classifier in R^d
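The second and third approaches can be sketched directly (toy graphs; the degree-vector "projection" is only one of many possible choices):

```python
import numpy as np

def hamming(G1, G2):
    """Hamming distance between two graphs on a shared node set:
    the number of adjacency-matrix entries on which they disagree."""
    return int(np.sum(G1 != G2))

def project(G):
    """Toy projection into R^|V|: count the edges at each node (degree
    vector), so the standard kNN machinery in R^d applies directly."""
    return G.sum(axis=1)

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
B = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]])
print(hamming(A, B))        # 2 (one undirected edge differs, counted both ways)
print(project(A).tolist())  # [1, 2, 1]
```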
Applications
Application to a small tractable system
Drosophila olfactory behavior
Determine the distribution of naïve Drosophila, P(G0)
Train a group of Drosophila on a simple olfactory task, e.g., flap wings upon detecting a particular odor
Provide another (control) group with the same stimuli, but do not reward the appropriate behavior
Infer the graphs (or low-dimensional representations of the graphs) for the two populations
Given a new Drosophila, classify based on her k nearest neighbors
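Under the (strong) assumption that graphs can be inferred and embedded in R^d, the whole pipeline above can be sketched end to end; the random-graph model here is a purely illustrative stand-in for real fly connectomes:

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(3)

def random_graph(n, p):
    """Erdos-Renyi-style stand-in for an inferred connectome (illustrative)."""
    A = (rng.random((n, n)) < p).astype(int)
    np.fill_diagonal(A, 0)
    return A

def embed(A):
    """Degree vector: a point in R^n standing in for a graph embedding."""
    return A.sum(axis=1).astype(float)

# Two "populations": trained flies (denser toy graphs) vs. controls
X = [embed(random_graph(20, 0.4)) for _ in range(30)] + \
    [embed(random_graph(20, 0.2)) for _ in range(30)]
y = ["trained"] * 30 + ["control"] * 30

def knn(x, X, y, k=5):
    dists = [np.linalg.norm(x - xi) for xi in X]
    idx = np.argsort(dists)[:k]
    return Counter(y[i] for i in idx).most_common(1)[0][0]

# A new fly drawn from the "trained" model is classified by its neighbors
new_fly = embed(random_graph(20, 0.4))
print(knn(new_fly, X, y))
```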
Application to human beings
Classifying piano virtuosos
Determine the distribution of naïve children, P(G0)
Infer the graphs of adult piano virtuosos
Infer the graphs of adults with no formal musical training
Given a new human, classify based on her k nearest neighbors
More applications to human beings
Possible future queries
Where is Osama bin Laden hiding?
What is the combination?
How does this work?
Did you commit this crime?
Questions