
The First Stages of Visual Information Processing: A Neural Model


DESCRIPTION

The first stages of visual information processing: a neural model. Avi Libster and Yehuda Brody, supervised by Dr. Oren Shriki. Topics: the first stages of visual information processing; from the retina to the cortex; the thalamus; the primary visual cortex (V1).



  • ICA: Independent Component Analysis

    A method for recovering N statistically independent source signals from observed mixtures of them.

  • ICA continued: model assumptions

    For observed mixtures X1, X2, ...: 1. the sources are statistically independent; 2. at most one source is Gaussian; 3. the mixing is linear (an invertible mixing matrix); 4. the sources are normalized to N(0,1).


  • Solving Problems Using ICA

  • Step 1: Deciding who the sources are and how to derive the output

  • In our case, each neuron in the input layer is a pixel in the picture. Each neuron in the output layer receives all the pixels of the picture.

  • Overcomplete representation: input from the retina to the visual cortex. S can be viewed as the independent sources that cause the response on the retina; X is that response, and S is a reconstruction of the sources that caused it.

  • Natural scene; 12 × 12 patches

  • Column j holds all of the pixel values of the j-th patch; the number of columns equals the number of patches, i.e. 15,000 in our case. Row i contains the value of pixel i in each patch; the number of rows equals the patch size, i.e. 144 in our case.
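  • A minimal NumPy sketch of this patch-matrix construction (the original simulation was written in MATLAB; the image here and all names are illustrative assumptions):

    import numpy as np

    def build_patch_matrix(image, patch_size=12, n_patches=15000, seed=0):
        # Sample random patch_size x patch_size patches from a grayscale image
        # and stack each one as a column of a (patch_size**2, n_patches) matrix.
        rng = np.random.default_rng(seed)
        H, W = image.shape
        X = np.empty((patch_size ** 2, n_patches))
        for j in range(n_patches):
            r = rng.integers(0, H - patch_size + 1)
            c = rng.integers(0, W - patch_size + 1)
            X[:, j] = image[r:r + patch_size, c:c + patch_size].ravel()
        return X

    scene = np.random.rand(512, 512)   # stand-in for a natural-scene image
    X = build_patch_matrix(scene)
    print(X.shape)                     # (144, 15000): rows = pixels, columns = patches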

  • Deriving the output: the function s_i(k) = g(Σ_j W_ij · x_j(k)) calculates the output of neuron i when presented with patch k.

  • The input-output function of each neuron is a non-linear squashing function, compressing the output, over the whole range of input values, into a biologically plausible behavior.

  • Column j of W holds the M weights (one per output neuron) leaving input neuron j; row i holds the N weights (one per input neuron) onto output neuron i.
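  • A sketch of this feed-forward computation under the definitions above (the use of tanh as the squashing function is an assumption; the slides do not name the non-linearity):

    import numpy as np

    def layer_output(W, X, g=np.tanh):
        # W: (M, N) weights -- row i holds the N weights onto output neuron i,
        #    column j holds the M weights leaving input neuron j.
        # X: (N, K) patch matrix -- column k holds the pixels of patch k.
        # Returns S of shape (M, K) with S[i, k] = g(sum_j W[i, j] * X[j, k]).
        return g(W @ X)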

  • Overview so far

  • Step 2: Preprocessing the data

  • Why do we preprocess the data? To adhere to the conditions of the central limit theorem, if not fully then at least in part, and to reduce the dimensionality, which serves two purposes: 1. the computation becomes easier; 2. we can decide the network's input and output sizes.

  • Methods of preprocessing: centering (subtracting the mean from the data) and PCA.

    With PCA at least one condition is met, and calculating the correlation matrix between the pixels also becomes easier.

  • After preprocessing, the data is called whitened: the mean of the data is zero, the pixels are uncorrelated, and we have reduced the data dimension from 144 to 100. In order to reconstruct the filters later, some bookkeeping of this transformation is kept.
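  • A minimal sketch of this preprocessing, matching the numbers above (centering, PCA whitening, reduction from 144 to 100 dimensions; variable names are illustrative):

    import numpy as np

    def center_and_whiten(X, n_components=100):
        # Center the patch matrix X (144, K), then PCA-whiten and reduce to
        # n_components dimensions. Returns the whitened data Z (n_components, K)
        # and the whitening matrix V, kept so the learned filters can be
        # mapped back to pixel space later.
        Xc = X - X.mean(axis=1, keepdims=True)       # zero-mean pixels
        C = Xc @ Xc.T / Xc.shape[1]                  # pixel covariance (144, 144)
        eigvals, eigvecs = np.linalg.eigh(C)         # ascending eigenvalues
        idx = np.argsort(eigvals)[::-1][:n_components]
        E, D = eigvecs[:, idx], eigvals[idx]
        V = np.diag(1.0 / np.sqrt(D)) @ E.T          # whitening matrix (100, 144)
        Z = V @ Xc                                   # whitened: identity covariance
        return Z, V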

  • Step 3: The cost function and the learning rule

  • ICA again: ICA can be implemented in several ways. The main difference between them is the method used to estimate how close the output distribution is to normal.

  • The Infomax approach: the purpose of the learning process is to improve the information representation in the output layer.

    Using mathematical methods taken from information theory, information can be quantified, and algorithms aimed at improving the network's information representation, by changing the weights, can be developed. We assume that the brain uses similar methods to better represent information.

  • Three important information-theory quantities:

    Information: defined as -log(p(x)); the rarer the appearance of a given value, the more information it carries.

    Entropy: the mean information of a given random variable.

    Mutual information: the amount of uncertainty about a given variable Y that is resolved by observing X.
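  • In standard notation, the three definitions above read (a restatement for reference, not copied from the slides):

    \[
      I(x) = -\log p(x), \qquad
      H(X) = \mathbb{E}\,[I(X)] = -\sum_x p(x)\log p(x),
    \]
    \[
      I(X;Y) = H(Y) - H(Y \mid X).
    \]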

  • Infomax basic assumptions (assumption / intuition):

    Minimize mutual information between the neurons of the output layer / the activity of one output neuron shouldn't give information about the activity of the other neurons.

    Maximize mutual information between the input and the output / a different reaction in the output layer for each input pattern, and consistency for the same pattern.

    The noise level is fixed, so its effect on the entropy of the output layer is constant / only the entropy of the output affects the total MI between the layers.

  • Mutual information between the I/O layers: the mutual information between the input and the output layers depends only on the entropy of the output layer, because s's value is a function of x.

    We want to maximize H(s).
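  • Stated in formulas (a restatement of the argument above, not copied from the slides):

    \[
      I(X;S) = H(S) - H(S \mid X),
    \]

    and since the noise level is fixed, H(S | X) is just the entropy of the noise, a constant, so maximizing I(X;S) reduces to maximizing H(S).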

  • Mutual information in the output layer: from the equation below (although written for discrete values) we can see that if the neurons are statistically independent, the log in the expression becomes zero and the mutual information is zero.

    We want to minimize the MI in the output layer.
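  • The discrete-valued equation the slide refers to did not survive extraction; its standard form is:

    \[
      I(s_1,\dots,s_M) = \sum_{\mathbf{s}} p(s_1,\dots,s_M)\,
      \log \frac{p(s_1,\dots,s_M)}{\prod_i p(s_i)},
    \]

    so when the output neurons are statistically independent the ratio inside the log equals one, the log vanishes, and the mutual information is zero.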

  • Estimating the output distribution: after long and painful math we derive the expression shown below as the estimation of s's distribution. Chi (χ) is called the susceptibility matrix.
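  • The expression itself is missing from the transcript; a plausible reconstruction, from the change-of-variables rule this kind of derivation relies on (an assumption, not the slide's exact formula), with χ the input-output Jacobian:

    \[
      \chi_{ij} = \frac{\partial s_i}{\partial x_j}, \qquad
      p_S(\mathbf{s}) \approx \frac{p_X(\mathbf{x})}{\lvert \det \chi \rvert}.
    \]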

  • Estimating the entropy of the output: because we don't use the explicit equation of s's entropy, and for the same reason as previously mentioned, we substitute the estimation of P(s) and solve the integral; H(x) is treated as zero or constant.

  • The cost function:

    A sum, over the inputs, of the changes in s_i squared. The minus sign makes the error value decrease as the value on the right increases.

  • Geometrical interpretation of the cost function: geometrically, the target is to maximize the volume change of the transformation; this improves discrimination and increases the mutual information between the input and the output.
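  • A hedged reconstruction of this objective in the Bell and Sejnowski infomax form, consistent with the volume-change reading above (not copied from the slides):

    \[
      H(\mathbf{s}) = H(\mathbf{x}) + \mathbb{E}\,[\log \lvert \det \chi \rvert],
      \qquad
      E = -\,\mathbb{E}\,[\log \lvert \det \chi \rvert].
    \]

    H(x) is fixed by the data, so maximizing the output entropy H(s) is the same as minimizing E; |det χ| measures the local volume change of the input-to-output transformation.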

  • Learning rules: using the gradient-descent method we define learning rules, i.e., how to change W in response to a given set of outputs: ΔW = -η ∂E/∂W, where η is the rate of learning and ∂E/∂W are the derivatives of the cost function with respect to W.
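  • A minimal sketch of one such rule, assuming the classic Bell and Sejnowski natural-gradient update with a logistic squashing function (the slides do not say which variant was implemented):

    import numpy as np

    def infomax_ica(Z, n_iter=200, batch=256, lr=1e-3, seed=0):
        # Bell-Sejnowski infomax ICA, natural-gradient form:
        #   dW = lr * (I + (1 - 2*y) u^T) W,  with u = W z, y = logistic(u).
        # Z: whitened data of shape (N, K). Returns the (N, N) weight matrix W.
        rng = np.random.default_rng(seed)
        N, K = Z.shape
        W = np.eye(N)
        for _ in range(n_iter):
            cols = rng.integers(0, K, size=batch)
            U = W @ Z[:, cols]                    # (N, batch) summed inputs
            Y = 1.0 / (1.0 + np.exp(-U))          # logistic squashing
            W += lr * ((batch * np.eye(N) + (1.0 - 2.0 * Y) @ U.T) @ W) / batch
        return W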

  • Step 4: Writing the simulation in MATLAB and getting the results
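  • Reusing the functions sketched earlier in this section, an illustrative end-to-end pipeline (the original was a MATLAB simulation; this stand-in only shows how the pieces fit together):

    scene = np.random.rand(512, 512)      # stand-in for a natural scene
    X = build_patch_matrix(scene)         # (144, 15000) patch matrix
    Z, V = center_and_whiten(X)           # (100, 15000) plus whitening matrix
    W = infomax_ica(Z)                    # (100, 100) learned weights
    filters = W @ V                       # filters mapped back to pixel space
    print(filters.shape)                  # (100, 144): one 12 x 12 filter per row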


  • [Figure: histogram of the learned filters' preferred orientations; axes: angle (rad) vs. number of cells]


  • "When we will discover the laws underlying natural computation . . . we will finally understand the nature of computation itself." --J. von Neumann

  • " ...... ".--J. von Neumann

  • , , . : (), - . . , . " .. - , . , .(' ' ) . ( ) LGN(. V1 ; ( ) LGN(. V1 ;

    (1V) , . , . " .

    . . , , ( -), . , . : (ON center), (OFF center). -, .

    40 , .

    [OnStart -> forest image plus title] Our visual system has evolved to pick up relevant visual information quickly and efficiently, but with minimal assumptions about exactly what we see (since such assumptions could lead to frequent hallucinations!).

    [ENTER 2nd forest image] An important assumption that visual systems can make is that the statistical nature of natural scenes is fairly stable. How can a visual system, either natural or synthetic, extract maximum information from a visual scene most efficiently?

    [ENTER] Infomax ICA, developed under ONR funding by Bell and Sejnowski, does just this. Infomax ICA is a neural-network approach to blind signal processing that seeks to maximize the total information (in Shannon's sense) in its output channels, given its input. This is equivalent to minimizing the mutual information contained in pairs of outputs.

    Applied to image patches from natural scenes like these by Tony Bell and others, [ENTER] ICA derives maximally informative sets of visual patch filters that strongly resemble the receptive fields of primary visual neurons. [ENTER] V1