Artificial Immune Systems
Dr Uwe Aickelin
A Recommender System based on the Immune Network
• The Recommendation Problem
• The AIS Approach
• Algorithm Walkthrough
• Results and Discussion
The Recommendation Problem “What movies would you predict/recommend?”
Prediction
What rating would I give this film?
Prediction quality can be assessed by absolute error
Recommendation
Give me a ‘top 10’ list of films I might like
Recommendation quality can be assessed by a ranking ‘discordance’ metric
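As a rough illustration of a ranking discordance measure, the sketch below counts concordant and discordant pairs and returns Kendall's tau (the tau-a variant, without tie correction). The function name `kendall_tau` and the choice of tau-a are assumptions made for this example; the slides do not say which variant was used.

```python
from itertools import combinations

def kendall_tau(actual, predicted):
    """Kendall's tau-a between two equal-length lists of ratings.

    A pair of items is concordant when both lists order it the same way
    and discordant when they order it in opposite ways. Ties count as neither.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(actual)), 2):
        sign = (actual[i] - actual[j]) * (predicted[i] - predicted[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(actual) * (len(actual) - 1) // 2
    return (concordant - discordant) / pairs if pairs else 0.0
```

A value of 1 means the predicted ranking agrees with the actual one on every pair, and -1 means it is fully reversed.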
The Biological Immune System
How do we protect the body against infection? (Antigens)
• Innate vs Acquired
• Cell Mediated vs Humoral
• T Cell (CD-4, Helper): Binds to MHC-antigen complex. Secretes cytokines to help…
• T Cell (CD-8, Killer): Kills cell (viruses)
• B Cell: Secretes antibody which binds to antigen and recruits phagocytes (innate)
The Recommendation Problem
EachMovie database: user profiles (3M votes, 70k users)
User Profile: set of tuples {movie, rating}
Me: My user profile
Neighbour: User profile of someone else
Similarity metric: Correlation score between user profiles
Neighbourhood: Group of neighbours similar to me
Recommendations: generated from neighbourhood
The AIS Approach
EachMovie database: user profiles
User Profile: set of tuples {movie, rating}
Me: My user profile → Antigen
Neighbour: User profile of someone else → Antibody
Similarity metric: Correlation score between user profiles → Antibody – Antigen Binding (Stimulation), Antibody – Antibody Binding (Suppression)
Neighbourhood: Group of neighbours similar to me → Group of antibodies similar to antigen and dissimilar to other antibodies
Recommendations: generated from neighbourhood
The AIS Algorithm
Start with empty AIS
Encode target user as an antigen Ag
WHILE (AIS not full) && (More users) DO
    Add next user as an antibody Ab
    IF (AIS at full size)
        Iterate AIS
    FI
OD
Generate recommendations from AIS
[Diagram: antigen Ag with antibodies Ab1, Ab2, Ab3 and Ab4]
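The control flow of the pseudocode can be sketched in Python as follows. This is only an outline under stated assumptions: `build_ais`, `candidates`, `capacity` and the `iterate` callback are hypothetical names, and the callback stands in for the concentration update that may eliminate antibodies.

```python
def build_ais(antigen, candidates, capacity, iterate):
    """Fill the AIS with antibodies, iterating whenever it reaches full size.

    antigen    -- the target user (hypothetical representation)
    candidates -- the other users, in the order they are to be added
    capacity   -- full size of the AIS
    iterate    -- callback (antigen, antibodies) -> surviving antibodies
    """
    antibodies = []                       # start with an empty AIS
    for user in candidates:               # "more users"
        if len(antibodies) >= capacity:   # AIS stayed full, so stop adding
            break
        antibodies.append(user)           # add next user as an antibody
        if len(antibodies) == capacity:   # AIS at full size: iterate it
            antibodies = iterate(antigen, antibodies)
    return antibodies                     # recommendations come from these
```

If `iterate` removes nothing, the AIS stays full and the loop ends; if it eliminates antibodies, further users are drawn in to take their place.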
Algorithm walkthrough: Encoding
Suppose we have 5 users and 4 movies
DATABASE
u1={(m1,v11),(m2,v12),(m3,v13)}
u2={(m1,v21),(m2,v22),(m3,v23),(m4,v24)}
u3={(m1,v31),(m2,v32),(m4,v34)}
u4={(m1,v41),(m4,v44)}
u5={(m1,v51),(m2,v52),(m3,v53),(m4,v54)}
• We do not have user votes for every film
• We want to predict the vote of user u4 on movie m3
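One possible encoding of this example database is a dictionary of dictionaries, sketched below. The numeric votes are invented placeholders (the slide names them only symbolically as v11, v12, …), and the helper names `overlap` and `voters` are assumptions for illustration.

```python
# Hypothetical vote values; only the user/movie structure comes from the slide.
database = {
    "u1": {"m1": 4, "m2": 2, "m3": 5},
    "u2": {"m1": 5, "m2": 1, "m3": 4, "m4": 2},
    "u3": {"m1": 3, "m2": 3, "m4": 1},
    "u4": {"m1": 5, "m4": 2},
    "u5": {"m1": 1, "m2": 5, "m3": 2, "m4": 4},
}

def overlap(profile_a, profile_b):
    """Movies rated by both users, in sorted order."""
    return sorted(set(profile_a) & set(profile_b))

def voters(db, movie, exclude):
    """Users other than `exclude` who have voted on `movie`."""
    return sorted(u for u, votes in db.items() if movie in votes and u != exclude)
```

Here `voters(database, "m3", "u4")` lists the users whose vote on m3 could inform the prediction for u4, and `overlap` shows which movies are available for comparing two profiles.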
Algorithm walkthrough (1)
Start with empty AIS
DATABASE: u1, u2, u3, u4, u5
AIS: (empty)
Encode user for whom to make predictions as an antigen Ag
DATABASE: u1, u2, u3, u4, u5
AIS: Ag (u4)
Algorithm walkthrough (2)
Add antibodies until AIS is full…
Add next user as an antibody Ab1
DATABASE: u1, u2, u3, u4, u5
AIS: Ag, Ab1 (u1)
Add users 2 and 3 …
DATABASE: u1, u2, u3, u4, u5
AIS: Ag, Ab1, Ab2 (u2), Ab3 (u3)
Algorithm walkthrough (3)
After some more iterations… the AIS has filled up:
Table of matching scores between Ab and Ag:
MS14, MS24, MS34
Table of matching scores between antibodies:
MS12 = CorrelCoef(Ab1, Ab2)
MS13 = CorrelCoef(Ab1, Ab3)
MS23 = CorrelCoef(Ab2, Ab3)
[Diagram: Ag with Ab1, Ab2 and Ab3]
Algorithm walkthrough (4)
AIS is now at full size so begin iterations…
Calculate new CONCENTRATION for each Ab, considering interactions with Ag (STIMULATION) and other Ab (SUPPRESSION)
[Diagram: AIS before iteration (Ag, Ab1, Ab2, Ab3) and after iteration (Ag with multiple copies of Ab1 and Ab2)]
Notice that antibody 3 has been eliminated.
Algorithm walkthrough (5)
If AIS not yet full and more users available, repeat.
Otherwise: GENERATE RECOMMENDATION from CONCENTRATION and ANTIGEN Correlation.
The recommendation for user u4 on movie m3 will be based largely on the vote of user u2 on m3.
[Diagram: AIS containing Ag, Ab1 and several copies of Ab2]
Results
• Tested against EachMovie database (15000 users, 1628 films)
• Results compared to standard method (Pearson k-nearest neighbours)
• Prediction: Results of same quality
• Recommendation: Improved results, 4 out of 5 films correct versus 3 out of 5.
1. Stimulation and suppression affect neighbourhood size and number of users looked at
[Figure, four plots: Neighbourhood size (0 to 100) vs Stimulation Rate (0 to 1); Number of users (0 to 15000) vs Stimulation Rate (0 to 1); Neighbourhood size (0 to 100) vs Suppression Rate (0 to 1), series Rate 0.2, Rate 0.3, Rate 0.5; Number of reviewers (0 to 16000) vs Suppression Rate (0 to 1), series Rate 0.2, Rate 0.3, Rate 0.5]
2. AIS matches Pearson for prediction
[Figure: Mean Absolute Error (0.5 to 1) vs Stimulation Rate (0 to 1); series AIS (av), SP (av), SP baseline]
3. AIS surpasses Pearson for Recommendation
[Figure, two plots: Recommendation Accuracy (Kendall's Tau, 0.35 to 0.55) vs Stimulation Rate (0 to 1), series AIS (av), SP (av), SP Baseline; Relative Recommendation Accuracy (80% to 120%) vs Suppression Rate (0 to 1), series Rate 0.2, Rate 0.3, Rate 0.5]
Evaluation
General purpose recommendation tool (e.g. Bookmarks)
Collaborative Filtering is a useful vehicle for examination of AIS dynamics:
• Idiotypic effect for more varied population
• Potential for distribution
• Smaller neighbourhoods (vs computational cost)
Wider applicability (e.g. online community formation)
Speculation: online community formation
Idiotypic effects alter nature of community
How important is diversity?
Are there other network effects that can be used? (hubs, routers etc)
Distribution: the snowball effect
What about interacting communities?
Application areas: ad-hoc community formation, knowledge management, P2P routing…
AIS for Security
Change detection (Checksums)
‘Self’: files, network traffic, system calls
Antibodies creation: positive vs negative selection
Collaboration between different populations/sites
Representation: binary string or symbolic (rules)
Other IS features:
• activation thresholds (vs false positives)
• co-stimulation (vs false positives)
• memory detectors (secondary response)
• MHC masks to cover ‘holes’ (similar to self)
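To make negative selection over binary strings concrete, here is a small sketch. It assumes an r-contiguous-bits matching rule and random candidate generation; these choices and the names `matches`, `generate_detectors` and `is_anomalous` are illustrative assumptions, not details given on the slides.

```python
import random

def matches(a, b, r):
    """True if strings a and b agree in at least r contiguous positions."""
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False

def generate_detectors(self_set, length, r, count, rng, max_tries=10000):
    """Negative selection: keep random candidates that match no self string."""
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == count:
            break
        candidate = "".join(rng.choice("01") for _ in range(length))
        if not any(matches(candidate, s, r) for s in self_set):
            detectors.append(candidate)
    return detectors

def is_anomalous(sample, detectors, r):
    """A sample is flagged when any detector matches it."""
    return any(matches(sample, d, r) for d in detectors)
```

By construction no detector matches a self string, so self samples are never flagged. How much of non-self is covered depends on the number of detectors and on `r`.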
Example: Hofmeyr & Forrest 2000
AIS for Security
Evaluation
Applied to network intrusion, virus detection…
Good results on test systems
BUT…
Negative Selection doesn’t scale
Inefficient to map entire non self universe
Changes over time
Appropriate representation of self
Appropriate matching
Primary response requires infection?
Traditional Self - Non Self Distinction
• An immune response is triggered when the body encounters something foreign.
• The difference between self and non-self is learnt early in life.
• E.g. eliminate those T- and B-cells that react to self.
• Problems:
• No reaction to foreign bacteria in gut
• No reaction to food we eat
• The human body changes over its life
• Auto-immune diseases
• Tumours / Transplants
The Danger Theory
• Need for discrimination: What should be responded to?
• Respond to Danger not to “foreignness”.
• No need to attack everything that is foreign.
• Danger is measured by damage / distress signals.
Advantages:
• Can take care of non-self but harmless
• Can take care of self but harmful
Danger Model Conclusions
• Self-Nonself discrimination still useful.
• Nonself does not cause immune response.
• Danger Signals trigger immune response.
• A question of semantics?
• Can this model help us build an AIS for security applications?
• What would be ‘danger signals’?
Discussion
Uwe Aickelin: http://www.aickelin.com/
Steve Cayzer: http://www-uk.hpl.hp.co.uk/people/steve_cayzer/
Additional Slides
AIS Models - Idiotypic
Farmer et al 1986
[Diagram labels: Antibody, Antigen, Antibody, Epitope, Paratope]
• Paratope/Epitopes
  Lock and Key
  Interchangeable?
• Behaviour
  Matching
  Idiotypic (Memory, auto-immune)
AIS Models - Idiotypic
Jerne’s Big Idea (1974)
Idiotype: specificity of antibody (epitopes to which it will bind)
Idiotope: An idiotypic epitope
Evidence: Antibodies produced against antibodies of same species (cf individual)
[Diagram labels: Antigen; P1 I1; P2 I2; P3 I3; Idiotypic Set; Anti-Idiotypic Set; Internal Image of Antigen; + and - links]
AIS Models - Idiotypic
In Words…
The idiotypic network hypothesis (Jerne 1974) builds on the recognition that antibodies can match other antibodies as well as antigens. A group of antibodies, which match an antigen, may be matched by other antibodies which may in turn be matched by yet other antibodies. This stimulatory effect will set up activation chains or loops. Matched antibodies are suppressed, and this effect will encourage diversity.
In Formulae…
AIS Models - Idiotypic
$$\frac{dx_i}{dt} = c\left[\begin{pmatrix}\text{antibodies}\\ \text{recognised}\end{pmatrix} - \begin{pmatrix}\text{I am}\\ \text{recognised}\end{pmatrix} + \begin{pmatrix}\text{antigens}\\ \text{recognised}\end{pmatrix}\right] - \begin{pmatrix}\text{death}\\ \text{rate}\end{pmatrix}$$
$$\frac{dx_i}{dt} = c\left[\sum_{j=1}^{N} m_{ji} x_i x_j - k_1 \sum_{j=1}^{N} m_{ij} x_i x_j + \sum_{j=1}^{n} m_{ji} x_i y_j\right] - k_2 x_i$$
$$m_{ij} = \sum_{k} G\left(\sum_{n} e_i(n+k) \wedge p_j(n) - s + 1\right)$$
• For N antibodies, n antigens.
• xi is the concentration of antibody i; yj is the concentration of antigen j.
• p and e stand for ‘paratope’ and ‘epitope.’
• s is the matching threshold.
• G is a rectifier function which outputs 0 for all negative input.
• k is the allowable overlap.
• k1 weights the suppression term and k2 is the death rate.
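The rate equation (stimulation from antibodies recognised, suppression from being recognised, stimulation from antigens recognised, minus a death rate) can be stepped numerically. The sketch below is one hedged reading of it, not the presenters' implementation: the name `idiotypic_step`, the precomputed match matrices `m_ab` and `m_ag`, the Euler step `dt`, the default constants and the clamp at zero are all assumptions.

```python
def idiotypic_step(x, y, m_ab, m_ag, c=1.0, k1=0.001, k2=0.1, dt=1.0):
    """One synchronous Euler step of the antibody concentrations.

    x    -- concentrations of the N antibodies
    y    -- concentrations of the n antigens
    m_ab -- N x N matrix; m_ab[a][b] is the match of epitope a with paratope b
    m_ag -- N x n matrix; m_ag[i][j] is the match of antibody i with antigen j
    """
    n_ab = len(x)
    new_x = []
    for i in range(n_ab):
        # Stimulation: antibody i recognises other antibodies.
        recognises = sum(m_ab[j][i] * x[i] * x[j] for j in range(n_ab))
        # Suppression: antibody i is recognised by other antibodies.
        recognised = sum(m_ab[i][j] * x[i] * x[j] for j in range(n_ab))
        # Stimulation: antibody i recognises antigens.
        antigens = sum(m_ag[i][j] * x[i] * y[j] for j in range(len(y)))
        dx = c * (recognises - k1 * recognised + antigens) - k2 * x[i]
        # All new values are computed from the old x (synchronous update);
        # the clamp at zero is an assumption standing in for a minimum concentration.
        new_x.append(max(0.0, x[i] + dt * dx))
    return new_x
```

With no antibody-antibody matches, an antibody that matches the antigen grows while one that does not simply decays at the death rate.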
Recommendation Approaches
• Simple user comparisons (Pearson, cosine, k-Nearest Neighbour)
• Problems: Sparsity, curse of dimensionality
• Memory vs Model based approaches
• Transformative and Transitive functions
• Default votes, Content based, Learning algorithms
• Challenge of distribution (vs centralization)
System Description: Encoding
Users are represented as a set of tuples which represent their votes:
$$\mathrm{User} = \{\{id_1, score_1\}, \{id_2, score_2\}, \ldots, \{id_n, score_n\}\}$$
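System Description: Matching
We use the Pearson correlation measure:
$$r = \frac{\sum_{i=1}^{n} (u_i - \bar{u})(v_i - \bar{v})}{\sqrt{\sum_{i=1}^{n} (u_i - \bar{u})^2 \sum_{i=1}^{n} (v_i - \bar{v})^2}}$$
The measure is amended as follows:
$$\begin{aligned}
&\text{if } n = 0: && r = \mathrm{NO\_OVERLAP\_DEFAULT} \\
&\text{if } \sum_{i=1}^{n} (u_i - \bar{u})^2 \sum_{i=1}^{n} (v_i - \bar{v})^2 = 0: && r = \mathrm{ZERO\_VARIANCE\_DEFAULT} \\
&\text{if } n < P: && r = \frac{n}{P}\, r, \quad \text{where } P = \text{overlap penalty}
\end{aligned}$$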
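A minimal sketch of the amended Pearson measure follows, with profiles held as `{movie: vote}` dictionaries. The function name `pearson` and its keyword names are hypothetical; the default values are taken from the matching-function parameters listed elsewhere in the deck, and means are taken over all of a user's votes (the "Use mean of overlap = False" setting).

```python
from math import sqrt

def pearson(u, v, min_overlap=5, no_overlap_default=0.0, zero_variance_default=0.0):
    """Pearson correlation over shared movies, with the three amendments."""
    shared = sorted(set(u) & set(v))
    n = len(shared)
    if n == 0:
        return no_overlap_default
    mean_u = sum(u.values()) / len(u)   # mean over all votes of the user
    mean_v = sum(v.values()) / len(v)
    du = [u[m] - mean_u for m in shared]
    dv = [v[m] - mean_v for m in shared]
    denom = sum(a * a for a in du) * sum(b * b for b in dv)
    if denom == 0:
        return zero_variance_default
    r = sum(a * b for a, b in zip(du, dv)) / sqrt(denom)
    if n < min_overlap:
        r *= n / min_overlap            # penalise small overlaps
    return r
```

The penalty scales the score down linearly, so two users who agree perfectly on only two shared films score 0.4 rather than 1.0 when the minimum expected overlap is 5.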
Parameters: Matching Function
| Parameter | Value | Comments |
|---|---|---|
| Minimum expected overlap | 5 | Minimum expected overlap between 2 users (used to calculate penalty) |
| Zero Variance Default | 0.0 | Correlation score when users have overlaps with zero variance |
| Use minimum expected overlap | True | Use minimum overlap (i.e. penalise users with less than expected overlap) |
| No Overlap Default | 0.0 | Correlation score if 2 users have no overlapping items |
| Use mean of overlap | False | Use mean of overlapping votes only for correlation (otherwise use mean over all votes of user) |
Parameters: AIS
| Parameter | Value | Comments |
|---|---|---|
| Suppression Rate | 0.001 | Suppression constant (weighting on antibody-antibody suppression term) |
| Rate Constant | 0.25 for single, 1 for idiotypic | Rate constant, applied to each match calculation. Between 0 and 1 |
| Stimulation Rate | 0 | Stimulation constant (weighting on antibody-antibody stimulation term) |
| Death Rate | 0.1 | Death rate of antibodies (i.e. % that dies off per unit time) |
| Maximum Concentration | 100.0 | Maximum concentration of antibody or antigen in this AIS |
| Minimum Concentration | 0 | Minimum concentration of antibody or antigen in this AIS |
| Initial Concentration | 1.0 | Initial concentration of antibody or antigen in this AIS |
| Use Concentration | True | Should we use concentration to weight stimulation and suppression? |
| Use Absolute | True | Should we use absolute match score for weighting (hence negative correlations are treated as valuable) |
| Synchronous | True | Should we apply concentration changes synchronously (in batch) |
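System Description: Prediction
We predict a rating by using a weighted average over the neighbourhood of a user:
$$p_i = \bar{u} + \frac{\sum_{v \in N} w_{uv} (v_i - \bar{v})}{\sum_{v \in N} |w_{uv}|}$$
$$w_{uv} = r_{uv} \quad \text{(note: relative, not absolute)}$$
$$v_i = \mathrm{DEFAULT\_VOTE} \quad \text{if } v \text{ has not voted on } i$$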
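The weighted-average prediction can be sketched as below. This is an illustration under assumptions: the name `predict`, the `(profile, weight)` pair representation, and normalising by the sum of absolute weights are choices made for this example, and the default vote of 2.0 is the value quoted in the prediction parameters.

```python
def predict(target, neighbours, movie, default_vote=2.0):
    """Predict the target user's vote on `movie`.

    target     -- {movie: vote} profile of the user being predicted for
    neighbours -- list of (profile, weight) pairs; the weight is the signed
                  correlation between the neighbour and the target
    """
    mean_u = sum(target.values()) / len(target)
    numerator = 0.0
    denominator = 0.0
    for profile, weight in neighbours:
        mean_v = sum(profile.values()) / len(profile)
        vote = profile.get(movie, default_vote)   # default vote if not rated
        numerator += weight * (vote - mean_v)
        denominator += abs(weight)
    if denominator == 0:
        return mean_u                              # no usable neighbours
    return mean_u + numerator / denominator
```

Each neighbour contributes how far its vote sits above or below its own mean, so a generous rater and a harsh rater are put on a comparable footing before weighting.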
Parameters: Prediction
| Parameter | Value | Comments |
|---|---|---|
| Cluster Size | 50-100 | Cluster size (AIS size). Should be >= neighbourhood size |
| Build From Scratch | BOTH | Do we build AIS from scratch for each prediction or start with one massive AIS? |
| Use Default Votes | True | Should we use a default vote for prediction purposes? |
| Neighbourhood size | 30-50 | Neighbourhood size (k-NN parameter). Should be <= cluster size. |
| Use Idiotypic AIS | BOTH | Use idiotypic immune system (with antibody-antibody interactions) |
| Default Vote | 2.0 | Default vote (if used). |
| Use Correlation | True | Use correlation scores to weight prediction |
| Use category | False | Use category information to help make prediction |
| Max Iterations | 5 | Max iterations with no change in AIS |
| Use Concentration | False | Weight prediction by antibody concentration (as well as correlation) |
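System Description: Evaluation
• Mean Absolute Error
$$\mathrm{MAE} = \frac{\sum_{n} |actual - predicted|}{n}$$
• Variance
$$\mathrm{Variance} = \frac{\sum_{n} (actual - predicted)^2}{n} - \left(\frac{\sum_{n} |actual - predicted|}{n}\right)^2$$
• Precision vs Recall
$$P = \frac{|R \cap U|}{|R|}, \qquad C = \frac{|R \cap U|}{|U|}$$
where R = set of recommendations and U = items that user liked.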
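For reference, the mean absolute error and the precision and recall measures can be computed as in this short sketch. The function names are hypothetical, and returning 0.0 for an empty set is an assumption made to avoid division by zero.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over paired ratings."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)

def precision_recall(recommended, liked):
    """Precision |R ∩ U| / |R| and recall |R ∩ U| / |U|."""
    recommended, liked = set(recommended), set(liked)
    hits = len(recommended & liked)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(liked) if liked else 0.0
    return precision, recall
```

Precision asks how many of the recommended films the user liked. Recall asks how many of the liked films were recommended.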