Intelligent Fril/SQL Interrogator
Dong (Walter) Xie
Supervisor : J. F. Baldwin
2003 - 2004
Why We Need Computers in Business
To store, retrieve, analyse and integrate information
Database system, information system
Data mining, machine learning, fuzzy logic system
Procedural programming language, logic programming language
Never tired, never complain, fewer mistakes, less pay
To communicate with the world
Email, Internet phone, Internet fax
Some Popular Systems in Business
BI (Business Intelligence), ERP (Enterprise Resource Planning), CRM (Customer Relationship Management)
SAP, Oracle, Sagent, Informatica, SAS, SPSS, Business Objects, Cognos, IBM … …
Intelligent Fril/SQL Interrogator
Powerful Query System
Standard SQL query
Fril query: querying the facts and rules to give the answer with a support pair
Natural language query
Fril Support
Logic programming + knowledge bases + fuzzy logic system (graphic fuzzy sets interface)
Fuzzy Logic Data Mining & Machine Learning Toolkit
Object Oriented
Reusable, structural, adaptable, information hiding, reliable
Fril (Fuzzy Relation Inference Language) & its Resources
Fril is a support logic programming language which includes Prolog as a subset of the language, but which also allows probabilistic uncertainties and fuzzy sets to be included. Fril and its applications are described in detail in “Fril – Fuzzy and Evidential Reasoning in Artificial Intelligence”, J. F. Baldwin, T. P. Martin and B. W. Pilsworth. Trevor Martin’s homepage http://www.enm.bris.ac.uk/ai/martin/downloads/FrilResources.html
http://www.enm.bris.ac.uk/ai/fril.html
Intelligent Fril/SQL Interrogator Flow Graph
[Figure: flow graph. Elements shown: User; Natural Language interface; Fril interface; Intelligent Fril/SQL Interrogator; Fril (CFril, JFril); Knowledge Base; Fuzzy Logic Data Mining Toolkit; database drivers (ODBC, JDBC, … …); database tables & output tables.]
Structure of Objects
Inputs List
1 Database tables
2 Output tables from other objects
3 Knowledge bases from other objects
Slot 1: SQL query using the tables in the inputs list. It produces a temporary database TD.
Slot 2: Machine learning toolkit to generate a Fril knowledge base, or a Fril method written by the user. It provides the Fril method or Fril knowledge base for Fril queries.
Slot 3: Fril wh query given by a rule with the result as head. It provides smart answers.
Slot 2 or 3 can be empty. The temporary database TD is passed down from slot 1 to the last slot in sequence, and the training set of slot 2 is formed automatically from TD. The output of the object comes out from the last slot.
Output List
Either 1 Output table or 2 Knowledge base
The output of one object can link to other objects as input.
Fuzzy Partition List
FP1, FP2
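Object Slots
[Figure: object inputs feed slot 1 (SQL slot), slot 2 (knowledge base slot, filled by the machine learning toolkit or customized by the user) and slot 3 (Fril query slot). Variants shown: slots 2 & 3 empty; slot 2 empty. Outputs shown: output table, knowledge base.]
If slot 3 is empty, the object outputs a knowledge base; if slot 3 is not empty, it outputs an output table.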
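The slot sequence can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the Interrogator's implementation: the class name `LogicObject`, the callables and the return tags are hypothetical, and the case where slots 2 and 3 are both empty is assumed to return the SQL result, following the statement that the output comes out from the last slot.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LogicObject:
    """Hypothetical model of one object with its three slots."""
    sql_slot: Callable[[dict], list]                            # slot 1: inputs -> temporary database TD
    kb_slot: Optional[Callable[[list], list]] = None            # slot 2: TD -> knowledge base
    query_slot: Optional[Callable[[list, list], list]] = None   # slot 3: (TD, KB) -> output rows

    def run(self, inputs: dict):
        td = self.sql_slot(inputs)                  # TD is passed down the slots in sequence
        kb = self.kb_slot(td) if self.kb_slot else []
        if self.query_slot is not None:
            return ("output_table", self.query_slot(td, kb))
        if self.kb_slot is not None:
            return ("knowledge_base", kb)           # slot 3 empty: the knowledge base is the output
        return ("output_table", td)                 # assumption: only slot 1 filled


# Usage with toy callables standing in for SQL, the toolkit and a Fril query.
sql_only = LogicObject(sql_slot=lambda inputs: inputs["rows"])
learner = LogicObject(sql_slot=lambda inputs: inputs["rows"],
                      kb_slot=lambda td: ["rule learned from %d rows" % len(td)])
querier = LogicObject(sql_slot=lambda inputs: inputs["rows"],
                      query_slot=lambda td, kb: [row for row in td if row[0] > 1])

print(sql_only.run({"rows": [(1,), (2,)]}))
print(learner.run({"rows": [(1,), (2,)]}))
print(querier.run({"rows": [(1,), (2,)]}))
```

Linking objects then amounts to passing the value returned by one `run` call into the `inputs` of another object.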
Fril Knowledge Base in Slot 2
Generated automatically by the machine learning toolkit
IF… THEN… fuzzy logic rules, fuzzy ID3 decision tree, Bayesian network, fuzzy association rules or prototypes
The training set is provided automatically by the result set of the SQL query in slot 1
The knowledge base can be edited by the user
Customized by the user
Fril method written by the user
Some theorem in Fril format
Fuzzy Logic Machine Learning Toolkit
[Figure: toolkit overview. Elements shown: original database; effective reduced database; fuzzy ID3 decision tree; Fril general rule; fuzzy clusters; fuzzy logic rules; Bayesian net; prototypes; evidential logic rules; fuzzy association rules.]
Temporary Fril Knowledge Base
Before using a Fril query, we first extract the data of the temporary database produced by the SQL query to form a temporary Fril knowledge base of clauses.
For example, from the table temp =
name  wt   height
John  144  20
Jill  156  22
Pat   120  18
Bill  153  21
we obtain the Fril knowledge base
((temp John 144 20))
((temp Jill 156 22))
((temp Pat 120 18))
((temp Bill 153 21))
The Fril query then operates on this temporary knowledge base, and the result of the query is put into the output table.
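The conversion from result rows to clauses is mechanical. The following Python sketch shows one way it could be done; the function name `rows_to_clauses` is hypothetical, and any quoting that Fril might require for unusual values is ignored.

```python
def rows_to_clauses(relation, rows):
    """Turn each row of a result set into a Fril-style fact ((relation v1 v2 ...))."""
    clauses = []
    for row in rows:
        terms = " ".join(str(value) for value in row)
        clauses.append(f"(({relation} {terms}))")
    return clauses


# The temp table from the slide.
temp_rows = [("John", 144, 20), ("Jill", 156, 22), ("Pat", 120, 18), ("Bill", 153, 21)]
for clause in rows_to_clauses("temp", temp_rows):
    print(clause)
```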
Fuzzy Partitions in the Object
Simple Fuzzy Partition:
A simple fuzzy partition $\{f_i\}$ is a set of triangular or trapezoidal fuzzy sets such that
$$\sum_i \chi_{f_i}(x) = 1$$
for any data point $x \in X$, where $X$ is the universal set.
Equal space fuzzy sets: fuzzy sets $1, 2, 3, \ldots, n$ on $[a, b]$ with peaks at
$$a,\quad a + \frac{b-a}{n-1},\quad \ldots,\quad a + \frac{(n-2)(b-a)}{n-1},\quad b.$$
Equal data points fuzzy sets: fuzzy sets $1, 2, 3, \ldots, n$ on $[a, b]$ with interior peaks at
$$\mathrm{val}\!\left(a + \frac{m}{n-1}\right),\quad \ldots,\quad \mathrm{val}\!\left(a + \frac{(n-2)m}{n-1}\right),$$
so that the number of instances between neighbouring peaks is $m/(n-1)$.
Fuzzy partitions listed in the object are used to explain fuzzy sets in Fril method or answer Fril query
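As an illustration of the equal space case, the Python sketch below builds n triangular fuzzy sets on [a, b] and evaluates the membership of a point in each. The function names are hypothetical and values outside [a, b] are assumed to belong fully to the nearest end set; the memberships of any point sum to 1, as the definition of a simple fuzzy partition requires.

```python
def equal_space_peaks(a, b, n):
    """Peaks of n equally spaced triangular fuzzy sets on [a, b] (n >= 2)."""
    step = (b - a) / (n - 1)
    return [a + i * step for i in range(n)]


def memberships(x, peaks):
    """Membership of x in each fuzzy set of the partition defined by peaks."""
    degrees = [0.0] * len(peaks)
    if x <= peaks[0]:            # assumption: full membership in the first set
        degrees[0] = 1.0
        return degrees
    if x >= peaks[-1]:           # assumption: full membership in the last set
        degrees[-1] = 1.0
        return degrees
    for i in range(len(peaks) - 1):
        left, right = peaks[i], peaks[i + 1]
        if left <= x <= right:
            t = (x - left) / (right - left)
            degrees[i] = 1.0 - t     # falling edge of set i
            degrees[i + 1] = t       # rising edge of set i + 1
            break
    return degrees


peaks = equal_space_peaks(0.0, 10.0, 5)
print(peaks)
print(memberships(3.0, peaks))
```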
Fril Query in Slot 3 & its Result
Fuzzy set:
(short [0:1, 2:0.7, 3:0])
Temporary knowledge base:
((case 1 a 2))
((case 2 b 1))
((case 3 c 3))
Fril query in slot 3:
(X (findall (T SUPPORT) ((supp_query ((case R T short)) (SUPPORT P)) (less 0 SUPPORT)) X))
Solution in Fril format:
((a 0.7) (b 0.85))
Transferred to the object's output table as:
T  SUPPORT
a  0.7
b  0.85
Note: the Fril query in slot 3 is equivalent to a wh query, but the word wh has been omitted.
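The supports in this example can be reproduced by reading the fuzzy set short as a piecewise-linear membership function and evaluating it at the third value of each case. The Python sketch below does that; it is a simplification that treats the support as a single membership value and is not a description of how Fril computes support pairs in general. The function name `membership` is hypothetical.

```python
def membership(points, x):
    """Piecewise-linear membership for a fuzzy set given as [(value, degree), ...]."""
    if x <= points[0][0]:
        return points[0][1]
    if x >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


short = [(0, 1.0), (2, 0.7), (3, 0.0)]
cases = [(1, "a", 2), (2, "b", 1), (3, "c", 3)]

# Keep only the answers whose support is greater than 0, as (less 0 SUPPORT) does.
answers = []
for _, name, value in cases:
    support = round(membership(short, value), 4)
    if support > 0:
        answers.append((name, support))

print(answers)  # [('a', 0.7), ('b', 0.85)]
```

Case c has the value 3, whose membership in short is 0, so it is filtered out.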
Linking Objects
[Figure: linked objects. Elements shown: Database 1; Database 2; Object 1 (SQL 1, Fril Query 1, Output table 1); Object 2 (SQL 2, Fril Query 2, Output table 2); Object 3 (SQL 3, machine learning toolkit, Knowledge base 3); Object 4 (SQL 4, Fril Query 4, Output table 4).]
Applications in Business and Commerce
It can be integrated into other systems, such as BI, ERP, CRM or e-commerce systems, as their AI part
Or it can be used alone
Supermarket basket analysis
Risk analysis (foreign trade)
Fraud detection (APACS London reported £373.7 million losses through credit card fraud in the 12 months ending August 2001)
Advertisement analysis
Commodities, currencies, stocks & futures market analysis
Other Applications
Engineering
Intelligent CAD and CAM, engineering design … …
Science
Intelligent personal identification library (face, fingerprint, DNA), flood or earthquake prediction, virtual chemical combination
Medicine, Biology and Genetics
Intelligent gene library, diagnostic expert system
Supermarket Basket Example Database
Customer table contains the personal details of customers
Transaction table contains supermarket basket scanner panel data
Product table contains the commercial and nutritional information published on the web
Product Table & Fuzzy Partition of Sugar Content
Product Table (nutrition)
Columns: UPC (universal product code), ProductName, Price, Vegetarian, Energy, Protein, Sugar, …
Example row: UPC 2014730; ProductName Crispy Chilli Beef; Price £2.96 (unit); Vegetarian No (Yes / No); Energy 318 (kcal per 100g); further nutrition values 23.8 (g per 100g) and 11.9 (g per 100g)
Source: Sainsbury's online shop: http://www.sainsburystoyou.com/arriving/login.jsp
Fuzzy Sets of Sugar Content
[Figure: fuzzy sets low, medium and high, with membership from 0 to 1, over sugar content from 0 to 29.2g (per 100g). The fuzzy sets are defined by the manager or user.]
Example of Linking Objects
[Figure: seven linked objects built on the Product, Transaction and Customer tables. Elements shown: Object 1 (SQL 1, Fril query 1, FP1, Output table 1); Object 2 (SQL 2, Output table 2); Object 3 (SQL 3, Output table 3); Object 4 (SQL 4, Output table 4); Object 5 (SQL 5, machine learning toolkit, Knowledge base 5); Object 6 (SQL 6, Fril query 6, Output table 6); Object 7 (SQL 7, Fril query 7, customized rules and knowledge, Output table 7).]
Object1: Support of Low Sugar Content
List the items together with the membership of their sugar content being low
Inputs: Product
SQL:
SELECT UPC, ProductName, Price, Sugar FROM Product ORDER BY UPC
Fril query:
(X (findall (UPC PRODUCTNAME PRICE SUGAR SUPPORT) ((supp_query ((case R UPC PRODUCTNAME PRICE object1_sugar_low)(case R UPC PRODUCTNAME PRICE SUGAR)) (SUPPORT P)) (less 0 SUPPORT)) X))
Output table: UPC, ProductName, Price, Sugar, Support
Fuzzy partition list: …object1_sugar
Object2: Total Price of Low Sugar Content Food
Output table: NumOfItems, SaleTotalPrice, LabelTotalPrice, Support
SQL:
SELECT Sum(Transaction.ItemQuantity) AS NumOfItems,
Sum(Transaction.TotalPriceOfTheItem) AS SaleTotalPrice,
Sum(Transaction.ItemQuantity*FocusOutputTable_1.Price) AS LabelTotalPrice,
FocusOutputTable_1.Support
FROM Transaction INNER JOIN FocusOutputTable_1 ON Transaction.UPC=FocusOutputTable_1.UPC
WHERE Transaction.Date<#10/10/2004# And Transaction.Date>#1/10/2004#
GROUP BY FocusOutputTable_1.Support
ORDER BY FocusOutputTable_1.Support
Object2 sums the number of items sold from 01/10/2004 to 10/10/2004, their total sale price and their total labelled price, grouped by the support from Object1, where the support is the membership of the sugar content being low.
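To try the grouping outside the Interrogator, the query can be run on a toy database with Python's sqlite3 module, as in the sketch below. Several things here are assumptions made for the illustration: the rows are invented, dates are stored as ISO text rather than Access-style # literals, the table name is quoted because TRANSACTION is an SQL keyword, and the helper names are hypothetical.

```python
import sqlite3

QUERY = """
SELECT SUM(t.ItemQuantity)           AS NumOfItems,
       SUM(t.TotalPriceOfTheItem)    AS SaleTotalPrice,
       SUM(t.ItemQuantity * o.Price) AS LabelTotalPrice,
       o.Support
FROM "Transaction" AS t
INNER JOIN FocusOutputTable_1 AS o ON t.UPC = o.UPC
WHERE t.Date > ? AND t.Date < ?
GROUP BY o.Support
ORDER BY o.Support
"""


def build_demo_db():
    """Create an in-memory database with invented rows (not the slide's data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE FocusOutputTable_1 (UPC TEXT, Price REAL, Support REAL)")
    conn.execute('CREATE TABLE "Transaction" '
                 "(UPC TEXT, Date TEXT, ItemQuantity INTEGER, TotalPriceOfTheItem REAL)")
    conn.executemany("INSERT INTO FocusOutputTable_1 VALUES (?, ?, ?)",
                     [("A", 2.0, 0.25), ("B", 1.5, 1.0), ("C", 3.0, 1.0)])
    conn.executemany('INSERT INTO "Transaction" VALUES (?, ?, ?, ?)',
                     [("A", "2004-10-05", 1, 2.0),
                      ("B", "2004-10-06", 2, 3.0),
                      ("C", "2004-10-07", 1, 2.5),
                      ("A", "2004-11-01", 1, 2.0)])   # outside the date window
    return conn


def sales_by_support(conn, start, end):
    """Totals per support value for transactions strictly between start and end."""
    return conn.execute(QUERY, (start, end)).fetchall()


rows = sales_by_support(build_demo_db(), "2004-10-01", "2004-10-10")
for row in rows:
    print(row)
```

Products B and C share the support 1.0, so their sales fall into one group, which is the behaviour the slide describes as summing according to the distribution of support.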
[Figure: Object1 (input: Product; SQL and Fril query as on the Object1 slide; output: UPC, ProductName, Price, Sugar, Support) linked to Object2 (inputs: Transaction, Object1).]
Chart of Object2 Output Table (1)
NumOfItems  SaleTotalPrice  LabelTotalPrice  Support
1           2.37            2.96             0.1849
11          16.39           16.39            0.7979
2           3.78            5.78             0.8493
1           0.17            0.27             0.9931
11          62.25           70.71            0.9966
1           1.09            1.68             1
[Figure: Object2, Support – Total Price Bar Chart & Fuzzy Set Tendency Line, based on the Low Sugar Content fuzzy set. Horizontal axis: Total Price Sold (£); vertical axis: Support (0 to 1).]
Chart of Object2 Output Table (2)
[Figure: Object2, Total Price – Support Bar Chart & their Tendency Lines, based on the Low Sugar Content fuzzy set, for the same output table as chart (1). Horizontal axis: Support in bins 0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8, 0.8-1; vertical axis: Total Price (£). Series: Number of Items (1, 0, 0, 11, 15 per bin), Total Price Sold, Total Price Labelled, each with a polynomial tendency line.]
Object3: Marketing Segmentation or Customer Personalization
Output table: Customer Information, NumOfItems, SaleTotalPrice, LabelTotalPrice, Support
SQL:
SELECT Customer.CardNumber, Customer.FirstName, Customer.Surname,
Sum(Transaction.ItemQuantity) AS NumOfItems,
Sum(Transaction.TotalPriceOfTheItem) AS SaleTotalPrice,
Sum(Transaction.ItemQuantity*FocusOutputTable_1.Price) AS LabelTotalPrice,
FocusOutputTable_1.Support
FROM Customer INNER JOIN (Transaction INNER JOIN FocusOutputTable_1 ON Transaction.UPC = FocusOutputTable_1.UPC) ON Customer.CardNumber = Transaction.CardNumber
GROUP BY FocusOutputTable_1.Support, Customer.CardNumber, Customer.FirstName, Customer.Surname
ORDER BY Customer.FirstName, Customer.Surname
If we constrain the SQL in Object3 to an individual customer or a group of customers by personal information, such as name, age or post code, we can find the preferences of individuals or groups.
[Figure: Object1 (input: Product; SQL and Fril query as on the Object1 slide; output: UPC, ProductName, Price, Sugar, Support) linked to Object3 (inputs: Transaction, Object1).]
Chart of Object3 Output Table (1)
NumOfItems  SaleTotalPrice  LabelTotalPrice  Support
1           1.23            2.72             0.0411
1           2.81            2.96             0.1849
2           3.78            5.78             0.8493
2           2.52            3.7              1
[Figure: Object3, customer name = “Dong Xie”, Support – Total Price Bar Chart & Fuzzy Set Tendency Line, based on the Low Sugar Content fuzzy set. Horizontal axis: Total Price Sold (£); vertical axis: Support (0 to 1).]
Chart of Object3 Output Table (2)
[Figure: Object3, customer name = “Dong Xie”, Total Price – Support Bar Chart & their Tendency Lines, based on the Low Sugar Content fuzzy set, for the same output table as chart (1). Horizontal axis: Support in bins 0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8, 0.8-1; vertical axis: Total Price (£). Series: Number of Items (2, 0, 0, 0, 4 per bin), Total Price Sold, Total Price Labelled, each with a polynomial tendency line.]
Marketing Segmentation or Customer Personalization Report
To analyse customer behaviour and preferences, we plot ordered support graphs of the number of items, total price, …, with respect to energy, sugar content, fat content, …, and price, discount, …, etc.
Report on an individual customer OR a certain group of customers
[Figure: grid of support bar charts. Columns: Fat, Price, Energy, Protein, Sugar, …; rows: fuzzy sets low, medium, high; each chart uses support bins 0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8, 0.8-1.]
Object5: Customers’ Shopping Frequency & Average Spending
Transaction Table (basket scanner data): Transaction ID, Date, Card Number, UPC, Item Quantity, Total Price (= Price × Item Quantity)
Object4 SQL:
SELECT DISTINCT DateOfTrans, CardNumber, SUM(TotalPriceOfTheItem) AS TotalPrice FROM Transaction GROUP BY DateOfTrans, CardNumber ORDER BY DateOfTrans
Object8 (= Object5 SQL):
SELECT Count(FocusOutputTable_4.CardNumber) AS ShoppingFreq, Avg(FocusOutputTable_4.TotalPrice) AS AvgSpending, Customer.FirstName, Customer.Surname, Customer.Rank FROM Customer INNER JOIN FocusOutputTable_4 ON Customer.CardNumber=FocusOutputTable_4.CardNumber WHERE Customer.Rank > 0 GROUP BY Customer.FirstName, Customer.Surname, Customer.Rank
Result columns: Name, Shopping Frequency, Average Spending, Customer Rank
The customer rank can be adjusted by the manager or analyst.
Object5: Knowledge Base Learned by Data Mining Toolkit
[Figure: the training set (SQL result) and the fuzzy sets go into the Fuzzy Logic Data Mining Toolkit. Model types shown: general Fril rule, simple fuzzy logic rule, fuzzy decision tree, IF … THEN … rule.]
The output of Object5 is the knowledge base, which can be linked into the input list of other objects.
The Best Customers Definition from Knowledge Base
1 X is Rarely (rank_fuzzy_set_1) Best Customer IF his/her Shopping Frequency is Rare (shoppingfreq_fuzzy_set_1)
2 X is Highly (rank_fuzzy_set_3) Best Customer IF his/her Shopping Frequency is High (shoppingfreq_fuzzy_set_3)
3 X is Fairly (rank_fuzzy_set_2) Best Customer IF his/her Shopping Frequency is Fair (shoppingfreq_fuzzy_set_2)
Object5’s knowledge base (learned by the fuzzy logic data mining toolkit)
Object7’s knowledge base (customized by user)
Object6 & Object7: To Infer the Rank of Each New Customer
Who are the best customers?
Object6
The knowledge base is loaded into Fril before the slots are processed.
Inputs: [1] Customer Table [2] Object4 [3] Object5 Knowledge Base
SQL:
SELECT Count(FocusOutputTable_4.CardNumber) AS ShoppingFreq, Avg(FocusOutputTable_4.TotalPrice) AS AvgSpending, Customer.FirstName, Customer.Surname, Customer.Rank FROM Customer INNER JOIN FocusOutputTable_4 ON Customer.CardNumber=FocusOutputTable_4.CardNumber WHERE Customer.Rank is null GROUP BY Customer.FirstName, Customer.Surname, Customer.Rank
The SQL result forms a temporary knowledge base, which is passed to Fril.
Fril program to infer the rank of each new customer:
((term R SHF ASP)(case R SHF ASP FIRSTNAME SURNAME RANK))
Fril query to infer the rank of each new customer:
(X (findall (FIRSTNAME SURNAME NEW_RANK) ((supp_query ((rank is rank_fuzzy_set_2 R)(case R SHF ASP FIRSTNAME SURNAME RANK)) (NEW_RANK P)) (less 0 NEW_RANK)) X))
Output table: FirstName, Surname, New_Rank
Object7: Knowledge Base Customized by User
Inputs: [1] Customer Table [2] Object4
SQL:
SELECT Count(FocusOutputTable_4.CardNumber) AS ShoppingFreq, Avg(FocusOutputTable_4.TotalPrice) AS AvgSpending, Customer.FirstName, Customer.Surname, Customer.Rank FROM Customer INNER JOIN FocusOutputTable_4 ON Customer.CardNumber=FocusOutputTable_4.CardNumber WHERE Customer.Rank is null GROUP BY Customer.FirstName, Customer.Surname, Customer.Rank
Fril program written by the user (fuzzy logic rules defined by an experienced expert):
(avgspending_fuzzy_set_2 [ 47.108 : 0 99.497 : 1 ] )
((rank is rank_fuzzy_set_2 R)(term R X1 avgspending_fuzzy_set_2))
((term R SHF ASP)(case R SHF ASP FIRSTNAME SURNAME RANK))
Fril query:
(X (findall (FIRSTNAME SURNAME NEW_RANK) ((supp_query ((rank is rank_fuzzy_set_2 R)(case R SHF ASP FIRSTNAME SURNAME RANK)) (NEW_RANK P)) (less 0 NEW_RANK)) X))
Output table: FirstName, Surname, New_Rank
A knowledge base customized by the user, which can consist of rules in Fril format or of a theorem (Fril program), makes the system flexible.
A Simple Flight Routes Example
[Figure: route map connecting Edinburgh, London, Paris, Amsterdam, Frankfurt, Hongkong and Beijing. Leg distances shown: 343 km, 874 km, 535 km, 8238 km, 7811 km, 8161 km, 9740 km, 2018 km, 660 km, 7844 km, 1024 km.]
In the flight routes query system example, we design a logic object, shown on the next slide, that lists all possible routes from Edinburgh to Beijing together with their total distances and total numbers of stops.
Object 1: Flight Routes Query System
Inputs: [1] Flight
SQL:
SELECT DISTINCT DepartureAirport, Destination, Distance FROM Flight
Fril program:
((travel DEPAR DEST S1 TOTAL_DIS TOTAL_STOP TOTAL_STOP (DEPAR | (DEST | ( )))) (case INDEX DEPAR DEST DIST)(sum S1 DIST TOTAL_DIS))
((travel DEPAR DEST S1 TOTAL_DIS S2 TOTAL_STOP (DEPAR | ROUTE_LIST)) (case INDEX DEPAR Z DIST)(sum S1 DIST SD)(sum S2 1 ST)(travel Z DEST SD TOTAL_DIS ST TOTAL_STOP ROUTE_LIST))
Fril query:
(X (findall (ROUTE_LIST TOTAL_DISTANCE TOTAL_STOP) ((travel Edinburgh Beijing 0 TOTAL_DISTANCE 0 TOTAL_STOP ROUTE_LIST)) X))
Output table: Route_List, Total_Distance, Total_Stop
travel is a recursive definition in Fril with 7 parameters.
The first travel clause is the terminating condition of the recursion.
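For readers who do not know Fril, the same recursion can be written as a Python generator. This is a sketch for comparison only: the flight table below is hypothetical (the pairing of distances with legs did not survive in the figure), and, like the two Fril clauses, the function assumes a directed flight table without cycles.

```python
def travel(flights, depart, dest, dist_so_far=0, stops_so_far=0):
    """Yield (route, total_distance, total_stops) for every route from depart to dest."""
    for leg_from, leg_to, leg_dist in flights:
        if leg_from != depart:
            continue
        if leg_to == dest:
            # First clause: a direct leg ends the route without adding a stop.
            yield [depart, dest], dist_so_far + leg_dist, stops_so_far
        # Second clause: continue from the intermediate airport with one more stop.
        for route, total, stops in travel(flights, leg_to, dest,
                                          dist_so_far + leg_dist, stops_so_far + 1):
            yield [depart] + route, total, stops


# Hypothetical flight table: (DepartureAirport, Destination, Distance in km).
flights = [
    ("Edinburgh", "London", 535),
    ("London", "Beijing", 8161),
    ("London", "Frankfurt", 660),
    ("Frankfurt", "Beijing", 7811),
]

routes = sorted(travel(flights, "Edinburgh", "Beijing"), key=lambda r: r[1])
for route, total, stops in routes:
    print(route, total, stops)
```

As in the Fril query, the distance and stop counters start at 0 and are accumulated on the way down the recursion.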
Object 1 Output Table
Conventional programming languages such as C++, Java or Fortran generally need more code for this task than the two-clause Fril definition on the previous slide, and SQL cannot implement this example with two simple lines either. This conciseness is one of the big advantages of Fril compared with other programming languages.
Product Recommendation System Example
[Figure: baskets plotted over product attributes. Axis labels: Energy Production Vitamin Fat Salt Sugar Fibre Cost … …]
Find overlapping clusters and name each cluster, e.g. healthy eater, junk eater, etc. A point will have a membership in each cluster.
Each cluster can be represented by a fuzzy prototype.
The Hybrid Fuzzy Expert System
[Figure: hybrid fuzzy expert system. Elements shown: new basket; filter (select useful pattern, delete incompatible pattern); classify new customer; fuzzy prototypes Healthy Eater, Greedy Buyer, Vegetarian, … … with example memberships 0.5, 0.2, 0.8; association rules associated with each fuzzy cluster; semantic distance; list of recommendations with rating.]
The products already in the basket will not appear in the recommendation list.
Fuzzy Prototypes
The Fuzzy Prototype Model
A fuzzy prototype is a fuzzy model of a cluster providing a description of the relevant properties of the data in the cluster. The fuzzy model can be a fuzzy decision tree, simple fuzzy logic rules, etc. Therefore, the fuzzy prototype model is the collection of fuzzy prototypes formed from the data set.
Semantic Coordinate and Semantic Distance
If the semantic distance SD between the semantic coordinates of two objects is smaller than a threshold (SD < ε), we consider the two objects to be semantically identical (SD = 0).
Linking Objects For The Fuzzy Prototype Model
Linking Objects For Semantic Distance
Papers Related to the Machine Learning Toolkit
J. F. Baldwin, Dong (Walter) Xie, “Simple Fuzzy Logic Rules based on Fuzzy Decision Tree for Classification and Prediction Problem”, Intelligent Information Processing II, Published by Springer, ISBN 0-387-23151-X (HC), Page 175, October, 2004.
J. F. Baldwin, Dong (Walter) Xie, “Fuzzy Association Rules discovered on Effective Reduced Database Algorithm”, Fuzz-IEEE 2005, Reno.
D. Xie and J. F. Baldwin, “Fuzzy prototype model and semantic distance”, Intelligent Systems, Special Issue, 2006.
D. Xie and J. F. Baldwin, “Intelligent fril/sql interrogator”, International Journal of Intelligent Systems, in review.
Contact Detail
Dong Xie (Walter) 谢冬
http://eis.bris.ac.uk/~enxdx or Google search “Dong Xie”
D.Xie@bristol.ac.uk (English)
Xiedong75@hotmail.com (Chinese)
Professor Jim F. Baldwin
http://www.enm.bris.ac.uk/ai/enjfb.html