Honeydew Progress
Honeydew Team (3.30)
What we have done last week
• Implement Time Prediction
  – TimePropertyClassifier
  – TimePropertyGenerator
• Debug
• Some running results
• Slightly modify test cases
  – dummyclassifier classifier
  – add some boundary tests
Modifications
1. generateTimeProperty is realized as the HoneydewTime constructor in HoneydewTime.java
  – where the input argument is formatedTime
2. Add HoneydewTimeTest.java to test the functions in HoneydewTime
3. Delete getBestHour, getBestMinute and getBestSlot from ITimePropertyGenerator.java
  – These three functions are useful only in test cases.
  – At this point, ITimePropertyGenerator is better suited to being an interface than an abstract class
4. Change the getBestTime() return type to a String formatted as "hh:mm AM"
5. Add getBestHour, getBestMinute and getBestSlot to TimePropertyGenerator, with String as the return type, for use in tests
6. Implement the functions in TimePropertyGenerator.java
7. HOUR is renamed to HOUR_VALUE, making it distinct from HOUR in PROPERTY_NAMES
Running results
Next week plan
• Experiments
• Analysis on results
• Refactoring
Tickets
Experiment Result of TEA
TEA Evaluation Wrapper
• TEA is stateful
  – Extract the sent time as the reference time
  – Use the current time if no sent time is found
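The reference-time fallback can be sketched as follows. This is only an illustration, not the wrapper's actual code: it assumes the sent time lives in a standard `Date` header, and `reference_time` is a hypothetical helper name.

```python
from datetime import datetime, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def reference_time(raw_email, now=None):
    """Return the sent time from the Date header, else the current time.

    Assumption: the "sent time" is the RFC 5322 Date header of the email.
    """
    msg = message_from_string(raw_email)
    date_header = msg.get("Date")
    if date_header:
        try:
            return parsedate_to_datetime(date_header)
        except (TypeError, ValueError):
            pass  # Unparseable header: fall through to the current time.
    # No usable sent time was found, so use the current time instead.
    return now if now is not None else datetime.now(timezone.utc)
```

The optional `now` argument exists only to make the fallback reproducible in tests.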
Evaluation Heuristics
• Heuristic I: pick the earliest time
• Heuristic II: pick a random time from the recognized time expressions
• Heuristic III: perfect oracle (upper bound). Check each recognized time expression against the ground truth
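A minimal sketch of the three heuristics, assuming time expressions are strings of the form `YYYYMMDDTHHMMSS` in which `?` marks an unknown digit. The wildcard matching rule and the choice to sort `?` as `0` when picking the earliest time are assumptions made for illustration; all function names are hypothetical.

```python
import random


def matches(prediction, label):
    """Wildcard match: '?' on either side matches any character (assumed rule)."""
    return len(prediction) == len(label) and all(
        p == l or p == "?" or l == "?" for p, l in zip(prediction, label)
    )


def pick_earliest(candidates):
    """Heuristic I: earliest time; '?' is sorted as '0' (an assumption)."""
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.replace("?", "0"))


def pick_random(candidates, rng=random):
    """Heuristic II: a uniformly random recognized time expression."""
    return rng.choice(candidates) if candidates else None


def oracle_correct(candidates, label):
    """Heuristic III: correct if any recognized expression matches the label."""
    return any(matches(c, label) for c in candidates)
```

Heuristics I and II return a single prediction that is then compared with the label through `matches`, while Heuristic III is scored directly against the ground truth, which is why it serves as an upper bound.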
Email Corpus
• 172 emails from last year’s experiment
• 320 more emails
• Manually went through all the emails, and labeled 215 of them
Experiment Setup
• Skip emails that cause a segmentation fault in the TEA package
  – TEA ran successfully on 165 emails (no segmentation fault)
• Report accuracy (#emails correctly labeled by TEA/#total emails)
Experimental Result
                                   Heuristic I   Heuristic II   Heuristic III
#emails correctly labeled by TEA   19            30.33          80
Accuracy (%)                       11.5152       18.3818        48.4848
The Heuristic II result is a 100-run average
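The reported accuracies are consistent with using the 165 emails that TEA processed without a segmentation fault as the denominator. A short check of the arithmetic (variable names are illustrative):

```python
TOTAL_EMAILS = 165  # emails TEA processed without a segmentation fault

# Number of correctly labeled emails per heuristic, as reported on the slide.
correct = {"Heuristic I": 19, "Heuristic II": 30.33, "Heuristic III": 80}

# Accuracy (%) = 100 * #emails correctly labeled by TEA / #total emails
accuracy = {name: round(100 * n / TOTAL_EMAILS, 4) for name, n in correct.items()}

for name, value in accuracy.items():
    print(f"{name}: {value:.4f}%")
```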
Correct Predictions (Heuristic I)
• prediction = 20070803T1100??, label = 20070803T110000
• prediction = 20070803T??????, label = 20070803T100000
• prediction = 20050418T??????, label = 20050418T154500
• prediction = 20050404T1100??, label = 20050404T110000
• prediction = 20050418T??????, label = 20050418T120000
• prediction = 2007????T??????, label = 20070814T090000
• prediction = 2007????T??????, label = 20070814T090000
• prediction = 20071119T??????, label = 20071119T113000
• prediction = 20050427T0930??, label = 20050427T093000
• prediction = 20080319T??????, label = 20080319T160000
• prediction = 20050524T??????, label = 20050524T140000
• prediction = 20050511T??????, label = 20050511T103000
• prediction = 20050428T??????, label = 20050428T151500
• prediction = 20080229T??????, label = 20080229T1330??
• prediction = 20050512T??????, label = 20050512T0900??
• prediction = 2007????T??????, label = 20070814T090000
• prediction = 20070905T1100??, label = 20070905T110000
• prediction = 20080306T1000??, label = 20080306T1000??
• prediction = 20070830T1100??, label = 20070830T1100??
Error Analysis
• 8 types of errors:
  – No time expression extracted
  – Wrong year
  – Wrong month
  – Wrong day
  – Wrong hour
  – Wrong minute
  – Wrong second
  – Misc
Heuristic I
Type   1 (none)   2 (year)   3 (month)   4 (day)   5 (hour)   6 (minute)   7 (second)   8 (misc)
#      33         19         31          97        42         36           1            0
More Details (heuristic I)
• Only 1 error of type 7 (second):
  – 20050517t155632 20050521t120000
• Zero errors of type 8 (misc):
  – The other 7 types cover all cases
• One instance may contribute errors to more than one type
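One way to assign error types is to compare a prediction and a label field by field. This is a sketch under assumptions, not the analysis script used for the slides: `?` digits are treated as unknown rather than wrong, and "misc" is taken to mean a string that does not have the expected 15-character layout.

```python
FIELDS = [  # (error type, field name, slice into a YYYYMMDDTHHMMSS string)
    (2, "year", slice(0, 4)),
    (3, "month", slice(4, 6)),
    (4, "day", slice(6, 8)),
    (5, "hour", slice(9, 11)),
    (6, "minute", slice(11, 13)),
    (7, "second", slice(13, 15)),
]


def error_types(prediction, label):
    """Return the list of error type numbers for one prediction/label pair."""
    if prediction is None:
        return [1]  # no time expression extracted
    if len(prediction) != 15 or len(label) != 15:
        return [8]  # assumed meaning of "misc": unexpected layout
    errors = []
    for number, _name, span in FIELDS:
        pairs = zip(prediction[span], label[span])
        # A field is wrong only if two known digits disagree; '?' is unknown.
        if any(a != b and a != "?" and b != "?" for a, b in pairs):
            errors.append(number)
    return errors
```

Applied to the pair from the slide, `error_types("20050517t155632", "20050521t120000")` yields types 4, 5, 6 and 7, which illustrates how a single instance can contribute to several error types.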
Error Analysis for Heuristic III
• For each meeting email
  – If one of the extracted time expressions matches the time label, count it as correct
  – If no match is found, analyze each extracted time expression to obtain error statistics
Heuristic III
Type   1 (none)   2 (year)   3 (month)   4 (day)   5 (hour)   6 (minute)   7 (second)   8 (misc)
#      33         137        147         267       276        235          95           0
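The Heuristic III procedure can be sketched as the loop below. It assumes `YYYYMMDDTHHMMSS` strings with `?` for unknown digits, and all names are hypothetical. Because every extracted expression of an unmatched email is analyzed, the error counts can exceed the number of emails, which is one plausible reading of the large totals in the table above.

```python
from collections import Counter

SPANS = {"year": (0, 4), "month": (4, 6), "day": (6, 8),
         "hour": (9, 11), "minute": (11, 13), "second": (13, 15)}


def wrong_fields(candidate, label):
    """Names of the fields whose known (non-'?') digits disagree."""
    bad = []
    for name, (start, end) in SPANS.items():
        pairs = zip(candidate[start:end], label[start:end])
        if any(a != b and "?" not in (a, b) for a, b in pairs):
            bad.append(name)
    return bad


def analyze(emails):
    """emails: iterable of (candidates, label). Returns (#correct, error counts)."""
    correct, errors = 0, Counter()
    for candidates, label in emails:
        if not candidates:
            errors["none"] += 1  # type 1: no time expression extracted
            continue
        per_candidate = [wrong_fields(c, label) for c in candidates]
        if any(not bad for bad in per_candidate):
            correct += 1  # at least one expression matches the label
            continue
        for bad in per_candidate:  # no match: every expression adds errors
            errors.update(bad)
    return correct, errors
```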
Case Closed for TEA
• Questions?
Thank You