Telstra Kaggle Competition
Stanislav Semenov, 2016
Task
Evaluation metric
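The slide names the metric without spelling it out. As a minimal sketch, assuming the metric is the multi-class logarithmic loss used to score Kaggle's Telstra Network Disruptions competition, it could be computed as below. The function name `multiclass_log_loss` and the toy probabilities are illustrative, not taken from the talk.

```python
import numpy as np

def multiclass_log_loss(y_true, proba, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    proba = np.clip(np.asarray(proba, dtype=float), eps, 1 - eps)
    proba = proba / proba.sum(axis=1, keepdims=True)  # renormalise rows after clipping
    rows = np.arange(len(y_true))
    return float(-np.mean(np.log(proba[rows, np.asarray(y_true)])))

# Toy example (hypothetical values): three samples, three classes.
y_true = [0, 2, 1]
proba = [[0.8, 0.1, 0.1],
         [0.2, 0.2, 0.6],
         [0.3, 0.4, 0.3]]
loss = multiclass_log_loss(y_true, proba)
print(round(loss, 4))
```

Lower is better; a model that always predicts uniform probabilities over three classes scores ln(3).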
Data
Observations
“Magic Feature”
Cross-validation
5-Fold Stratified CV
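A 5-fold stratified split keeps the class proportions roughly equal in every fold. The sketch below uses scikit-learn's `StratifiedKFold` on a made-up, imbalanced three-class target; the data and the random seed are assumptions for illustration only.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy target with three imbalanced classes (hypothetical data).
y = np.array([0] * 60 + [1] * 30 + [2] * 10)
X = np.arange(len(y)).reshape(-1, 1)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
folds = list(skf.split(X, y))

for fold, (train_idx, valid_idx) in enumerate(folds):
    # Each validation fold keeps the 60/30/10 class proportions.
    print(fold, np.bincount(y[valid_idx], minlength=3))
```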
Features
Severity:
1. Severity_type
Resource:
1. Resource min
2. Resource max
3. Resource count
Features
Event:
1. Event min
2. Event max
3. Event count
Log:
1. Log min
2. Log max
3. Log count
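The min, max and count features listed above for Resource, Event and Log can be sketched as a per-id aggregation. This assumes a long-format table with one row per (id, code) pair and integer-coded categories; the column names `id` and `event` and the values are hypothetical.

```python
import pandas as pd

# Hypothetical long-format table: one row per (id, event code).
events = pd.DataFrame({
    "id":    [1, 1, 1, 2, 3, 3],
    "event": [11, 15, 34, 20, 11, 54],
})

# Smallest code, largest code and number of codes per id.
event_stats = (
    events.groupby("id")["event"]
    .agg(["min", "max", "count"])
    .add_prefix("event_")
    .reset_index()
)
print(event_stats)
```

The same pattern would apply to the resource and log tables by swapping the column name and prefix.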
Features
Log:
1. Volume min
2. Volume max
3. Volume min / volume sum
4. Volume max / volume sum
5. Volume mean
6. Volume median
7. Volume sum
8. Volume std
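The eight volume statistics above can be sketched with a single `groupby` plus two ratio columns. The table layout (one row per log entry with `id` and `volume`) and the values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical log table: several volume readings per id.
log = pd.DataFrame({
    "id":     [1, 1, 1, 2, 3, 3],
    "volume": [2, 4, 10, 7, 1, 3],
})

vol = log.groupby("id")["volume"].agg(["min", "max", "mean", "median", "sum", "std"])
vol["min_share"] = vol["min"] / vol["sum"]   # volume min / volume sum
vol["max_share"] = vol["max"] / vol["sum"]   # volume max / volume sum
vol = vol.add_prefix("volume_").reset_index()
# Note: pandas uses the sample std (ddof=1), so ids with one row get NaN.
print(vol)
```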
Features
Order
1. Absolute order
2. Location
3. Order in location from the beginning
4. Order in location from the end
5. Size of location
6. Order in location from the beginning / Size of location
7. Order in location from the end / Size of location
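A minimal sketch of the order features, assuming "order" means the row position in the original data file and that positions within a location are counted in that same file order. The slides do not define this precisely, so treat the interpretation, the column names and the toy rows as assumptions.

```python
import pandas as pd

# Hypothetical rows, kept in their original file order.
df = pd.DataFrame({
    "id":       [101, 57, 33, 250, 12, 78],
    "location": [5, 5, 5, 9, 9, 5],
})

df["abs_order"] = range(len(df))                                   # absolute order
df["order_from_start"] = df.groupby("location").cumcount() + 1     # 1-based position
df["order_from_end"] = df.groupby("location").cumcount(ascending=False) + 1
df["location_size"] = df.groupby("location")["id"].transform("size")
df["rel_from_start"] = df["order_from_start"] / df["location_size"]
df["rel_from_end"] = df["order_from_end"] / df["location_size"]
print(df)
```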
Features
Location
1. Location id
One-Hot-Encode
1. Severity
2. Resource
3. Event
4. Log
5. Location
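One-hot encoding of a multi-valued field such as Resource can be sketched as indicator columns collapsed to one row per id. The toy table and its column names are hypothetical; single-valued fields such as Severity or Location would not need the `groupby` step.

```python
import pandas as pd

# Hypothetical long-format table: an id may have several resource codes.
resources = pd.DataFrame({
    "id":       [1, 1, 2, 3],
    "resource": [2, 8, 2, 6],
})

onehot = (
    pd.get_dummies(resources["resource"], prefix="resource")
    .astype(int)
    .groupby(resources["id"])
    .max()            # 1 if the id has that resource at least once
)
print(onehot)
```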
Features
TF-IDF
1. Resource
2. Event
3. Log
One-Hot-Encode sets for one id
1. Resource
2. Event
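The two encodings above can be sketched as follows. For TF-IDF, each id is treated as a small "document" whose tokens are its event codes. For the set encoding, this sketch assumes that "one-hot-encode sets for one id" means treating the full set of codes belonging to an id as a single category; the slides do not confirm that reading. Data and names are illustrative.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical long-format table of event codes per id.
events = pd.DataFrame({
    "id":    [1, 1, 2, 3, 3],
    "event": [11, 15, 11, 34, 35],
})

# One "document" per id: its event codes joined into a token string.
docs = (
    events.assign(token="e" + events["event"].astype(str))
    .groupby("id")["token"]
    .agg(" ".join)
)
tfidf = TfidfVectorizer(token_pattern=r"\S+")
X_tfidf = tfidf.fit_transform(docs)   # sparse matrix, one row per id

# Assumed reading: the whole set of events for an id becomes one category.
event_set = events.groupby("id")["event"].agg(
    lambda s: "_".join(map(str, sorted(set(s))))
)
X_sets = pd.get_dummies(event_set, prefix="event_set").astype(int)
print(X_tfidf.shape, list(X_sets.columns))
```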
Features
2-way interactions
1. Event-event
2. Resource-resource
3. Event-resource
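A 2-way interaction such as event-event can be sketched as an indicator for every pair of codes that occur together on the same id. This is one plausible construction, not necessarily the one used in the talk; the helper `pair_tokens` and the toy data are hypothetical.

```python
from itertools import combinations
import pandas as pd

# Hypothetical long-format table of event codes per id.
events = pd.DataFrame({
    "id":    [1, 1, 1, 2, 2, 3],
    "event": [11, 15, 34, 11, 15, 20],
})

def pair_tokens(codes):
    """All unordered pairs of distinct codes seen for one id."""
    return [f"event_{a}_x_{b}" for a, b in combinations(sorted(set(codes)), 2)]

pairs = events.groupby("id")["event"].apply(pair_tokens)
exploded = pairs.explode().dropna()   # ids with a single code have no pairs

# Multi-hot matrix: one column per pair, 1 if the id has both events.
interactions = (
    pd.get_dummies(exploded).astype(int)
    .groupby(level=0).max()
    .reindex(pairs.index, fill_value=0)
)
print(interactions)
```

Event-resource interactions would follow the same idea, pairing codes across the two tables instead of within one.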
Group order
1. Severity
2. Event
3. Resource
4. Log
Features
Cumulative sum from the beginning
1. Severity
2. Resource
3. Event
Cumulative sum from the end
1. Severity
2. Resource
3. Event
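The cumulative sums above can be sketched as running counts of a category indicator, accumulated from the first row and from the last row. The sketch assumes the sums are taken within each location in file order, which the slides do not state explicitly; column names and data are illustrative.

```python
import pandas as pd

# Hypothetical rows in original file order.
df = pd.DataFrame({
    "location": [5, 5, 5, 9, 9],
    "severity": [1, 2, 1, 1, 1],
})

# Indicator for one category (severity == 1 here); repeat per category as needed.
df["sev_1"] = (df["severity"] == 1).astype(int)

# Running count from the beginning of each location.
df["sev_1_cum_start"] = df.groupby("location")["sev_1"].cumsum()
# Reverse the rows, accumulate, and let index alignment restore the original order.
df["sev_1_cum_end"] = df.iloc[::-1].groupby("location")["sev_1"].cumsum()
print(df)
```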
Solution scheme
Data
5-fold
5 XGBs, 2 ETs
Stacking
Blending
Result
Stacking-Blending
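The scheme above (5-fold out-of-fold predictions from base models, then stacking and blending) can be sketched with scikit-learn alone. Several things here are assumptions: `GradientBoostingClassifier` stands in for XGBoost to keep the example dependency-free, only one model of each kind is used instead of 5 XGBs and 2 ETs, the meta-model is a logistic regression, and "blending" is taken to mean a weighted average of probability outputs with arbitrary 0.5/0.5 weights. The data is synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic three-class problem (hypothetical data).
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

base_models = [
    GradientBoostingClassifier(n_estimators=30, random_state=0),  # stand-in for XGBoost
    ExtraTreesClassifier(n_estimators=50, random_state=0),
]

# Level 1: out-of-fold class probabilities from each base model.
oof = [cross_val_predict(m, X, y, cv=cv, method="predict_proba") for m in base_models]
meta_X = np.hstack(oof)

# Level 2 (stacking): a meta-model trained on the out-of-fold predictions.
stacked = cross_val_predict(LogisticRegression(max_iter=1000), meta_X, y,
                            cv=cv, method="predict_proba")

# Blending: weighted average of the stacked output and the plain base-model mean.
blend = 0.5 * stacked + 0.5 * np.mean(oof, axis=0)
print(meta_X.shape, blend.shape)
```

Using out-of-fold predictions as meta-features keeps the second level from seeing predictions made on rows the base models were trained on.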
Results
Questions
Thank you for your attention!