Why Data Stream Management Systems Is a New Frontier for Social Media Marketing

SMM: A Data Stream Management System for Knowledge Discovery

Hetal Thakkar, Nikolay Laptev, Hamid Mousavi, Barzan Mozafari, Vincenzo Russo, Carlo Zaniolo

Computer Science Department UCLA

Data Stream Management Systems (DSMS)

• DSMS are critical in a variety of applications:
   o Click-stream analysis
   o Algorithmic trading
   o Network monitoring
   o Credit card fraud detection …

• Many DSMS projects and prototypes:
   o STREAM (Stanford), Aurora/Borealis (Brown, MIT), Telegraph (UCB), Gigascope (AT&T), Stream Mill (UCLA), … and so on.

• Commercial startups and vendor extensions:
   o StreamBase, Aleri, Coral8, Apama, Truviso
   o DBMS vendors …

• Support for online mining on data streams: unresolved issue for current systems.

Two Main Research Challenges

• Challenge I: Fast and light algorithms are needed for online mining.

• Challenge II: Online mining and business intelligence applications require the Quality of Service (QoS) that a DSMS provides. Thus these algorithms must be deployed as part of a DSMS.

• Much research addresses the first challenge (a stream of papers in data mining conferences), but little addresses the second, which is probably even harder.
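To make "fast and light" concrete, here is a minimal sketch of a one-pass, bounded-memory mining algorithm. It uses the Misra-Gries frequent-items summary, chosen here purely as an illustration; the slides do not name it, and the function name `misra_gries` and parameter `k` are assumptions of this sketch.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Return candidate frequent items using at most k - 1 counters.

    Illustrative only: any item occurring more than n / k times in a
    stream of n tuples is guaranteed to be among the returned keys.
    The stored counts are lower bounds, not exact frequencies.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:  # single pass: each tuple is seen exactly once
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all and drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


clicks = ["a", "b", "a", "c", "a", "d", "a"]
candidates = misra_gries(clicks, k=3)
```

Memory stays bounded by `k - 1` counters regardless of stream length, which is the kind of property an online mining algorithm needs before it can run inside a DSMS.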

Data Stream Mining & DSMS QoS

• DSMS: Support continuous queries over massive data streams
   – with QoS (Quality of Service) guarantees and
   – (Quasi) real-time response through:
      o Scheduling, query optimization
      o Windows and other synopses
      o Load shedding …

• But
   - Current DSMS focus on simple continuous queries
   - Using query languages based on SQL
   - Lackluster history of SQL with KDD
   - DSMS bring more problems: e.g. blocking queries not allowed.
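The restriction on blocking queries can be illustrated with a small sketch. A plain average over a stream is blocking, since it cannot answer until the stream ends; a sliding window turns it into a continuous query that emits a result on every arrival. The function `windowed_average` and its count-based window are assumptions of this sketch, not the syntax or semantics of any particular DSMS.

```python
from collections import deque
from typing import Iterable, Iterator


def windowed_average(stream: Iterable[float], window_size: int) -> Iterator[float]:
    """Yield the average of the most recent `window_size` tuples after each arrival.

    Non-blocking: a result is produced per input tuple, and only the
    window contents plus a running sum are kept in memory.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    window = deque()
    running_sum = 0.0
    for value in stream:
        window.append(value)
        running_sum += value
        if len(window) > window_size:
            # Expire the oldest tuple once the window is full.
            running_sum -= window.popleft()
        yield running_sum / len(window)


results = list(windowed_average([2.0, 4.0, 6.0, 8.0], window_size=2))
# results == [2.0, 3.0, 5.0, 7.0]
```

Because the function is a generator, it also works on an unbounded source: each call to `next()` consumes one tuple and returns the current window average.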