Big Data Analytics

Special Topics for Computer Science CSE 4095-001 CSE 5095-005

Fei Wang Associate Professor

Department of Computer Science and Engineering [email protected]

Feb 3

Feature Learning I

Features vs. Labels

Features: describe your data objects (color, texture, taste, …)

Labels: discriminate between your data objects (apples, pears)

Unsupervised Setting

Fruits

Cluster 1

Cluster 2

Supervised Setting

Apples

Pears

Apple!

Training

Testing

Semi-Supervised Setting

Transduction

Induction

Projection

Principal Component Analysis

Find a direction where the data have the largest variance
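One common way to write this objective, assuming centered data points x_i, a sample covariance matrix C, and a unit-length direction w (the symbols are introduced here for illustration and do not come from the slides):

```latex
w^{*} = \arg\max_{\|w\|_2 = 1} \; w^{\top} C \, w,
\qquad
C = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\top}
```

Under this formulation the maximizer is the eigenvector of C with the largest eigenvalue, and that eigenvalue equals the variance of the data along w*.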

Principal Component Analysis

1. Compute the data covariance matrix
2. Do eigenvalue decomposition on the covariance matrix
3. Sort the eigenvalues from large to small

[Figure: data cloud with principal directions PC 1 and PC 2]
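The three steps can be sketched with NumPy as follows. This is a minimal illustration, not the course's implementation: the function name `pca`, the toy data, and the choice to center the data first are assumptions made here.

```python
import numpy as np

def pca(X, k):
    """Project the rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                  # center the data (assumed preprocessing)
    cov = np.cov(Xc, rowvar=False)           # step 1: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # step 2: eigendecomposition (symmetric matrix)
    order = np.argsort(eigvals)[::-1]        # step 3: sort eigenvalues from large to small
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return Xc @ eigvecs[:, :k], eigvals, eigvecs

# Toy 2-D data, stretched so that one direction clearly dominates.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])
Z, eigvals, eigvecs = pca(X, k=1)
```

The first column of `eigvecs` plays the role of PC 1, and the variance of `Z` matches the largest eigenvalue.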

Whitening Transform
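A hedged sketch of one standard whitening variant (PCA whitening): rotate the centered data onto the eigenvectors of the covariance matrix, then rescale each axis by the inverse square root of its eigenvalue. The function name `whiten` and the small `eps` stabilizer are assumptions for illustration; the slide itself does not specify which variant was presented.

```python
import numpy as np

def whiten(X, eps=1e-12):
    """Return a version of X whose sample covariance is (approximately) the identity."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    # Rotate onto the principal axes, then divide each axis by its standard deviation.
    return (Xc @ eigvecs) / np.sqrt(eigvals + eps)

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2)) @ np.array([[2.0, 0.0], [1.5, 0.3]])
W = whiten(X)
```

After the transform, the features are uncorrelated and each has unit variance.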

Two-Dimensional Principal Component Analysis

With Label Information

Linear Discriminant Analysis

Within-class compactness

Between-class scatter
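For two classes, the trade-off between within-class compactness and between-class scatter can be sketched with Fisher's classical solution, w ∝ S_w⁻¹(m₁ − m₀). The function name `fisher_direction` and the toy data are assumptions made here, and this covers only the two-class case.

```python
import numpy as np

def fisher_direction(X0, X1):
    """Two-class Fisher discriminant direction, normalized to unit length."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: summed scatter of each class around its own mean.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # Direction separating the class means relative to the within-class scatter.
    w = np.linalg.solve(Sw, m1 - m0)
    return w / np.linalg.norm(w)

rng = np.random.default_rng(2)
X0 = rng.normal(loc=[0.0, 0.0], scale=[1.0, 3.0], size=(100, 2))
X1 = rng.normal(loc=[2.0, 0.0], scale=[1.0, 3.0], size=(100, 2))
w = fisher_direction(X0, X1)
```

Projecting onto `w` places the class-1 mean above the class-0 mean while keeping each class tight along the projected axis.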

Relevant Component Analysis

Relevant Component Analysis
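A minimal sketch of Relevant Component Analysis as it is usually described: points known to share a class are grouped into "chunklets", the covariance is averaged within chunklets, and the data are whitened with respect to that matrix. The function name `rca_transform` and the synthetic chunklets are assumptions for illustration, and the slides may have presented a different formulation.

```python
import numpy as np

def rca_transform(chunklets):
    """Return W = C^(-1/2), where C is the average within-chunklet covariance."""
    # Center every chunklet on its own mean, then pool the centered points.
    centered = np.vstack([c - c.mean(axis=0) for c in chunklets])
    C = centered.T @ centered / len(centered)
    eigvals, eigvecs = np.linalg.eigh(C)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T

rng = np.random.default_rng(3)
chunklets = [rng.normal(loc=mu, scale=[0.5, 2.0], size=(10, 2))
             for mu in ([0, 0], [3, 0], [6, 0])]
W = rca_transform(chunklets)
centered = np.vstack([c - c.mean(axis=0) for c in chunklets])
C = centered.T @ centered / len(centered)
```

Applying `X @ W` shrinks directions along which points of the same chunklet vary a lot, which are assumed to be irrelevant to the class distinction.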

Transactions vs. Sequences

Subsequence vs. Supersequence
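A small sketch of the containment test, assuming the usual definition from sequential pattern mining: a sequence is an ordered list of itemsets, and a is a subsequence of b (b is a supersequence of a) if each itemset of a is contained in a distinct itemset of b, in the same order. The function name `is_subsequence` is hypothetical.

```python
def is_subsequence(a, b):
    """True if every itemset of a is a subset of some itemset of b, preserving order."""
    remaining = iter(b)
    for x in a:
        # Consume b from where the previous match ended until an itemset contains x.
        if not any(set(x) <= set(y) for y in remaining):
            return False
    return True

short = [{"a"}, {"b", "c"}]
long_ = [{"a", "d"}, {"e"}, {"b", "c", "f"}]
```

Here `short` is a subsequence of `long_`, and `long_` is a supersequence of `short`; swapping the order of the itemsets in `short` breaks the relation.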

Apriori Property
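The Apriori property says that support is anti-monotone: every subset of a frequent itemset is at least as frequent as the itemset itself. A minimal check on toy transactions follows; the data and the helper names `support` and `subsets_at_least_as_frequent` are made up for illustration.

```python
from itertools import combinations

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]

def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions)

def subsets_at_least_as_frequent(itemset, transactions):
    """Check the Apriori property for all proper, non-empty subsets of an itemset."""
    s = support(itemset, transactions)
    return all(support(sub, transactions) >= s
               for r in range(1, len(itemset))
               for sub in combinations(sorted(itemset), r))
```

The practical consequence is pruning: if any subset of a candidate itemset is infrequent, the candidate cannot be frequent and need not be counted.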

Frequent Itemset Mining

Frequent Itemset Mining
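A compact sketch of level-wise frequent itemset mining in the Apriori style: grow candidates one item at a time and prune any candidate that has an infrequent subset. The function name `apriori`, the absolute `min_support` threshold, and the toy data are assumptions; the slides do not show which variant was used.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidate):
        return sum(candidate <= t for t in transactions)

    items = {i for t in transactions for i in t}
    level = {c for c in (frozenset([i]) for i in items) if count(c) >= min_support}
    frequent, k = {}, 1
    while level:
        for c in level:
            frequent[c] = count(c)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep candidates whose (k-1)-subsets are all frequent,
        # then verify the support against the data.
        level = {c for c in candidates
                 if all(frozenset(s) in frequent for s in combinations(c, k - 1))
                 and count(c) >= min_support}
    return frequent

baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent = apriori(baskets, min_support=3)
```

With a threshold of 3, every single item and every pair is frequent in this toy data, while the full triple appears in only two baskets and is dropped.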

Sequential Pattern Extraction

Sequential Pattern Extraction

Sequential Pattern Extraction
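One way to extract frequent sequential patterns is prefix projection in the style of PrefixSpan. The sketch below assumes the simplest setting, where each sequence is a string of single items and gaps between matched items are allowed; the function name `prefixspan` and the data are illustrative, and the slides may have used a different algorithm.

```python
def prefixspan(sequences, min_support):
    """Return {pattern tuple: support} for frequent subsequences of single items."""
    results = {}

    def mine(prefix, projected):
        # Count, per item, how many projected sequences contain it at least once.
        counts = {}
        for seq in projected:
            for item in set(seq):
                counts[item] = counts.get(item, 0) + 1
        for item, c in sorted(counts.items()):
            if c < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = c
            # Project: keep only what follows the first occurrence of the item.
            suffixes = [seq[seq.index(item) + 1:] for seq in projected if item in seq]
            mine(pattern, suffixes)

    mine((), [list(s) for s in sequences])
    return results

patterns = prefixspan(["abcd", "acd", "abd", "bcd"], min_support=3)
```

Each recursive call extends the current prefix by one frequent item, so the search never counts a pattern whose prefix is already infrequent.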

Bag-of-Pattern Representation
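A minimal sketch of turning sequences into fixed-length feature vectors, assuming a binary encoding where entry j is 1 if pattern j occurs in the sequence as a subsequence. Count-based encodings are equally plausible; the slides do not say which was used, and the names `contains` and `bag_of_patterns` are hypothetical.

```python
def contains(seq, pattern):
    """True if pattern occurs in seq as a (possibly gapped) subsequence."""
    remaining = iter(seq)
    # `item in remaining` advances the iterator past the first match.
    return all(item in remaining for item in pattern)

def bag_of_patterns(sequences, patterns):
    """One row per sequence, one 0/1 column per pattern."""
    return [[int(contains(s, p)) for p in patterns] for s in sequences]

vocab = [("a", "d"), ("b",), ("d", "a")]
features = bag_of_patterns(["abcd", "acd"], vocab)
```

The resulting rows can be fed to any method that expects vector inputs, such as the projection techniques earlier in the deck.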