Hive RegexSerDe Weblogs

  • Reference: http://stackoverflow.com/questions/9102184/regex-for-access-log-in-hive-serde

    CREATE EXTERNAL TABLE access_log (
      `ip`         STRING,
      `time_local` STRING,
      `method`     STRING,
      `uri`        STRING,
      `protocol`   STRING,
      `status`     STRING,
      `bytes_sent` STRING,
      `referer`    STRING,
      `useragent`  STRING
    )
    ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
    WITH SERDEPROPERTIES (
      -- One capture group per column, in column order.
      'input.regex'='^(\\S+) \\S+ \\S+ \\[([^\\[]+)\\] "(\\w+) (\\S+) (\\S+)" (\\d+) (\\d+) "([^"]+)" "([^"]+)".*'
    )
    STORED AS TEXTFILE
    LOCATION '/tmp/access_logs/';
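
Before loading data, it can help to check the regex against a log line outside Hive. The following is a minimal sketch in Python, not part of the original answer. It assumes the pattern behaves the same in Python's `re` module as in the Java regex engine Hive uses, which should hold for the simple constructs used here. The helper `parse_line`, the `COLUMNS` list, and the sample log line are made up for illustration. The doubled backslashes from the Hive string literal are written as single backslashes in the Python raw string.

```python
import re

# Pattern from the table definition above, with Hive's doubled backslashes
# reduced to single ones inside a Python raw string.
ACCESS_LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[([^\[]+)\] "(\w+) (\S+) (\S+)" (\d+) (\d+) "([^"]+)" "([^"]+)".*'
)

# Column names in the same order as the capture groups.
COLUMNS = [
    "ip", "time_local", "method", "uri", "protocol",
    "status", "bytes_sent", "referer", "useragent",
]


def parse_line(line):
    """Return a dict of column -> captured text, or None if the line does not match."""
    match = ACCESS_LOG_PATTERN.fullmatch(line)
    if match is None:
        return None
    return dict(zip(COLUMNS, match.groups()))


# Hypothetical combined-format log line, invented for this example.
SAMPLE_LINE = (
    '192.0.2.10 - - [10/Oct/2012:13:55:36 -0700] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"http://example.com/start.html" "Mozilla/5.0 (X11; Linux x86_64)"'
)

if __name__ == "__main__":
    for column, value in parse_line(SAMPLE_LINE).items():
        print(f"{column}: {value}")
```

Note that `(\d+)` for `bytes_sent` does not match a line where the server logs `-` instead of a byte count, so `parse_line` returns `None` for such lines. If those lines appear in your logs, the pattern would need to be loosened, for example with `(\S+)` in that position.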