Handling dates with Regex in Apache Pig


Assuming that the field time looks like 2013-01-01T00:00:00.000Z, that piggybank.jar has been imported, and that the command EXTRACT has been defined (DEFINE EXTRACT org.apache.pig.piggybank.evaluation.string.EXTRACT();), what's the best way to extract the fields year, month, day, hour, minute, and second? This is what I have done so far:

data = FOREACH data GENERATE
    FLATTEN(EXTRACT(time, '(\\d+)-(\\d+)-(\\d+)T(\\d+):(\\d+):(\\d+)\\.(\\S+)'))
    AS (
        year: int,
        month: int,
        day: int,
        hour: int,
        minute: int,
        second: int,
        tail: chararray
    );

Since Pig 0.11 you can use the datetime type.

A = LOAD 'data' AS (date:chararray);
B = FOREACH A GENERATE ToDate(date) AS date;
C = FOREACH B GENERATE GetMonth(date) AS month;

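The other accessors you can use (GetYear, GetDay, GetHour, GetMinute, GetSecond, and so on) are listed in the Pig documentation under datetime functions.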

If you're not working with 0.11, you can write a UDF or resort to the regex you posted.
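If you go the UDF route, here is a minimal sketch of what it could look like as a Jython UDF, not a definitive implementation. The function name split_timestamp, the schema alias parts, and the file name used below are hypothetical. It assumes Pig's Jython runtime makes the outputSchema decorator available, and it falls back to a no-op decorator so the module can also be run outside Pig.

```python
import re

# Assumption: Pig's Jython runtime injects `outputSchema` into the script.
# Outside Pig the name is undefined, so fall back to a no-op decorator.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            return func
        return wrap

# Matches timestamps shaped like 2013-01-01T00:00:00.000Z.
ISO_TS = re.compile(r'^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})\.(\S+)$')


@outputSchema("parts:tuple(year:int, month:int, day:int, hour:int, minute:int, second:int, tail:chararray)")
def split_timestamp(ts):
    """Return (year, month, day, hour, minute, second, tail), or None if ts does not match."""
    if ts is None:
        return None
    m = ISO_TS.match(ts)
    if m is None:
        return None
    fields = m.groups()
    # First six groups are numeric; the last one is the fractional part plus zone.
    return tuple(int(x) for x in fields[:6]) + (fields[6],)
```

Saved as, say, timestamp_udf.py, it would be registered with REGISTER 'timestamp_udf.py' USING jython AS tsudf; and called as FLATTEN(tsudf.split_timestamp(time)). Rows that do not match the pattern come back as null instead of failing the job.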

