java - How to read a large file from Amazon S3?
I have a program that reads a text file from Amazon S3. The file is around 400 MB. I have increased the heap size, but I'm still getting a Java heap space error, so I'm not sure whether my code is correct. I'm using the Amazon SDK for Java and Guava to handle the file stream.
Please help.
    S3Object object = s3Client.getObject(new GetObjectRequest(bucketName, folder + filename));
    final InputStream objectData = object.getObjectContent();

    InputSupplier<InputStreamReader> supplier = CharStreams.newReaderSupplier(
        new InputSupplier<InputStream>() {
          @Override
          public InputStream getInput() throws IOException {
            return objectData;
          }
        }, Charsets.UTF_8);

    String content = CharStreams.toString(supplier);
    objectData.close();
    return content;

I use these JVM options: -Xms512m -Xmx2g. I use Ant to run the main program, so I include the JVM options in ANT_OPTS as well. It's still not working.
The point of InputSupplier (though you should be using ByteSource and CharSource these days) is that you should never have access to the InputStream from the outside, so you don't have to remember whether or not to close it.
If you're using an old version of Guava from before ByteSource and CharSource were introduced, then this should be
    InputSupplier<InputStreamReader> supplier = CharStreams.newReaderSupplier(
        new InputSupplier<InputStream>() {
          @Override
          public InputStream getInput() throws IOException {
            // Open the S3 stream inside the supplier so Guava owns and closes it.
            S3Object object = s3Client.getObject(
                new GetObjectRequest(bucketName, folder + filename));
            return object.getObjectContent();
          }
        }, Charsets.UTF_8);
    String content = CharStreams.toString(supplier);

If you're using Guava 14, this can be done more fluently as
    new ByteSource() {
      @Override
      public InputStream openStream() throws IOException {
        S3Object object = s3Client.getObject(
            new GetObjectRequest(bucketName, folder + filename));
        return object.getObjectContent();
      }
    }.asCharSource(Charsets.UTF_8).read();

That said: your file might be 400 MB, but Java Strings are stored as UTF-16, which can easily double the memory consumption. You may either need a lot more memory, or you need to figure out a way to avoid keeping the whole file in memory at once.
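As a rough illustration of avoiding keeping the whole file in memory at once, the stream can be processed one line at a time instead of being read into a single String. The sketch below uses only the JDK; the class name StreamingLineCounter, the method countMatchingLines, and the "count lines containing a substring" task are hypothetical stand-ins for whatever per-line work the program really needs. It assumes the content is line-oriented UTF-8 text, and it uses a ByteArrayInputStream in place of the stream returned by object.getObjectContent().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class StreamingLineCounter {

    // Hypothetical helper: reads the stream line by line, so only the
    // current line is held in memory rather than the whole file.
    public static long countMatchingLines(InputStream in, String needle) throws IOException {
        long matches = 0;
        // try-with-resources closes the reader and the underlying stream.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(needle)) {
                    matches++;
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for object.getObjectContent(); any InputStream works here.
        InputStream in = new ByteArrayInputStream(
                "alpha\nbeta\nalphabet\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(countMatchingLines(in, "alpha"));
    }
}
```

With this shape, memory use depends on the longest line rather than on the total file size, so whether it applies depends on whether the processing can be done per line.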