Fastest way to get histogram of array sizes using MongoDB aggregation framework


I'm trying to get a list of the number of records that have arrays of varying size. I want the distribution of array sizes across all records so I can build a histogram like this:

          | *
          | *
documents | *         *
          | *  *      *
          |_*__*__*___*__*___
            2  5  6  23  47

                array size

So the raw documents look something like this:

{hubs : [{stuff:0, id:6}, {stuff:1}, .... ]}
{hubs : [{stuff:0, id:6}]}

So far, using the aggregation framework and some help from here, I've come up with this:

db.sitedata.aggregate([{ $unwind: '$hubs' },
                       { $group : { _id: '$_id', count: { $sum: 1 } } },
                       { $group : { _id: '$count', count: { $sum: 1 } } },
                       { $sort  : { _id: 1 } }])

This seems to give me the results I want, but it's not very fast. I'm wondering if there is something I can do that may not need two $group calls. The syntax is wrong here, but this is what I'm trying to do: put the count value in the first _id field:

db.sitedata.aggregate([{ $unwind: '$hubs' },
                       { $group : { _id: { $count: $hubs }, count: 1 } },
                       { $sort  : { _id: 1 } }])

Now that MongoDB 2.6 is out, the aggregation framework supports a new array operator, $size, which lets you $project the array size without having to unwind and re-group:

db.sitedata.aggregate([{ $project: { count: { $size: '$hubs' } } },
                       { $group  : { _id: '$count', count: { $sum: 1 } } },
                       { $sort   : { _id: 1 } }])
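
One caveat: $size raises an error when its argument is not an array, so a document with no hubs field would abort the whole aggregation. Here is a minimal sketch of one way to guard against that, assuming your server accepts an empty array literal inside $ifNull. The helper name buildSizeHistogramPipeline is hypothetical (it is not part of MongoDB); it only builds the pipeline array, and missing or null fields end up in the size 0 bucket.

```javascript
// Hypothetical helper (not a MongoDB API): builds the histogram pipeline
// for a given array field name.
function buildSizeHistogramPipeline(field) {
  var path = '$' + field;
  return [
    // $ifNull substitutes an empty array when the field is missing or null,
    // because $size errors on anything that is not an array.
    { $project: { count: { $size: { $ifNull: [path, []] } } } },
    // One bucket per distinct array size.
    { $group: { _id: '$count', count: { $sum: 1 } } },
    // Ascending array size, ready for plotting.
    { $sort: { _id: 1 } }
  ];
}

// Usage in the mongo shell (collection and field names from the question):
// db.sitedata.aggregate(buildSizeHistogramPipeline('hubs'))
```

This does not help if hubs holds a non-array value such as a string; those documents would still need to be filtered out with a $match stage first. Note also that the $unwind version silently drops documents with empty arrays, so it never reports a size 0 bucket, while the $size version does.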
