java - Is modeling infinite-scale relationships in NoSQL / BigTable (GAE) possible?


My team is writing an application for GAE (Java), which has led me to question the scalability of entity relationship modeling (specifically many-to-many) in object-oriented databases like BigTable.

The preferred solution for modeling unowned one-to-many and many-to-many relationships in the App Engine Datastore (see Entity Relationships in JDO) seems to be list-of-keys. However, Google warns:

"There are a few limitations to implementing many-to-many relationships this way. First, you must explicitly retrieve the values on the side of the collection where the list is stored since all you have available are Key objects. Another more important one is that you want to avoid storing overly large lists of keys..."

Speaking of overly large lists of keys: if you model it this way and assume that you store one long for each key, then with a per-entity limit of 1 MB the theoretical maximum number of relationships per entity is ~130k. For a platform whose primary advantage is scalability, that's really not that many relationships. So now we are looking at possibly sharding entities that require more than 130k relationships.
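To make that estimate concrete, here is a rough sketch of the arithmetic. It assumes each relationship costs exactly one 8-byte long and ignores any per-property or index overhead, so the real ceiling would be lower. The class and constant names are made up for illustration.

```java
// Back-of-the-envelope estimate only; ignores key, property, and index overhead.
class KeyListCapacity {
    // 1 MB per-entity limit, as cited above.
    static final long ENTITY_LIMIT_BYTES = 1024L * 1024L;
    // Assumption: each stored relationship costs one 8-byte long.
    static final long BYTES_PER_ID = Long.BYTES;

    static long maxRelationships() {
        return ENTITY_LIMIT_BYTES / BYTES_PER_ID;
    }

    public static void main(String[] args) {
        System.out.println(maxRelationships()); // prints 131072
    }
}
```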

A different approach (the Relationship Model) is outlined in the article Modeling Entity Relationships, part of the Mastering the Datastore series in the App Engine developer resources. However, even here Google warns about the performance of relational models:

"However, you need to be very careful because traversing the connections of a collection will require more calls to the datastore. Use this kind of many-to-many relationship only when you really need to, and do so with care to the performance of your application."

So by now you are asking: 'Why do you need more than 130k relationships per entity?' Well, I'm glad you asked. Let's take, for example, a CMS application with 1 million users (hey, I can dream, right?!)

Users can upload content and share it with:

1. the public
2. individuals
3. groups
4. any combination

Now someone logs in and navigates to a dashboard that shows new uploads from people they are connected to in any group. This dashboard should include public content, and content shared specifically with this user or with a group this user is a member of. Not too bad, right? Let's dig into it.

    public class Content {
        private Long id;
        private Long authorId;
        private List<Long> sharedWith; // can be individual IDs or group IDs
    }

Now my query to get everything an ID is allowed to see might look like this:

    List<Long> idsThatGiveMeAccess = new ArrayList<Long>();
    idsThatGiveMeAccess.add(myId);
    idsThatGiveMeAccess.add(publicId); // let's say that sharing with 0L makes it public
    for (Group g : groupsImIn)
        idsThatGiveMeAccess.add(g.getId());

    List<Long> authorIdsThatIWantToSee = new ArrayList<Long>();
    // add a bunch of author IDs

    Query q = new Query("Content")
            .addFilter("authorId", Query.FilterOperator.IN, authorIdsThatIWantToSee)
            .addFilter("sharedWith", Query.FilterOperator.IN, idsThatGiveMeAccess);

Obviously I've already broken several rules. Namely, using two IN filters will blow up. Even a single IN filter at any size approaching the limits we are talking about would blow up. Aside from all that, let's say I want to limit and page through the results... no no! You can't do that if you use an IN filter. I can't think of any way to do this operation in a single query, which means you can't paginate it without extensive read-time processing and managing multiple cursors.
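As far as I understand it, the Datastore runs an IN filter as one subquery per value in the list, and a query with several IN filters as one subquery per combination of values, with a documented cap of 30 subqueries per query. The sketch below only counts that fan-out to show how quickly it is exceeded. The class and method names are hypothetical and nothing here calls the App Engine API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: counts how many subqueries a set of IN filters implies.
class InFilterFanOut {
    // Cap on subqueries per query, as I read the Datastore docs; treat as an assumption.
    static final int MAX_SUBQUERIES = 30;

    // One subquery per combination of values across all IN lists.
    static long subqueryCount(List<Integer> inListSizes) {
        long total = 1;
        for (int size : inListSizes) {
            total = Math.multiplyExact(total, (long) size);
        }
        return total;
    }

    static boolean withinLimit(List<Integer> inListSizes) {
        return subqueryCount(inListSizes) <= MAX_SUBQUERIES;
    }

    public static void main(String[] args) {
        // 10 author IDs combined with 5 access-granting IDs.
        List<Integer> sizes = Arrays.asList(10, 5);
        System.out.println(subqueryCount(sizes)); // prints 50
        System.out.println(withinLimit(sizes));   // prints false
    }
}
```

Even a modest dashboard (ten authors, membership in four groups plus my own ID) lands well past the cap, before pagination is even considered.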

So here are the tools I can think of for doing this: denormalization, sharding, or relationship entities. Even with these concepts I don't see how it is possible to model this data in a way that will scale. Obviously it's possible. Google and others do it all the time. I just can't see how. Can anyone shed some light on how to model this, or point me toward any good resources for CMS-style access control based on a NoSQL DB?

Storing a list of IDs as a property won't scale. Why not store a new object for each new relationship (like in SQL)? For your CMS, this object would store two properties: the ID of the shared item and the user ID. If an item is shared with 1000 users, you'll have 1000 of these. Querying for a given user is trivial. Listing the permissions for a given item, or the list of what has been shared with a user, is easy too.
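A minimal sketch of that idea, using an in-memory list as a stand-in for the datastore so it runs on its own. The names `Share`, `ShareStore`, and their methods are hypothetical; in a real app each `Share` would be its own entity and the two lookups would be equality filters on an indexed property.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical relationship entity: one instance per (item, user) pair.
class Share {
    final long contentId; // ID of the shared item
    final long userId;    // ID of the user it is shared with

    Share(long contentId, long userId) {
        this.contentId = contentId;
        this.userId = userId;
    }
}

// In-memory stand-in for the kind that would hold Share entities.
class ShareStore {
    private final List<Share> shares = new ArrayList<>();

    void share(long contentId, long userId) {
        shares.add(new Share(contentId, userId));
    }

    // Stand-in for an equality filter on userId.
    List<Long> contentSharedWith(long userId) {
        List<Long> result = new ArrayList<>();
        for (Share s : shares) {
            if (s.userId == userId) result.add(s.contentId);
        }
        return result;
    }

    // Stand-in for an equality filter on contentId.
    List<Long> usersFor(long contentId) {
        List<Long> result = new ArrayList<>();
        for (Share s : shares) {
            if (s.contentId == contentId) result.add(s.userId);
        }
        return result;
    }
}
```

Each lookup is a single equality filter rather than an IN filter, so it should page with an ordinary cursor. The cost is one extra entity per share and an extra fetch to load the content itself.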

