database - How to create a fast, queryable index from many JSON files (ideally in Python)


I have 78,000 individual JSON files created by a Python script that scrapes a community forum and extracts information from each post. They consist of simple key-value pairs, like so:

{
    "name": "chris wilson",
    "item": "darth vader speaker phone",
    "price": "$100",
    "notes": "great condition!"
}

Some keys are common to all files -- name and price, for example -- while many others appear in only some of them. (The site I'm crawling allows user-defined fields.) I want to be able to search, sort, and group by any field I want.

Normally, I would load each file into a SQLite database and query it there. But that would be extremely tedious here, given the multitude of fields.

From the little I understand of NoSQL frameworks, it seems this project is better suited to a document-based system than to a traditional relational database. I tried to learn CouchDB, but all of the documentation I can find assumes you start with an empty database, not with pre-fabricated documents themselves.

Is there a good, reasonably simple (or at least well-documented) solution for indexing and querying large numbers of dictionary-like objects? I would prefer Python, but I'm happy to venture into Node or whatever else.

Thank you!

P.S. Let me know if you're interested in the Darth Vader phone.

This sounds like a perfect use case for MongoDB. Set up MongoDB and import your JSON files directly into a collection using mongoimport --file <filename>

They have great Python support too.
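If you would rather stay in Python than script mongoimport over 78,000 files, a rough sketch of loading them with pymongo might look like the following. It assumes a MongoDB server running locally on the default port; the directory path and the database/collection names ("forum", "posts") are placeholders.

import json
from pathlib import Path

from pymongo import MongoClient

# Connect to a local MongoDB instance (assumed to be running on the default port).
client = MongoClient("mongodb://localhost:27017/")
collection = client["forum"]["posts"]  # hypothetical database/collection names

# Walk the directory of scraped JSON files (placeholder path) and insert in batches.
docs = []
for path in Path("scraped_posts").glob("*.json"):  # placeholder directory
    with open(path, encoding="utf-8") as f:
        docs.append(json.load(f))
    if len(docs) >= 1000:           # insert in chunks to keep memory use modest
        collection.insert_many(docs)
        docs = []

if docs:                            # insert any remaining documents
    collection.insert_many(docs)

print(collection.count_documents({}))  # should report roughly 78,000 when done

Since MongoDB collections are schemaless, the user-defined fields that only appear in some posts are simply stored as extra keys on those documents; no upfront schema work is needed.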

Some documentation links:

http://docs.mongodb.org/manual/reference/mongoimport/#cmdoption-mongoimport--file

http://docs.mongodb.org/ecosystem/drivers/python/
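Once the documents are loaded, searching, sorting, and grouping by any field (including the user-defined ones) is straightforward from Python. A small sketch, again assuming the hypothetical "forum"/"posts" collection from above:

from pymongo import ASCENDING, MongoClient

collection = MongoClient("mongodb://localhost:27017/")["forum"]["posts"]

# Search and sort: every post that has a "price" field, ordered by seller name.
for post in collection.find({"price": {"$exists": True}}).sort("name", ASCENDING):
    print(post["name"], post.get("item"), post["price"])

# Group: count how many posts each seller has, using the aggregation pipeline.
pipeline = [
    {"$group": {"_id": "$name", "post_count": {"$sum": 1}}},
    {"$sort": {"post_count": -1}},
]
for row in collection.aggregate(pipeline):
    print(row["_id"], row["post_count"])

# Optional: index the fields you query most often to keep lookups fast.
collection.create_index("name")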

