
Maximum size for a queryable database? #2

Closed
chmac opened this issue Aug 7, 2014 · 4 comments

Comments

@chmac

chmac commented Aug 7, 2014

I'm using queryable to store a simple database of URLs. I didn't want the headache of installing / maintaining / etc. Mongo. I'm crawling a site with around 10k URLs and adding some of them to a database. My database currently has ~7k URLs and I'm hitting a Node memory limit. Could it be caused by putting too much data into queryable?

I could install Mongo, but queryable is absolutely perfect for my use case, and keeping the data locally on disk is ideal. Thanks for the awesome software.
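One quick way to tell whether the inserts themselves are driving heap growth is to sample `process.memoryUsage()` around the insert loop. This is a stdlib-only sketch: the `db` object here is a toy in-memory stand-in (not queryable's actual API), and `insertAll` is a hypothetical helper, so the sketch runs anywhere.

```javascript
// Sample heap usage while inserting documents, to see whether the
// database inserts are what grows memory (roughly flat samples would
// suggest the leak lies elsewhere, e.g. in the crawler).
const heapMB = () => process.memoryUsage().heapUsed / 1024 / 1024;

function insertAll(db, urls, sampleEvery = 1000) {
  const samples = [];
  urls.forEach((url, i) => {
    db.insert({ url });                    // stand-in for the real insert call
    if (i % sampleEvery === 0) samples.push(heapMB());
  });
  return samples;
}

// Toy in-memory "db" so the sketch is self-contained.
const db = { rows: [], insert(doc) { this.rows.push(doc); } };
const urls = Array.from({ length: 5000 }, (_, i) => `http://example.com/${i}`);
const samples = insertAll(db, urls);
console.log(samples.length, 'heap samples taken');
```

If the sampled numbers climb steadily with the row count, the data layer is the suspect; if they stay roughly flat, look at the crawler instead.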

@chmac
Author

chmac commented Aug 7, 2014

This crash hit while my db file was 531K, but I previously generated a 604K db file before hitting a different error. Don't know if that helps at all.

@chmac
Author

chmac commented Aug 7, 2014

During my investigation I switched out queryable for nedb and hit a similar memory leak. I'm therefore confident that the issue doesn't lie in queryable but somewhere else in my code.

Not sure if I'll go back to queryable from nedb; nedb seems to be a more developed module, but queryable's interface is definitely nicer. Thanks again for the awesome software.

@chmac chmac closed this as completed Aug 7, 2014
@gmn
Owner

gmn commented Sep 25, 2014

Hey, chmac. I just logged in to check messages. I apologize for being offline for a month; it's been a busy time. I'm guessing that you might've moved on to something else by now, but I'll get to looking at your error this weekend and respond here, hopefully with updated code. Best, and sorry for the late response.

@chmac
Author

chmac commented Sep 28, 2014

I think the memory leak lay elsewhere, probably in the heavy spider framework I was using. Once I switched to an alternative based on cheerio, the process ran for multiple days indexing nearly a million URLs with no memory leaks.
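For anyone landing here later: the extract-links step of a cheerio-based crawler is small. With cheerio you'd do it with a real HTML parser (something like `$('a[href]').each(...)`); the sketch below uses a regex stand-in and Node's built-in `URL` so it stays dependency-free. `extractHrefs` is a hypothetical helper name, not code from this project.

```javascript
// Pull absolute link targets out of an HTML string, resolving relative
// hrefs against a base URL. A regex is fine for a sketch; a real crawler
// should use an HTML parser such as cheerio.
function extractHrefs(html, base) {
  const hrefs = [];
  const re = /<a\s[^>]*href=["']([^"']+)["']/gi;
  let m;
  while ((m = re.exec(html)) !== null) {
    hrefs.push(new URL(m[1], base).href); // resolve relative links
  }
  return hrefs;
}

const page = '<a href="/about">About</a> <a href="http://example.org/x">x</a>';
console.log(extractHrefs(page, 'http://example.com/'));
// → [ 'http://example.com/about', 'http://example.org/x' ]
```

Since this keeps no per-page state beyond the returned array, it can run over millions of pages without accumulating memory, which matches the behaviour described above.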
