Using multiple MongoDB databases instead of one – performance check

I'm starting to develop a new application. I can't say what it is, but it fits MongoDB's document-oriented approach perfectly. Everything is great, except for one small detail: I don't want to store everything in one database. Of course, I could use collections and embedded documents to organize whole nested structures and keep each user's data separated, but that would make the source code much more complicated than it should be. Instead, I've decided to use one MongoDB database per user. That way I can separate users' data and I don't need to worry about scoping it. There will be a gateway that authorizes incoming requests and routes them to the proper database.
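
As a rough sketch of the routing step, assuming the gateway has already authorized the request and knows the user's identifier, the lookup of the per-user database name could look like this. The `app_user_` prefix, the `database_name_for` helper, and the allowed identifier format are all hypothetical and not taken from the actual application.

```ruby
# Hypothetical prefix for per-user databases (an assumption for illustration).
DB_PREFIX = 'app_user_'.freeze

# MongoDB database names must be shorter than 64 characters.
MAX_DB_NAME_LENGTH = 63

# Only lowercase letters, digits and underscores are accepted, so two
# different identifiers can never map to the same database name.
VALID_USER_ID = /\A[a-z0-9_]+\z/

# Maps an already authorized user identifier to that user's database name.
# Raises ArgumentError instead of guessing when the identifier is unusable.
def database_name_for(user_id)
  id = user_id.to_s
  unless VALID_USER_ID.match?(id)
    raise ArgumentError, "unsupported user id: #{id.inspect}"
  end

  name = "#{DB_PREFIX}#{id}"
  if name.length > MAX_DB_NAME_LENGTH
    raise ArgumentError, "database name too long: #{name}"
  end

  name
end

puts database_name_for(42) # prints "app_user_42"
```

Rejecting odd identifiers, rather than rewriting them, keeps one user from ever landing in another user's database by accident.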

[Figure: Gateway]

The schema is pretty straightforward, and the only thing that was bothering me was the performance of switching between multiple databases. I decided to write a simple benchmark to test whether there's a difference between using one database and using many. The results are promising: there's only around a 5% (5.3% exactly) performance loss when using many databases instead of one. 5% is a difference I can easily accept. To be honest, I think I will gain much more, especially once I have a lot of data. Let's say I have 100 customers with 100,000,000 records. With one database, I would have to query all of it. With separate databases, assuming the records are spread evenly across customers, I will have to query only 1% of it.
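
The arithmetic behind those numbers can be sketched as follows. This is not the benchmark code used for this post: the helper names are hypothetical, the timings are made-up values chosen only to reproduce the 5.3% figure, and the 100,000,000 records are read as the total across all customers.

```ruby
require 'benchmark'

# Runs the block `iterations` times and returns the elapsed wall-clock
# time in seconds. In a real run the block would perform the query.
def measure(iterations)
  Benchmark.realtime do
    iterations.times { |i| yield i }
  end
end

# Slowdown of the many-databases run relative to the single-database run,
# expressed in percent.
def overhead_percent(single_seconds, multi_seconds)
  (multi_seconds - single_seconds) / single_seconds * 100.0
end

# Records one query has to consider when the data is split evenly,
# one database per customer.
def records_per_database(total_records, customers)
  total_records / customers
end

# Made-up timings (not measured values) that yield the 5.3% from the text.
puts overhead_percent(100.0, 105.3).round(1)

# 100 customers sharing 100,000,000 records: 1,000,000 each, i.e. 1%.
per_db = records_per_database(100_000_000, 100)
puts per_db
puts per_db * 100.0 / 100_000_000
```

To compare real setups, `measure` would be called twice with the same number of iterations, once with a block that always queries a single database and once with a block that picks a different database on each iteration.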

Below you can see the performance difference when querying one vs. many MongoDB databases.

[Chart: Dbs]

I will definitely go with that approach and I will try to keep you posted.

Note: This is not a professional, extremely accurate, long-running test. It's more of a proof of concept. Keep that in mind ;)

Categories: Ruby, Software

5 Comments

  1. Clever idea, but if you have many users you could run into trouble. Also, this solution is not appropriate for small (one-file-sized) files. The ‘smallfiles’ option can help in such a situation.

  2. Yes, I’m aware of some (hopefully most :D) of the issues that I’ll have with this approach:

    • Many users == many DBs to maintain
    • Much harder to perform schema changes/migrations
    • Higher memory usage (indexes per user DB)

    But still, even then, for the project that I’m building it seems like a way better choice than one monolithic DB. A partial solution for having many DBs with a lot of users is MongoDB sharding. I will keep you posted when I reach a beta. By then I will probably have much more to say about the issues that I encounter.

  3. Sure, sharding can save you in the future. Which mapper have you used? Mongoid or MongoMapper?

  4. I’m going with Mongoid. I’ve used it many times and have a lot of tweaks and features for it. It is also thread-safe as long as I don’t share a Moped::Session. It can also easily use many DBs, even in a non-thread-safe environment, by using the “with” method. This should allow me to write a wrapper that will handle migrations and multi-DB changes easily.

  5. Any updates since you posted this article? Did you get a chance to do more benchmarks and see whether or not it was optimal to go with multiple databases? Are all these databases on one instance or running across multiple instances? I am debating this exact same thing and trying to figure out whether I’m going with multiple databases or just one large one and thinking about sharding in the future.
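
A minimal sketch of the kind of wrapper mentioned in comment 4, assuming the list of per-user database names is already known. `MigrationReport` and `migrate_each_database` are hypothetical names, and the block body is a stand-in: with Mongoid it would be the place to scope a model using the “with” method, for example `with(database: db_name)` in the Moped-era versions.

```ruby
# Outcome of applying one change to many per-user databases:
# names that succeeded, and error messages keyed by the names that failed.
MigrationReport = Struct.new(:succeeded, :failed)

# Yields every database name to the block and records which ones raised,
# so a single broken user database does not stop the remaining ones.
def migrate_each_database(database_names)
  report = MigrationReport.new([], {})

  database_names.each do |name|
    begin
      yield name
      report.succeeded << name
    rescue StandardError => e
      report.failed[name] = e.message
    end
  end

  report
end

# Stand-in usage: the block only prints instead of touching MongoDB.
report = migrate_each_database(%w[app_user_1 app_user_2]) do |db_name|
  puts "migrating #{db_name}"
end

puts "failed: #{report.failed.keys.inspect}"
```

Collecting failures instead of aborting on the first one matters here, because with one database per user a schema change has to be repeated for every user and any of those runs can fail independently.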


Copyright © 2024 Closer to Code
