Demystifying MongoDB

Databases are at the core of almost all internet and enterprise applications. The need for scale and faster application development has brought in a new generation of databases, loosely termed NoSQL databases. A relational database saves data in rows and columns organized into tables, whereas NoSQL databases store data in non-tabular formats such as documents.

Developers often encounter conceptual and technical difficulties when a relational database serves one or more applications written in an object-oriented language or style. This is because application objects must be mapped onto tables, but a typical application object rarely corresponds to a single table. This misalignment between application-layer objects and tables and rows is called an impedance mismatch.

Impedance Mismatch

// your application code
class Foo { int x; string[] tags; }

In the above example, the object Foo has an integer x and an array of string tags. To map Foo onto a normalized relational schema you typically need three tables: one for the main object, another for the tags, and a third to relate them.
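The three-table mapping can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a recommended schema: the table names (foo, tag, foo_tag) and the helper functions save_foo and load_foo are hypothetical, chosen only to mirror the Foo example.

```python
import sqlite3

# Hypothetical table and column names, chosen only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo     (id INTEGER PRIMARY KEY, x INTEGER);
    CREATE TABLE tag     (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE foo_tag (foo_id INTEGER REFERENCES foo(id),
                          tag_id INTEGER REFERENCES tag(id));
""")

def save_foo(x, tags):
    """Persist one Foo: a row in foo, plus rows in tag and foo_tag for each tag."""
    foo_id = conn.execute("INSERT INTO foo (x) VALUES (?)", (x,)).lastrowid
    for name in tags:
        # Reuse an existing tag row if the name is already stored.
        conn.execute("INSERT OR IGNORE INTO tag (name) VALUES (?)", (name,))
        tag_id = conn.execute(
            "SELECT id FROM tag WHERE name = ?", (name,)).fetchone()[0]
        conn.execute(
            "INSERT INTO foo_tag (foo_id, tag_id) VALUES (?, ?)", (foo_id, tag_id))
    return foo_id

def load_foo(foo_id):
    """Rebuild the in-memory object by joining the tables back together."""
    x = conn.execute("SELECT x FROM foo WHERE id = ?", (foo_id,)).fetchone()[0]
    rows = conn.execute(
        "SELECT tag.name FROM tag JOIN foo_tag ON tag.id = foo_tag.tag_id "
        "WHERE foo_tag.foo_id = ? ORDER BY tag.name", (foo_id,)).fetchall()
    return {"x": x, "tags": [name for (name,) in rows]}

foo_id = save_foo(7, ["blue", "green"])
print(load_foo(foo_id))  # {'x': 7, 'tags': ['blue', 'green']}
```

Even for a two-field object, saving touches three tables and loading needs a join. This is the kind of mapping code that an ORM would otherwise generate.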

To translate between the object in memory and what is saved in the database, you as an application developer are forced to develop a mapping layer or use an ORM (Object-Relational Mapper). Most of the time objects do not translate cleanly to tables and rows. On top of that, the objects may use polymorphism or inheritance, and mapping these onto a relational database can add a lot of complexity to the application.

MongoDB adopts an entirely different approach. There is no schema to define up front, no tables, and no enforced relationships between collections of objects. MongoDB lets you structure data in a way that is efficient for computers to process and natural for humans to read. Each document you save can be as simple or as complex as your application demands. This also means that developers can adapt or add fields when they need to, without worrying that a minor change will break everything.
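As a rough sketch of the document approach, the Foo object can be held as a single nested structure. The example below uses plain Python dictionaries and JSON as a stand-in (MongoDB itself stores documents in BSON and is not used here); the list named collection, the owner field, and the find_by_tag helper are assumptions made for illustration.

```python
import json

# The same Foo as one self-contained document; field names mirror the Foo class.
foo_doc = {"x": 7, "tags": ["blue", "green"]}

# A later version of the application adds a field to new documents only.
# "owner" is a hypothetical field used for illustration.
newer_doc = {"x": 9, "tags": ["red"], "owner": {"name": "alice", "team": "ops"}}

# A plain list stands in for a MongoDB collection in this sketch.
collection = [foo_doc, newer_doc]

def find_by_tag(docs, tag):
    """Return the documents whose tags array contains the given tag."""
    return [d for d in docs if tag in d.get("tags", [])]

# Documents round-trip through JSON without a mapping layer.
restored = json.loads(json.dumps(collection))
print(find_by_tag(restored, "red"))
```

The older document did not have to be migrated when the new field appeared, which is the flexibility the paragraph above describes.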

MongoDB also differs fundamentally from legacy databases in that it is designed to coordinate multiple servers to store data.

Scalability Issues

The relational model requires the database engine to manage writes in a special way. To ensure consistency and atomicity, it must lock rows and tables and permit only one writer at a time to access the locked data.

Ensuring referential integrity across tables and rows increases the time a lock has to be held. Longer lock times mean fewer writes and updates per second, which results in higher transaction latency and thus a slower application. Scaling out by replicating or sharding data to other servers can make matters even more complicated.

If a relational engine tries to impose consistency by extending these locks across the network, lock times increase and transaction latency rises even further. This results in a slower application.
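The effect of lock time on throughput can be shown with simple arithmetic: if writes that contend for the same lock must run one after another, the lock hold time sets an upper bound on writes per second. The timings below are hypothetical numbers picked for illustration, not measurements of any database.

```python
def max_serialized_writes_per_second(lock_hold_ms):
    """Upper bound on writes per second when each write holds an exclusive lock."""
    return 1000.0 / lock_hold_ms

# Hypothetical timings, chosen only to illustrate the effect.
local_lock_ms = 2.0          # lock held for a purely local write
network_round_trip_ms = 8.0  # extra hold time if the lock must span a remote server

local = max_serialized_writes_per_second(local_lock_ms)
distributed = max_serialized_writes_per_second(local_lock_ms + network_round_trip_ms)
print(local, distributed)  # 500.0 100.0
```

The bound applies only to writes that compete for the same lock. Writes to unrelated rows can still proceed in parallel.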

Relational databases address such issues by denormalizing tables or by relaxing consistency, for instance by allowing dirty reads. But this undermines the purpose of a relational database. These methods are often used in production for reporting or decision-support databases, where a certain margin of error can be tolerated.

“So how does MongoDB approach these issues?”

  • First, MongoDB stores data in documents: there are no tables, rows, or columns, no schema to define up front, and no enforced relationships between tables. Documents are grouped into collections, and a collection is roughly analogous to a table in a relational database. Although documents live within a collection, each update applies to one document at a time. This implies that there are no relationships or constraints to enforce across tables, no schema to protect, and therefore no locks that must span several tables.
  • Second, in replication scenarios, replica sets provide redundancy and high availability with the consistency you choose. A replica set contains several data-bearing nodes and optionally one arbiter node. One and only one of the data-bearing members is the primary node, while the others are secondary nodes. Also, the primary holds no locks on the secondaries. MongoDB makes it easy to configure your application for lower latency and weaker consistency, or vice versa. With multiple copies of the data on different database servers, replication also provides a level of fault tolerance against the loss of a single database server.
  • Third, MongoDB has a feature called capped collections. Capped collections have a fixed size; to make room for new documents, the oldest documents in the collection are removed automatically, without requiring scripts or explicit remove operations. Potential uses of capped collections include storing log information generated by high-volume systems.
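The behaviour of a capped collection can be approximated in memory with a bounded queue. This is only an analogy: MongoDB caps a collection by size in bytes, with an optional maximum document count, while this sketch caps by document count alone. The CappedLog class and its method names are hypothetical.

```python
from collections import deque

class CappedLog:
    """In-memory stand-in for a capped collection, bounded by document count."""

    def __init__(self, max_docs):
        # A deque with maxlen drops its oldest entry when a new one
        # is appended at capacity.
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc):
        self._docs.append(doc)

    def find(self):
        """Return documents in insertion order, oldest first."""
        return list(self._docs)

log = CappedLog(max_docs=3)
for i in range(5):
    log.insert({"seq": i, "msg": f"event {i}"})

print([d["seq"] for d in log.find()])  # [2, 3, 4]
```

No cleanup code runs here: the two oldest entries are dropped as a side effect of inserting new ones, which is what makes the pattern attractive for high-volume logs.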

In short, MongoDB is a cross-platform, document-oriented database designed for ease of development and scaling. The combination of the document model and distributed-systems components gives MongoDB an advantage over relational databases. As data volumes and performance requirements grow, you can add more servers instead of upgrading to a million-dollar mainframe. It is also well suited to cloud environments, where spreading load across several machines is often the most practical way to scale.
