MongoDB Overview for DBA
MongoDB is a cross-platform, document-oriented database that provides high performance, high availability, and easy scalability. MongoDB is built around the concepts of collections and documents.
The following table maps RDBMS terminology to MongoDB terminology:
RDBMS | MongoDB |
Database | Database |
Table | Collection |
Tuple/Row | Document |
Column | Field |
Table Join | Embedded Documents |
Primary Key | Primary Key (default key _id, provided by MongoDB itself) |
Database Server and Client | |
mysqld/Oracle | mongod |
mysql/sqlplus | mongo (mongosh in newer releases) |
Database
A database is a physical container for collections. Each database gets its own set of files on the file system. A single MongoDB server typically hosts multiple databases.
Collection
A collection is a group of MongoDB documents. It is the equivalent of an RDBMS table. A collection exists within a single database. Collections do not enforce a schema by default, so documents within a collection can have different fields. Typically, all documents in a collection serve a similar or related purpose.
Document
A document is a set of key-value pairs. Documents have a dynamic schema: documents in the same collection do not need to have the same set of fields or structure, and fields that are common to a collection's documents may hold different types of data. MongoDB documents are similar to JSON objects. Field values may include other documents, arrays, and arrays of documents.
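As a rough illustration of dynamic schema, the sketch below models a collection as a plain JavaScript array. It does not talk to a MongoDB server; the `people` data and the `fieldTypes` helper are made up for this example.

```javascript
// Hypothetical data: three "documents" that could live in one collection.
const people = [
  { name: 'Ann', age: 31 },
  { name: 'Bob', age: 'unknown', email: 'bob@example.com' }, // age is a string here
  { name: 'Cy', tags: ['admin', 'dba'] },                    // no age field at all
];

// Collect, for every field name, the set of value types seen across documents.
function fieldTypes(docs) {
  const seen = {};
  for (const doc of docs) {
    for (const [key, value] of Object.entries(doc)) {
      const type = Array.isArray(value) ? 'array' : typeof value;
      if (!seen[key]) seen[key] = new Set();
      seen[key].add(type);
    }
  }
  // Convert each Set to a sorted array so the result is easy to print.
  const result = {};
  for (const key of Object.keys(seen)) result[key] = [...seen[key]].sort();
  return result;
}

console.log(fieldTypes(people));
```

The `age` field ends up with two types and some fields appear in only one document, which is exactly what a fixed relational schema would reject.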
The advantages of using documents are:
- Documents (i.e. objects) correspond to native data types in many programming languages.
- Embedded documents and arrays reduce the need for expensive joins.
- Dynamic schema supports fluent polymorphism.
Sample document:
The following example shows the document structure for a blog site; it is simply a set of comma-separated key-value pairs.
{
  _id: ObjectId("7df78ad8902c"),   // abbreviated here; a real ObjectId has 24 hex digits
  title: 'MongoDB Overview',
  description: 'MongoDB is no sql database',
  by: 'DB ORG',
  url: 'https://www.databaseorg.com',
  tags: ['mongodb', 'database', 'NoSQL'],
  likes: 100,
  comments: [
    {
      user: 'user1',
      message: 'My first comment',
      dateCreated: new Date(2020,1,20,2,15),
      like: 0
    },
    {
      user: 'user2',
      message: 'My second comment',
      dateCreated: new Date(2020,1,25,7,45),
      like: 5
    }
  ]
}
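To show how embedded documents and arrays can be reached without a join, here is a minimal sketch that treats a trimmed copy of the sample document as a plain JavaScript object. The `getPath` helper is hypothetical; it only imitates the dotted path style (for example `comments.1.user`) and is not how the server evaluates queries.

```javascript
// A trimmed-down copy of the sample document as a plain JavaScript object.
const post = {
  title: 'MongoDB Overview',
  tags: ['mongodb', 'database', 'NoSQL'],
  likes: 100,
  comments: [
    { user: 'user1', message: 'My first comment', like: 0 },
    { user: 'user2', message: 'My second comment', like: 5 },
  ],
};

// Hypothetical helper: follow a dotted path such as "comments.1.user".
// Returns undefined when any step along the path is missing.
function getPath(doc, path) {
  return path.split('.').reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

console.log(getPath(post, 'comments.1.user')); // user2
console.log(getPath(post, 'tags.0'));          // mongodb
console.log(getPath(post, 'author.name'));     // undefined

// Values in an embedded array can be combined directly, with no second table.
const commentLikes = post.comments.reduce((sum, c) => sum + c.like, 0);
console.log(commentLikes); // 5
```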
_id is a 12-byte value, normally displayed as a 24-character hexadecimal string, that guarantees the uniqueness of every document in a collection. You can supply _id yourself
when inserting a document. If you do not, MongoDB generates a unique id for the document.
In the layout described here, the first 4 bytes hold the current timestamp, the next 3 bytes the machine id, the next 2 bytes the process id
of the MongoDB server, and the remaining 3 bytes a simple incrementing counter. Newer MongoDB versions and drivers replace the machine id and process id with a 5-byte random value.
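The byte layout above can be sketched in plain JavaScript. This is an illustrative decoder, not a MongoDB API: `parseObjectId` is a hypothetical helper, the example id is made up, and the five middle bytes are returned as raw hex because their meaning depends on the version that generated the id.

```javascript
// Split a 24-character hex ObjectId string into its three parts.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error('expected 24 hexadecimal characters');
  }
  // Bytes 0-3: creation time as seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return {
    createdAt: new Date(seconds * 1000),
    // Bytes 4-8: machine id + process id (older layout) or a random value.
    middle: hex.slice(8, 18),
    // Bytes 9-11: incrementing counter.
    counter: parseInt(hex.slice(18, 24), 16),
  };
}

// Made-up example id, used only to exercise the function.
const parts = parseObjectId('5e4dd2f0a1b2c3d4e5000001');
console.log(parts.createdAt.toISOString());
console.log(parts.middle);
console.log(parts.counter);
```

Because the timestamp comes first, sorting documents by _id roughly orders them by creation time.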
Key Features:
High Performance
MongoDB provides high-performance data persistence. In particular,
• Support for embedded data models reduces I/O activity on the database system.
• Indexes support faster queries and can include keys from embedded documents and arrays.
High Availability
To provide high availability, MongoDB’s replication facility, called replica sets, provides:
• automatic failover.
• data redundancy.
A replica set is a group of MongoDB servers that maintain the same data set, providing redundancy and increasing data availability.
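Automatic failover depends on the surviving members being able to elect a new primary, which requires votes from a majority of the voting members. The arithmetic can be sketched as follows; the function names are made up for this example.

```javascript
// Votes needed to elect a primary among `votingMembers` voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members can be down while a primary can still be elected.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [3, 4, 5]) {
  console.log(`${n} members: majority ${majority(n)}, tolerates ${faultTolerance(n)} failure(s)`);
}
```

Note that moving from 3 to 4 members does not raise the number of tolerated failures, which is one reason odd member counts are commonly chosen.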
Automatic Scaling
MongoDB provides horizontal scalability as part of its core functionality.
• Automatic sharding distributes data across a cluster of machines.
• Replica sets can provide eventually consistent reads for low-latency, high-throughput deployments.
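To make the idea of sharding more concrete, the sketch below routes a document to a shard by looking up which key range contains its shard key value. It is a simplified model under stated assumptions: the chunk table, shard names, and `routeToShard` function are hypothetical, and a real cluster manages and rebalances these ranges itself.

```javascript
// Hypothetical chunk table: each chunk covers shard-key values in [min, max)
// and is owned by exactly one shard.
const chunks = [
  { min: -Infinity, max: 1000, shard: 'shardA' },
  { min: 1000, max: 5000, shard: 'shardB' },
  { min: 5000, max: Infinity, shard: 'shardC' },
];

// Find the shard whose chunk range contains the given shard-key value.
function routeToShard(keyValue) {
  const chunk = chunks.find((c) => keyValue >= c.min && keyValue < c.max);
  if (!chunk) throw new Error(`no chunk covers key ${keyValue}`);
  return chunk.shard;
}

console.log(routeToShard(42));    // shardA
console.log(routeToShard(1000));  // shardB
console.log(routeToShard(98765)); // shardC
```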