Ayadi Tahar | First taste of Document Databases with MongoDB - V2

First taste of Document Databases with MongoDB - V2

Publish Date: 2022-09-22


MongoDB is classified as a NoSQL document database. It offers the scalability and flexibility that developers need to query and index complex application data at any scale.

MongoDB stores data in flexible, JSON-like documents with optional schemas (often described as schema-less), meaning that fields can vary from document to document and the data structure can change over time.

In today's article, we will look at what MongoDB is, how it works, and how to get started with CRUD data manipulation in MongoDB version 6.

Install MongoDB Community Edition

In the first version of this article we used the default repository, which ships MongoDB 3.6. That version is quite old, but it was fine for a quick introduction to MongoDB. The current version, 6, brings many improvements and enhancements in both performance and query syntax, and it handles more complex data and more advanced development requirements.

So, to start our MongoDB coding journey, let's install MongoDB Community Edition on our operating system. The following steps come from the official documentation on how to install MongoDB Community Edition on Ubuntu, so you can follow along with the rest of the demo if you have an Ubuntu machine. If you use a different operating system, you can find the detailed steps in the official documentation: Install MongoDB Community Edition.

The following instructions use the official mongodb-org package, which is maintained and supported by MongoDB Inc. The official mongodb-org package always contains the latest version of MongoDB and is available from its own dedicated repository. If you have an older mongodb package installed, make sure to uninstall it before proceeding.

The installation steps are:

1. From a terminal, issue the following commands to install gnupg and import the MongoDB public GPG key:


sudo apt-get install gnupg
wget -qO - https://www.mongodb.org/static/pgp/server-6.0.asc | sudo apt-key add -
OK

2. Create the /etc/apt/sources.list.d/mongodb-org-6.0.list file for Ubuntu 20.04 (Focal):


echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/6.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-6.0.list
deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/6.0 multiverse

3. Issue the following command to reload the local package database:


sudo apt-get update

4. To install the latest stable version of the MongoDB packages, issue the following command:


sudo apt-get install -y mongodb-org
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  mongodb-database-tools mongodb-mongosh mongodb-org-database mongodb-org-database-tools-extra mongodb-org-mongos mongodb-org-shell
  mongodb-org-tools
The following NEW packages will be installed:
  mongodb-database-tools mongodb-mongosh mongodb-org mongodb-org-database mongodb-org-database-tools-extra mongodb-org-mongos
  mongodb-org-shell mongodb-org-tools
0 upgraded, 8 newly installed, 0 to remove and 12 not upgraded.
Need to get 105 MB of archives.
After this operation, 319 MB of additional disk space will be used.
...
    [some output omitted]
...

Start mongod

5. You can start the mongod process by issuing the following command:


sudo systemctl start mongod

If you see an error like the following when you check the status:


sudo systemctl status mongod
● mongod.service - MongoDB Database Server
     Loaded: loaded (/lib/systemd/system/mongod.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Wed 2022-09-21 00:39:59 CET; 11s ago
       Docs: https://docs.mongodb.org/manual
    Process: 740174 ExecStart=/usr/bin/mongod --config /etc/mongod.conf (code=exited, status=14)
   Main PID: 740174 (code=exited, status=14)

Sep 21 00:39:59 sony systemd[1]: Started MongoDB Database Server.
Sep 21 00:39:59 sony systemd[1]: mongod.service: Main process exited, code=exited, status=14/n/a
Sep 21 00:39:59 sony systemd[1]: mongod.service: Failed with result 'exit-code'.

This happens because the ownership of /var/lib/mongodb and /tmp/mongodb-27017.sock is wrong. You have to change the owner to the mongodb user. Run the following commands to fix that:


sudo chown -R mongodb:mongodb /var/lib/mongodb
sudo chown mongodb:mongodb /tmp/mongodb-27017.sock

Then restart mongod:


sudo service mongod restart

Now when you check the status, it should be active:


sudo systemctl status mongod
● mongod.service - MongoDB Database Server
     Loaded: loaded (/lib/systemd/system/mongod.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2022-09-21 00:41:21 CET; 4s ago
       Docs: https://docs.mongodb.org/manual
   Main PID: 740490 (mongod)
     Memory: 61.4M
     CGroup: /system.slice/mongod.service
             └─740490 /usr/bin/mongod --config /etc/mongod.conf

Sep 21 00:41:21 sony systemd[1]: Started MongoDB Database Server.

Begin using MongoDB

6. Start a mongosh session on the same host machine as the mongod. You can run mongosh without any command-line options to connect to a mongod that is running on your localhost with default port 27017:


mongosh
Current Mongosh Log ID:	632a4fc9dd57e1f0b0d07b84
Connecting to:		mongodb://127.0.0.1:27017/?directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+1.6.0
Using MongoDB:		6.0.1
Using Mongosh:		1.6.0

For mongosh info see: https://docs.mongodb.com/mongodb-shell/


To help improve our products, anonymous usage data is collected and sent to MongoDB periodically (https://www.mongodb.com/legal/privacy-policy).
You can opt-out by running the disableTelemetry() command.

------
   The server generated these startup warnings when booting
   2022-09-21T00:41:21.558+01:00: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem
   2022-09-21T00:41:22.361+01:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
------

------
   Enable MongoDB's free cloud-based monitoring service, which will then receive and display
   metrics about your deployment (disk utilization, CPU, operation statistics, etc).

   The monitoring data will be available on a MongoDB website with a unique URL accessible to you
   and anyone you share the URL with. MongoDB may use this information to make product
   improvements and to suggest MongoDB products and deployment options to you.

   To enable free monitoring, run the following command: db.enableFreeMonitoring()
   To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
------

Warning: Found ~/.mongorc.js, but not ~/.mongoshrc.js. ~/.mongorc.js will not be loaded.
  You may want to copy or rename ~/.mongorc.js to ~/.mongoshrc.js.
test> 

Before we dive deeper into MongoDB data, let's take a quick look at some related concepts and terminology.

Terminology

Database

A database is a namespace on a MongoDB server that is uniquely identified by its name; it is very similar to a schema in the relational world. At the database level we can configure things like security, authorization, and permissions. A MongoDB server can host multiple databases.

In the mongo shell, the db variable refers to the currently connected database. The next line shows that the current database is 'test':


db
test

To switch to another database, type use followed by the name of the database you want to switch to:


use admin
switched to db admin

To list the databases in your MongoDB instance, type:


show dbs
admin   40.00 KiB
config  60.00 KiB
local   40.00 KiB

On a fresh installation, the MongoDB server instance contains 3 databases:

  • admin: the administration database, holding internal system collections and the user repository for authentication and authorization data.
  • config: stores internal information about shards and replication in the cluster.
  • local: stores information related to the local instance of MongoDB.

To create a new database, just switch to it with use; MongoDB actually creates it when you first store data in it:


use new_database_name

If you start working with MongoDB without creating or specifying any database, it uses the test database by default.

Collection

A collection is analogous to a table in a relational database; it is the basic storage unit of the document store. All data manipulation and retrieval is done against collections.

Creating a collection requires a name and accepts optional settings, following this general syntax:


db.createCollection('name', {
...     capped : [true|false],
...     size : [number],
...     max : [number]
... });

There are many other parameters you can define, but the essential ones for our case are:

  • capped: if true, a capped collection is created. The 'capped' field needs to be true when either the 'size' or 'max' field is present.
  • size: maximum size of the collection in bytes.
  • max: maximum number of documents to hold.
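
To make the capped behavior concrete, here is a minimal JavaScript sketch of a fixed-size collection that discards its oldest document once the max limit is reached. The CappedList class is hypothetical and exists only for this illustration; it models the max option alone, ignores the size byte limit, and is not how MongoDB implements capped collections internally.

```javascript
// Hypothetical in-memory model of a capped collection (illustration only).
class CappedList {
  constructor(max) {
    this.max = max;   // maximum number of documents to hold
    this.docs = [];   // documents kept in insertion order
  }

  insertOne(doc) {
    this.docs.push(doc);
    // Once the cap is exceeded, the oldest document is discarded first.
    if (this.docs.length > this.max) {
      this.docs.shift();
    }
  }
}

const capped = new CappedList(2);
capped.insertOne({ n: 1 });
capped.insertOne({ n: 2 });
capped.insertOne({ n: 3 });   // evicts { n: 1 }
console.log(capped.docs.map((d) => d.n));
```

The point of the sketch is the eviction order: a capped collection keeps the most recent documents and silently drops the oldest ones, which is why it suits logs and similar rolling data.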

Sharding

Collections can be sharded or unsharded. Sharding is a process that stores portions of a collection on multiple instances of a cluster.

Sharding enables horizontal scaling across hundreds or even thousands of instances for large datasets, by dividing a large collection into two or more parts and storing each part on a different instance of the cluster. These parts are called shards. To end users, it still appears as a single collection.

As we are using a single localhost instance, we will not demonstrate it here, but you get the idea.
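
As a rough illustration of the idea, the sketch below routes documents to shards by hashing a key. The shardFor function, the simple string hash, and the choice of title as the shard key are all assumptions made for this example; MongoDB's real sharding is managed by the cluster and differs in detail.

```javascript
// Toy hash-based router: maps a shard-key value to one of N shards.
function shardFor(keyValue, shardCount) {
  let hash = 0;
  for (const ch of String(keyValue)) {
    // Simple 32-bit rolling hash; not the hash MongoDB uses.
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0;
  }
  return hash % shardCount;
}

const shards = [[], [], []];   // three hypothetical shard instances
const docs = [
  { title: "What is a Data Warehouse ?" },
  { title: "How To Install MySQL 8 on Ubuntu 20.04" },
  { title: "External Vs Managed Tables in Hive " },
];
for (const doc of docs) {
  shards[shardFor(doc.title, shards.length)].push(doc);
}

// Every document lands on exactly one shard; together they form the collection.
console.log(shards.map((s) => s.length));
```

Because the routing is deterministic, a later lookup by the same key can go straight to the shard that holds the document instead of asking every instance.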

Replication

Shards can be replicated. Replication is how MongoDB creates multiple copies of the same data for redundancy, which enables fault tolerance and high availability and can improve read performance. It achieves this with a primary/secondary strategy built on replica sets.

A replica set is a group of instances that maintain the same copy of the documents. One member of the replica set is the primary, while the others are secondaries. If the primary fails, a failover takes place: one of the secondary members is elected as the new primary of the replica set.

The primary receives all writes, and the changes are then replicated to all secondary members.

Documents

A document is analogous to a record in a relational database. Documents are written and read as JSON and stored as BSON (a binary representation of JSON documents that supports more data types than JSON). This format is optimized for better performance.

Documents in MongoDB are not constrained to share the same structure or fields, but each must contain an _id field, which MongoDB uses to uniquely identify documents in a collection. If the _id field is not explicitly defined by the user, it is generated automatically. Its value is immutable and can be of any type except an array.

Although MongoDB supports defining a schema up front, with validation rules that are checked when documents are written, that modeling step is entirely optional.

Field

A document is a collection of fields. A field is a key-value pair, just like in any JSON document, hash map, or dictionary. A key is a string, while a value can be of a simple type or a complex type (such as an array or a nested document).

GridFS

The maximum BSON document size is 16 megabytes. The maximum document size helps ensure that a single document cannot use excessive amount of RAM or, during transmission, excessive amount of bandwidth. To store documents larger than the maximum size, MongoDB provides the GridFS API.

GridFS is a convention for storing large binary files in MongoDB. It is fast enough to serve read and write operations on these large files, and it provides a storage method that is well suited to large objects and even to streaming use cases.

CRUD Operations

Now that we are familiar with the terms and concepts related to MongoDB, let's get our hands dirty with some examples.
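
The chunking idea behind GridFS can be sketched with a little arithmetic. GridFS splits a file into chunks (255 KiB each by default) and stores each chunk as its own document. The chunkCount helper below is hypothetical and only does the arithmetic; a real application would use a driver's GridFS API instead.

```javascript
// Default GridFS chunk size: 255 KiB.
const CHUNK_SIZE_BYTES = 255 * 1024;

// Number of chunk documents needed to store a file of the given size.
function chunkCount(fileSizeBytes, chunkSize = CHUNK_SIZE_BYTES) {
  return Math.ceil(fileSizeBytes / chunkSize);
}

const twentyMiB = 20 * 1024 * 1024;   // larger than the 16 MB document limit
console.log(chunkCount(twentyMiB));
```

Each chunk stays far below the document size limit, so a file of any size can be stored and read back piece by piece.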

create database

As we saw earlier, to create a database you just use it. In our case we will create the 'blogs' database:


use blogs
switched to db blogs

MongoDB, like many NoSQL databases, is built around schema-on-read behavior (schema-less), which means we can insert documents without defining their types and structure beforehand. In our case, we will create the collection explicitly.

So, let's create a collection named articles:


db.createCollection('articles', {
...     capped : true,
...     size : 200000,
...     max : 1000
... });
{ ok: 1 }

List collections

To check that our 'articles' collection has been created, we can run the following command:


show collections;
articles

If you insert documents into a collection that does not exist yet, MongoDB will create it for you.

Insert documents

So let's insert an article (document) into our articles collection:


db.articles.insertOne({
...     name:"mongodb quick intro",
...     category:"database",
...     tags:[
...         "nosql",
...         "db",
...         "bigdata"
...     ]
... })
{
  acknowledged: true,
  insertedId: ObjectId("632afd9764874a0f48842219")
}

To list the document we just inserted, use find() like this:


db.articles.find()
[
  {
    _id: ObjectId("632afd9764874a0f48842219"),
    name: 'mongodb quick intro',
    category: 'database',
    tags: [ 'nosql', 'db', 'bigdata' ]
  }
]

In the new shell (mongosh), results are pretty-printed by default, so calling pretty() is no longer needed.

You can insert many articles at once as well; just wrap them in an array like this:


db.articles.insertMany([
...  {title:"What is a Data Warehouse ?",category:"Big Data",url:"https://en.ayaditahar.com/1", tags:["data","warehouse","business"]},
...  {title:"How To Install MySQL 8 on Ubuntu 20.04",category:"database",url:"https://en.ayaditahar.com/2", tags:["mysql","relational","ubuntu"]},
...  {title:"External Vs Managed Tables in Hive ",category:"Big Data",url:"https://en.ayaditahar.com/3",tags:["external","tables","hive", "manage"]}]
...  );
{
  acknowledged: true,
  insertedIds: {
    '0': ObjectId("632afeeb64874a0f4884221a"),
    '1': ObjectId("632afeeb64874a0f4884221b"),
    '2': ObjectId("632afeeb64874a0f4884221c")
  }
}

To count the number of documents in our articles collection:


db.articles.countDocuments()
4

Find Documents

As we just saw, we can use find() to list all the documents:


db.articles.find()
[
  {
    _id: ObjectId("632afd9764874a0f48842219"),
    name: 'mongodb quick intro',
    category: 'database',
    tags: [ 'nosql', 'db', 'bigdata' ]
  },
  {
    _id: ObjectId("632afeeb64874a0f4884221a"),
    title: 'What is a Data Warehouse ?',
    category: 'Big Data',
    url: 'https://en.ayaditahar.com/1',
    tags: [ 'data', 'warehouse', 'business' ]
  },
  {
    _id: ObjectId("632afeeb64874a0f4884221b"),
    title: 'How To Install MySQL 8 on Ubuntu 20.04',
    category: 'database',
    url: 'https://en.ayaditahar.com/2',
    tags: [ 'mysql', 'relational', 'ubuntu' ]
  },
  {
    _id: ObjectId("632afeeb64874a0f4884221c"),
    title: 'External Vs Managed Tables in Hive ',
    category: 'Big Data',
    url: 'https://en.ayaditahar.com/3',
    tags: [ 'external', 'tables', 'hive', 'manage' ]
  }
]

As you can see, the first document does not have the same structure and fields as the other documents, which is one of the features that make NoSQL databases like MongoDB flexible and powerful.

We can also limit the returned results. Let's say we want to get only 2 documents and list them in a human-readable format:


db.articles.find().limit(2)
[
  {
    _id: ObjectId("632afd9764874a0f48842219"),
    name: 'mongodb quick intro',
    category: 'database',
    tags: [ 'nosql', 'db', 'bigdata' ]
  },
  {
    _id: ObjectId("632afeeb64874a0f4884221a"),
    title: 'What is a Data Warehouse ?',
    category: 'Big Data',
    url: 'https://en.ayaditahar.com/1',
    tags: [ 'data', 'warehouse', 'business' ]
  }
]

You can also fetch a single document, which is useful for a quick look at a big collection:


db.articles.findOne()
{
  _id: ObjectId("632afd9764874a0f48842219"),
  name: 'mongodb quick intro',
  category: 'database',
  tags: [ 'nosql', 'db', 'bigdata' ]
}

find by field

If you want, you can select only the title field from the returned documents and suppress the _id field (to get the _id as well, replace 0 with 1). The first document has no title field, so it comes back as an empty object:


db.articles.find({}, {title:1, _id:0})
[
  {},
  { title: 'What is a Data Warehouse ?' },
  { title: 'How To Install MySQL 8 on Ubuntu 20.04' },
  { title: 'External Vs Managed Tables in Hive ' }
]

The first pair of curly brackets is the query filter (empty here, so all documents are fetched), and the second specifies which fields to show:


db.articles.find({}, {title:1,name:1, _id:0})
[
  { name: 'mongodb quick intro' },
  { title: 'What is a Data Warehouse ?' },
  { title: 'How To Install MySQL 8 on Ubuntu 20.04' },
  { title: 'External Vs Managed Tables in Hive ' }
]
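
The projection behavior shown above can be modeled in plain JavaScript. The project function below is a hypothetical helper written for this article, not part of MongoDB, and it only covers simple inclusion projections on top-level fields.

```javascript
// Simplified model of an inclusion projection such as { title: 1, _id: 0 }.
function project(doc, spec) {
  const out = {};
  // _id is returned by default unless the projection sets it to 0.
  if (spec._id !== 0 && "_id" in doc) {
    out._id = doc._id;
  }
  for (const [field, flag] of Object.entries(spec)) {
    if (field !== "_id" && flag === 1 && field in doc) {
      out[field] = doc[field];
    }
  }
  return out;
}

const sample = [
  { _id: 1, name: "mongodb quick intro", category: "database" },
  { _id: 2, title: "What is a Data Warehouse ?", category: "Big Data" },
];

// The first document has no title field, so it projects to an empty object.
console.log(sample.map((d) => project(d, { title: 1, _id: 0 })));
```

A field that is requested but missing from a document is simply left out of the result, which explains the empty object returned for the first article.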

If you are looking for a specific document, you need to specify the full value of a field (otherwise you will not get any results). For instance, let's find a document with a specific title:


db.articles.find({"title" : "What is a Data Warehouse ?"}, {})
[
  {
    _id: ObjectId("632afeeb64874a0f4884221a"),
    title: 'What is a Data Warehouse ?',
    category: 'Big Data',
    url: 'https://en.ayaditahar.com/1',
    tags: [ 'data', 'warehouse', 'business' ]
  }
]

Again, if you want just some specific fields, list them in the second pair of curly brackets with the value 1 next to each field, like this:


db.articles.find({"title" : "What is a Data Warehouse ?"}, {title:1, url:1})
[
  {
    _id: ObjectId("632afeeb64874a0f4884221a"),
    title: 'What is a Data Warehouse ?',
    url: 'https://en.ayaditahar.com/1'
  }
]

Or you can filter on other field values, here on a specific category:


db.articles.find({"category" : "database"}, {})
[
  {
    _id: ObjectId("632afd9764874a0f48842219"),
    name: 'mongodb quick intro',
    category: 'database',
    tags: [ 'nosql', 'db', 'bigdata' ]
  },
  {
    _id: ObjectId("632afeeb64874a0f4884221b"),
    title: 'How To Install MySQL 8 on Ubuntu 20.04',
    category: 'database',
    url: 'https://en.ayaditahar.com/2',
    tags: [ 'mysql', 'relational', 'ubuntu' ]
  }
]

If you try to search for a single word from the title, nothing is returned:


db.articles.find({"title" : "Hive"}, {})
    

You can use a text query to find documents by words like this:


db.articles.find( { $text: { $search: "hive table" } }, {tags:1, title:1} )
MongoServerError: text index required for $text query

But this does not work either, and the error message makes it clear that a text index is required. So, to make this query work, you have to create a text index on at least one field.
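
The difference between the exact-match query and a word search can be modeled in plain JavaScript. In this sketch, matchExact and matchPattern are hypothetical helpers, not MongoDB functions. For an ad hoc substring match without a text index, mongosh also accepts a regular expression as the filter value, for example db.articles.find({ title: /Hive/ }), although without a suitable index such a query still scans the collection.

```javascript
const titles = [
  "What is a Data Warehouse ?",
  "How To Install MySQL 8 on Ubuntu 20.04",
  "External Vs Managed Tables in Hive ",
];

// Equality filter: the whole field value must be identical.
const matchExact = (value) => titles.filter((t) => t === value);

// Pattern filter: the field value only has to contain a match.
const matchPattern = (regex) => titles.filter((t) => regex.test(t));

console.log(matchExact("Hive"));     // no title equals "Hive" exactly
console.log(matchPattern(/Hive/));   // one title contains "Hive"
```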

Indexes

Before creating any index, let's find out which indexes already exist on our collection:


db.articles.getIndexes()
[ { v: 2, key: { _id: 1 }, name: '_id_' } ]

As we can see, there is one index on our articles collection, on the _id field.

To understand the impact of indexes, execute the next line, which shows the execution plan:


db.articles.find({"title" : "hive"}).explain()
{
  explainVersion: '1',
  queryPlanner: {
    namespace: 'blogs.articles',
    indexFilterSet: false,
    parsedQuery: { title: { '$eq': 'hive' } },
    queryHash: '244E9C29',
    planCacheKey: '244E9C29',
    maxIndexedOrSolutionsReached: false,
    maxIndexedAndSolutionsReached: false,
    maxScansToExplodeReached: false,
    winningPlan: {
      stage: 'COLLSCAN',
      filter: { title: { '$eq': 'hive' } },
      direction: 'forward'
    },
    rejectedPlans: []
  },
  command: { find: 'articles', filter: { title: 'hive' }, '$db': 'blogs' },
  serverInfo: {
    host: 'sony',
    port: 27017,
    version: '6.0.1',
    gitVersion: '32f0f9c88dc44a2c8073a5bd47cf779d4bfdee6b'
  },
  serverParameters: {
    internalQueryFacetBufferSizeBytes: 104857600,
    internalQueryFacetMaxOutputDocSizeBytes: 104857600,
    internalLookupStageIntermediateDocumentMaxSizeBytes: 104857600,
    internalDocumentSourceGroupMaxMemoryBytes: 104857600,
    internalQueryMaxBlockingSortMemoryUsageBytes: 104857600,
    internalQueryProhibitBlockingMergeOnMongoS: 0,
    internalQueryMaxAddToSetBytes: 104857600,
    internalDocumentSourceSetWindowFieldsMaxMemoryBytes: 104857600
  },
  ok: 1
}

The winning plan's stage is COLLSCAN, which shows that MongoDB found no usable index on that field and therefore falls back to a full collection scan. On a big collection this takes a long time to return a result, and MongoDB is not built for that kind of query. One way around it is to use indexes.

To create an index, you have to specify which fields it covers. Here we use a compound text index on 2 fields, title and tags, to enable full-text search, and we give title a higher weight (10) than tags (5) when scoring results:
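
To see why an index helps, the sketch below compares a full scan with a lookup in a toy index. Everything here is a simplified assumption for illustration: the collScan and buildIndex helpers are hypothetical, and a JavaScript Map stands in for the B-tree structure that MongoDB indexes actually use.

```javascript
const articles = [
  { _id: 1, title: "What is a Data Warehouse ?" },
  { _id: 2, title: "How To Install MySQL 8 on Ubuntu 20.04" },
  { _id: 3, title: "External Vs Managed Tables in Hive " },
];

// Collection scan: every document is examined against the filter.
function collScan(docs, title) {
  let examined = 0;
  const hits = [];
  for (const doc of docs) {
    examined += 1;
    if (doc.title === title) hits.push(doc);
  }
  return { hits, examined };
}

// Toy index: a Map from field value to the documents holding that value.
function buildIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const bucket = index.get(doc[field]) ?? [];
    bucket.push(doc);
    index.set(doc[field], bucket);
  }
  return index;
}

const titleIndex = buildIndex(articles, "title");
console.log(collScan(articles, "What is a Data Warehouse ?").examined);  // every document examined
console.log(titleIndex.get("What is a Data Warehouse ?").length);        // direct lookup
```

The scan does work proportional to the size of the collection on every query, while the index pays that cost once when it is built and then answers lookups directly.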


db.articles.createIndex({
...      'tags' : 'text',
...      'title' : 'text'
...      },
...      {
...          'weights' : {
...              'tags' : 5,
...              'title' : 10
...          },
...          'name' : 'tags_title_idx'
...  })
tags_title_idx

If we list our indexes again, we will see the newly created index (each index has a name, stored in its name field):


db.articles.getIndexes()
[
  { v: 2, key: { _id: 1 }, name: '_id_' },
  {
    v: 2,
    key: { _fts: 'text', _ftsx: 1 },
    name: 'tags_title_idx',
    weights: { tags: 5, title: 10 },
    default_language: 'english',
    language_override: 'language',
    textIndexVersion: 3
  }
]

Now that the index is created, we can search documents for specific words. It returns results as expected:


db.articles.find( { $text: { $search: "hive table" } }, {tags:1, title:1} )
[
  {
    _id: ObjectId("632afeeb64874a0f4884221c"),
    title: 'External Vs Managed Tables in Hive ',
    tags: [ 'external', 'tables', 'hive', 'manage' ]
  }
]

You can even search with multiple words (a document matches if it contains any of them), and suppress the _id field from the results if you want to:


db.articles.find( { $text: { $search: "20.04 warehouse" } }, {tags:1, title:1, _id:0} )
[
  {
    title: 'How To Install MySQL 8 on Ubuntu 20.04',
    tags: [ 'mysql', 'relational', 'ubuntu' ]
  },
  {
    title: 'What is a Data Warehouse ?',
    tags: [ 'data', 'warehouse', 'business' ]
  }
]

Update Documents

Whenever you want to update a document in the collection, you have to specify two parts: the query part and the update part.

Notice that our first document doesn't contain a 'title' field but a 'name' field instead, which makes our data inconsistent, and that is something we do not want. To correct it, we rename the "name" field to "title":

db.articles.find({"name" : "mongodb quick intro"}, {})
[
  {
    _id: ObjectId("632afd9764874a0f48842219"),
    name: 'mongodb quick intro',
    category: 'database',
    tags: [ 'nosql', 'db', 'bigdata' ]
  }
]

db.articles.updateOne({"name" : "mongodb quick intro"},{ $rename: { "name": "title" } })
{
  acknowledged: true,
  insertedId: null,
  matchedCount: 1,
  modifiedCount: 1,
  upsertedCount: 0
}

Now if you select a document by the title field, you will get results as expected:


db.articles.find({"title" : "mongodb quick intro"}, {})
[
  {
    _id: ObjectId("632afd9764874a0f48842219"),
    category: 'database',
    tags: [ 'nosql', 'db', 'bigdata' ],
    title: 'mongodb quick intro'
  }
]

Although the rename moves the field to the end of the document, field order does not matter much in MongoDB.

Now that the "name" field has been renamed to "title", it is covered by the text index as well, and you can search for words in our first document efficiently:
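
The reordering has a simple explanation: $rename logically removes the old field and then sets the new one, so the new field is appended last. The renameField helper below is a hypothetical plain-JavaScript imitation of that visible effect, not MongoDB code.

```javascript
// Hypothetical helper that mimics the visible effect of $rename on one document.
function renameField(doc, from, to) {
  if (!(from in doc)) return doc;           // nothing to rename
  const { [from]: value, ...rest } = doc;   // remove the old key
  return { ...rest, [to]: value };          // add the value back under the new key, last
}

const before = { _id: 1, name: "mongodb quick intro", category: "database" };
const after = renameField(before, "name", "title");
console.log(Object.keys(after));
```

Code that reads fields by name is unaffected by this change in order, which is the usual way documents are consumed.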


db.articles.find( { $text: { $search: "mongodb" } }, {tags:1, title:1} )
[
  {
    _id: ObjectId("632afd9764874a0f48842219"),
    tags: [ 'nosql', 'db', 'bigdata' ],
    title: 'mongodb quick intro'
  }
]

As another example, let's update the first article we inserted earlier: we set a 'url' on it and change its 'title' to a new one:


db.articles.updateOne({"title" : "mongodb quick intro"}, {$set : {"url" : "https://en.ayaditahar.com/10"}})
{
  acknowledged: true,
  insertedId: null,
  matchedCount: 1,
  modifiedCount: 1,
  upsertedCount: 0
}

db.articles.updateOne({"title" : "mongodb quick intro"}, {$set : {"title" : "First taste of Document Databases in MongoDB"}})
{
  acknowledged: true,
  insertedId: null,
  matchedCount: 1,
  modifiedCount: 1,
  upsertedCount: 0
}

Now we can check our article document to see whether it was updated successfully:


db.articles.find({"title" : "First taste of Document Databases in MongoDB"})
[
  {
    _id: ObjectId("632afd9764874a0f48842219"),
    category: 'database',
    tags: [ 'nosql', 'db', 'bigdata' ],
    title: 'First taste of Document Databases in MongoDB',
    url: 'https://en.ayaditahar.com/10'
  }
]

Actually, there is a small mistake in the url field: it must contain the 'post' keyword to be a working link. So we have to update all the article urls at once.

Before going further, let's check the article urls once again:

db.articles.find({},{url:1, _id:0})
[
  { url: 'https://en.ayaditahar.com/10' },
  { url: 'https://en.ayaditahar.com/1' },
  { url: 'https://en.ayaditahar.com/2' },
  { url: 'https://en.ayaditahar.com/3' }
]

There are a couple of ways to update multiple documents in MongoDB in one step, but for now let's execute the following code snippet to update many articles in one go:


db.articles.updateMany(
...   { url: { $regex: /com/ } },
...   [{
...     $set: { url: {
...       $replaceOne: { input: "$url", find: "com", replacement: "com/post" }
...     }}
...   }]
... )
{
  acknowledged: true,
  insertedId: null,
  matchedCount: 4,
  modifiedCount: 4,
  upsertedCount: 0
}

It's clear that 4 documents were modified, but let's check again to verify that it happened as we wanted:


db.articles.find({},{url:1, _id:0})
[
  { url: 'https://en.ayaditahar.com/post/10' },
  { url: 'https://en.ayaditahar.com/post/1' },
  { url: 'https://en.ayaditahar.com/post/2' },
  { url: 'https://en.ayaditahar.com/post/3' }
]

db.articles.find({},{title:1, url:1, _id:0})
[
  {
    title: 'First taste of Document Databases in MongoDB',
    url: 'https://en.ayaditahar.com/post/10'
  },
  {
    title: 'What is a Data Warehouse ?',
    url: 'https://en.ayaditahar.com/post/1'
  },
  {
    title: 'How To Install MySQL 8 on Ubuntu 20.04',
    url: 'https://en.ayaditahar.com/post/2'
  },
  {
    title: 'External Vs Managed Tables in Hive ',
    url: 'https://en.ayaditahar.com/post/3'
  }
]
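
The $replaceOne operator used above replaces only the first occurrence of the search string. JavaScript's String.prototype.replace follows the same rule when given a string pattern, so it makes a convenient stand-in for a quick sketch. The second URL below uses a made-up hostname (community.example.com) purely to show a pitfall.

```javascript
const urls = [
  "https://en.ayaditahar.com/10",
  "https://en.ayaditahar.com/1",
];

// With a string pattern, replace() changes only the first match,
// the same "first occurrence" rule that $replaceOne follows.
const fixed = urls.map((u) => u.replace("com", "com/post"));
console.log(fixed);

// Caveat: the first "com" is not always the one in ".com/".
const risky = "https://community.example.com/1".replace("com", "com/post");
console.log(risky);
```

For the article urls this works because "com" appears only once, but a more specific search string such as ".com/" would be a safer choice on less uniform data.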

Delete documents

Now, to delete an article, call the deleteOne() method on the collection and specify a criterion on a field as the query:


db.articles.deleteOne({'title' : "First taste of Document Databases in MongoDB"})
{ acknowledged: true, deletedCount: 1 }

You can delete multiple documents based on specific criteria. For example, if we want to delete the documents that belong to the category "Big Data", we can do so by executing the next snippet:


db.articles.deleteMany({category:"Big Data"})
{ acknowledged: true, deletedCount: 2 }

Because two article documents in our collection belong to the Big Data category, both got deleted.

cleanup

Now that we're done with our demonstration, we can clean up our namespace and delete all the objects we created so far.

drop index

If we decide that we don't need an index anymore, we can easily drop it by specifying its name:


db.articles.dropIndex('tags_title_idx')
{ nIndexesWas: 2, ok: 1 }

Purge documents

To empty (truncate) the "articles" collection, invoke the deleteMany method with an empty filter. Because only one document is left in our collection, the next snippet deletes that document:


db.articles.deleteMany({})
{ acknowledged: true, deletedCount: 1 }

drop collection

Once our collection is empty, you can drop it:


db.articles.drop()
true

drop database

Now the only thing left is to delete the database. The next command deletes the currently connected database from the mongosh shell:


db.dropDatabase()
{ ok: 1, dropped: 'blogs' }

Conclusion

MongoDB stores data as documents in collections, and several collection methods are used to perform CRUD operations, which cover the creation, retrieval, updating, and deletion of documents.

The latest version of MongoDB brings a lot of enhancements and improvements, and we demonstrated some of them in this demo article. I hope that by now you at least know what MongoDB is and how to work with it at a basic level.

Resources