MongoDB - GridFS

GridFS is a MongoDB specification for storing and retrieving large files such as images, audio files, and video files. It works like a file system, but its data is stored in MongoDB collections. GridFS can store files that exceed MongoDB's 16 MB document size limit.

GridFS divides a file into chunks and stores each chunk in a separate document. By default, each chunk holds at most 255 KB of data.
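The chunking step can be illustrated with a minimal sketch in plain Node.js. It only slices an in-memory buffer and does not talk to MongoDB. The helper name splitIntoChunks and the 600000-byte sample are hypothetical, chosen for this example; in practice the driver or mongofiles does this work for you.

```javascript
// 255 KB = 255 * 1024 = 261120 bytes, the chunk size described above.
const DEFAULT_CHUNK_SIZE = 255 * 1024;

// Hypothetical helper (not part of any MongoDB API): split a Buffer
// into consecutive slices of at most chunkSize bytes.
function splitIntoChunks(data, chunkSize = DEFAULT_CHUNK_SIZE) {
  const chunks = [];
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    // subarray clamps the end index to data.length, so the last
    // chunk may be shorter than chunkSize.
    chunks.push(data.subarray(offset, offset + chunkSize));
  }
  return chunks;
}

// A 600000-byte buffer yields two full chunks and one partial chunk.
const sample = Buffer.alloc(600000);
const parts = splitIntoChunks(sample);
console.log(parts.map((p) => p.length)); // [ 261120, 261120, 77760 ]
```

Only the last chunk is smaller than the chunk size; every other chunk is exactly 261120 bytes.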

By default, GridFS uses two collections, fs.files and fs.chunks, to store file metadata and chunks respectively. Each chunk is identified by its unique _id ObjectId field. The fs.files document serves as the parent document, and the files_id field in each fs.chunks document links the chunk to its parent.

Following is an example document from the fs.files collection −

{
   "filename": "test.txt",
   "chunkSize": NumberInt(261120),
   "uploadDate": ISODate("2014-04-13T11:32:33.557Z"),
   "md5": "7b762939321e146569b07f72c62cca4f",
   "length": NumberInt(646)
}

The document specifies the file name, chunk size, upload date, MD5 checksum, and length of the file in bytes.

Following is an example fs.chunks document −

{
   "files_id": ObjectId("534a75d19f54bfec8a2fe44b"),
   "n": NumberInt(0),
   "data": "Mongo Binary Data"
}
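To show how the two document shapes fit together, here is a small sketch that rebuilds a file from chunk documents by matching files_id and ordering by n. It uses in-memory arrays as stand-ins for the collections, plain strings instead of ObjectId values, and a tiny chunk size of 4 bytes; the helper name reassemble and all sample data are assumptions made for illustration only.

```javascript
// In-memory stand-in for one fs.files document.
const fileDoc = { _id: "file-1", filename: "test.txt", chunkSize: 4, length: 11 };

// In-memory stand-in for fs.chunks, deliberately out of order and
// mixed with a chunk that belongs to a different file.
const chunkDocs = [
  { files_id: "file-1", n: 2, data: Buffer.from("rld") },
  { files_id: "file-1", n: 0, data: Buffer.from("hell") },
  { files_id: "other", n: 0, data: Buffer.from("zzzz") },
  { files_id: "file-1", n: 1, data: Buffer.from("o wo") },
];

// Hypothetical helper: rebuild a file's bytes from its chunk documents.
function reassemble(file, chunks) {
  const ordered = chunks
    .filter((c) => c.files_id === file._id) // keep only this file's chunks
    .sort((a, b) => a.n - b.n);             // n is the chunk's sequence number
  return Buffer.concat(ordered.map((c) => c.data));
}

const content = reassemble(fileDoc, chunkDocs);
console.log(content.toString());                // hello world
console.log(content.length === fileDoc.length); // true
```

The files_id field selects the chunks that belong to one parent document, and n restores their original order.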

Adding Files to GridFS

We will now store an mp3 file in GridFS using the put command. To do this, we will use the mongofiles.exe utility, which is located in the bin folder of the MongoDB installation folder.

Open a command prompt, navigate to the bin folder of the MongoDB installation folder, and enter the following command −

>mongofiles.exe -d gridfs put song.mp3

Here, gridfs is the name of the database in which the file will be stored. If the database does not exist, MongoDB automatically creates it on the fly. song.mp3 is the name of the uploaded file. To see the file's document in the database, you can use the find query −

>db.fs.files.find()

The above command returned the following document −

{
   _id: ObjectId('534a811bf8b4aa4d33fdf94d'),
   filename: "song.mp3",
   chunkSize: 261120,
   uploadDate: new Date(1397391643474),
   md5: "e4f53379c909f7bed2e9d631e15c1c41",
   ...
}

We can also see all the chunks in the fs.chunks collection that belong to the stored file by running the following query with the document id returned by the previous query:

>db.fs.chunks.find({files_id: ObjectId('534a811bf8b4aa4d33fdf94d')})

In my case, the query returned 40 documents, which means that the entire mp3 file was divided into 40 chunks of data.
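The chunk count follows directly from the file length and the chunk size. The sketch below works through that arithmetic for the chunkSize of 261120 shown earlier; the function name chunkCount is hypothetical, and the byte ranges are derived from the numbers above rather than taken from the actual song.mp3 file.

```javascript
// Number of chunks needed for a file of `length` bytes: every chunk is
// full except possibly the last one, so round the division up.
function chunkCount(length, chunkSize = 261120) {
  return Math.ceil(length / chunkSize);
}

// Smallest and largest file sizes (in bytes) that produce exactly 40 chunks.
const size = 261120;
const minBytes = 39 * size + 1; // one byte spills into a 40th chunk
const maxBytes = 40 * size;     // 40 completely full chunks

console.log(chunkCount(minBytes), chunkCount(maxBytes)); // 40 40
console.log(minBytes, maxBytes); // 10183681 10444800
console.log(chunkCount(646));    // 1 (the test.txt example fits in one chunk)
```

Under these assumptions, a file that produces 40 chunks is between 10183681 and 10444800 bytes long.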