MongoDB - Map Reduce

According to the MongoDB documentation, map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. MongoDB provides the mapReduce command for map-reduce operations. It is typically used to process large data sets.

MapReduce command

Following is the syntax of the basic mapReduce command −

> db.collection.mapReduce(
   function() { emit(key, value); },                 // map function
   function(key, values) { return reduceFunction },  // reduce function
   {
      out: collection,
      query: document,
      sort: document,
      limit: number
   }
)

The mapReduce function first queries the collection, then maps the resulting documents to emit key-value pairs. The emitted values are then reduced per key, for every key that has multiple values.

In the above syntax −

  • map is a JavaScript function that maps a value to a key and emits a key-value pair

  • reduce is a JavaScript function that reduces, or groups, all the values emitted for the same key

  • out specifies the location of the result of the map-reduce operation

  • query specifies optional selection criteria for selecting the input documents

  • sort specifies optional sort criteria for the input documents

  • limit specifies an optional maximum number of documents passed to the map function
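
The flow described above (query, then map, then reduce per key) can be sketched in plain JavaScript without a database. This is only an illustration under stated assumptions: mapReduceSketch is a hypothetical helper, not a MongoDB API; query is simplified to a predicate function and sort to a comparator instead of documents; emit is passed to the map function as an argument, whereas MongoDB provides it as a global; the sample posts are made up.

```javascript
// Plain-JavaScript sketch of the query -> sort -> limit -> map -> reduce flow.
// mapReduceSketch and its option shapes are hypothetical, not MongoDB APIs.
function mapReduceSketch(docs, map, reduce, options = {}) {
  const { query = () => true, sort = null, limit = Infinity } = options;

  // 1. Select the input documents (query), order them (sort), cap them (limit).
  let input = docs.filter(query);
  if (sort) input = [...input].sort(sort);
  input = input.slice(0, limit);

  // 2. Map: each document (bound to `this`) emits zero or more key-value pairs.
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of input) map.call(doc, emit);

  // 3. Reduce: only keys that collected more than one value are reduced.
  const out = [];
  for (const [key, values] of groups) {
    const value = values.length > 1 ? reduce(key, values) : values[0];
    out.push({ _id: key, value });
  }
  return out;
}

// Hypothetical sample data: count posts per status.
const samplePosts = [
  { user_name: "mark", status: "active" },
  { user_name: "tom", status: "active" },
  { user_name: "tom", status: "disabled" },
];

const statusCounts = mapReduceSketch(
  samplePosts,
  function (emit) { emit(this.status, 1); },
  (key, values) => values.reduce((sum, v) => sum + v, 0)
);

console.log(statusCounts);
```

Note that the "disabled" key collects a single value, so the sketch passes it through without calling reduce. This is why a reduce function should return a value of the same shape as the values it receives.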

Using MapReduce

Consider the following document structure, which stores user posts. Each document stores the text of the post, the user_name of its author, and the status of the post.

{
   "post_text" : "tutorialspoint is an awesome website for tutorials",
   "user_name" : "mark",
   "status" : "active"
}

Now we will use the mapReduce function on our posts collection to select all the active posts, group them by user_name, and then count the number of posts by each user, using the following code −

> db.posts.mapReduce(
   function() { emit(this.user_name, 1); },
   function(key, values) { return Array.sum(values); },
   {
      query: { status: "active" },
      out: "post_total"
   }
)

The above mapReduce query produces the following output −

{
   "result" : "post_total",
   "timeMillis" : 9,
   "counts" : {
      "input" : 4,
      "emit" : 4,
      "reduce" : 2,
      "output" : 2
   },
   "ok" : 1
}

The result shows that 4 documents matched the query (status: "active"), the map function emitted 4 key-value pairs, and the reduce function then combined the pairs that share a key, leaving 2 output documents.
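
To see where these counts come from, the same steps can be traced by hand in plain JavaScript. This is a rough sketch, not MongoDB's implementation: the sample posts are hypothetical (chosen so that tom and mark each have two active posts), the counts object is tallied manually, and it assumes that "reduce" counts the invocations of the reduce function. Array.sum is a mongo shell helper, so Array.prototype.reduce stands in for it here.

```javascript
// Hypothetical sample data shaped like the posts documents in this chapter.
const posts = [
  { post_text: "first post", user_name: "tom", status: "active" },
  { post_text: "second post", user_name: "tom", status: "active" },
  { post_text: "third post", user_name: "mark", status: "active" },
  { post_text: "fourth post", user_name: "mark", status: "active" },
  { post_text: "old post", user_name: "mark", status: "disabled" },
];

const counts = { input: 0, emit: 0, reduce: 0, output: 0 };
const groups = new Map();

// query: only active posts reach the map step.
for (const post of posts.filter((p) => p.status === "active")) {
  counts.input += 1;
  // map: emit (user_name, 1) for each input document.
  const key = post.user_name;
  if (!groups.has(key)) groups.set(key, []);
  groups.get(key).push(1);
  counts.emit += 1;
}

// reduce: sum the emitted values for each key that has several of them.
const postTotal = [];
for (const [key, values] of groups) {
  let value = values[0];
  if (values.length > 1) {
    value = values.reduce((sum, v) => sum + v, 0); // stands in for Array.sum
    counts.reduce += 1;
  }
  postTotal.push({ _id: key, value });
  counts.output += 1;
}

console.log(counts);    // { input: 4, emit: 4, reduce: 2, output: 2 }
console.log(postTotal);
```

The disabled post never reaches the map step, which is why input is 4 rather than 5.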

To see the result of this mapReduce query, use the find statement −

> db.posts.mapReduce(
   function() { emit(this.user_name, 1); },
   function(key, values) { return Array.sum(values); },
   {
      query: { status: "active" },
      out: "post_total"
   }
).find()

The above query gives the following result, which indicates that the users tom and mark each have two active posts −

{ "_id" : "tom", "value" : 2 }
{ "_id" : "mark", "value" : 2 }

Similarly, MapReduce queries can be used to build large, complex aggregation queries. Custom JavaScript functions make MapReduce very flexible and powerful.