MongoDB MapReduce Tutorial. In this tutorial you will learn the mapReduce command and how to use MapReduce.
Map-Reduce is a computing model: put simply, a large job (a large body of data) is decomposed and processed in pieces (MAP), and the partial results are then merged into a final result (REDUCE).
MongoDB provides a very flexible Map-Reduce implementation that is also quite practical for large-scale data analysis.
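As a rough illustration of the model itself, independent of MongoDB, the sketch below decomposes the input in a map step, groups the intermediate values by key, and merges each group in a reduce step. The helper name `mapReduce` and the sample words are made up for this example.

```javascript
// A minimal in-memory sketch of the map-reduce model (not MongoDB's implementation).
// mapFn turns one input item into a list of [key, value] pairs;
// reduceFn merges all values collected for one key into a single result.
function mapReduce(items, mapFn, reduceFn) {
  const groups = new Map();
  // Map phase: decompose the input into key-value pairs, grouped by key.
  for (const item of items) {
    for (const [key, value] of mapFn(item)) {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    }
  }
  // Reduce phase: merge the values of each key into the final result.
  const result = {};
  for (const [key, values] of groups) {
    result[key] = reduceFn(key, values);
  }
  return result;
}

const words = ["map", "reduce", "map", "emit", "map"];
const wordCounts = mapReduce(
  words,
  (word) => [[word, 1]],                              // emit (word, 1)
  (key, values) => values.reduce((a, b) => a + b, 0)  // sum the 1s
);
console.log(wordCounts); // { map: 3, reduce: 1, emit: 1 }
```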
The following is the basic syntax of MapReduce:
>db.collection.mapReduce(
   function() { emit(key, value); },                 // map function
   function(key, values) { return reduceFunction },  // reduce function
   {
      out: collection,
      query: document,
      sort: document,
      limit: number
   }
)
To use MapReduce you implement two functions: a map function and a reduce function. The map function is run against every document in the collection and calls emit(key, value); the emitted keys and values are then passed to the reduce function for processing.
The map function must call emit(key, value) to return key-value pairs.
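Parameter description:
map: the mapping function. It generates the sequence of key-value pairs that is passed to the reduce function.
reduce: the aggregation function. It reduces the array of values collected for a key to a single value.
out: the collection in which the results are stored.
query: a filter condition. Only documents that match it are passed to the map function.
sort: a sort specification, used together with limit, that orders the documents before they are sent to the map function.
limit: the maximum number of documents sent to the map function.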
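To make the calling convention concrete, here is a small in-memory emulation in plain JavaScript. It is only a sketch under stated assumptions: `runMapReduce` is a hypothetical helper, not a MongoDB API; `query` is simplified to exact field matching; `sort` is left out; and the result is returned as an array rather than written to an `out` collection. The `orders` data is invented for the example.

```javascript
// Hypothetical in-memory stand-in for db.collection.mapReduce (a sketch only).
// Assumptions: `query` matches fields by strict equality, `sort` is omitted,
// and results are returned as an array instead of being written to `out`.
let emit; // stands in for the emit() that MongoDB makes available to map

function runMapReduce(docs, map, reduce, options = {}) {
  const { query = {}, limit = Infinity } = options;
  const emitted = new Map();
  // emit(key, value) collects the pairs produced by the map function.
  emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  docs
    .filter((doc) => Object.keys(query).every((f) => doc[f] === query[f]))
    .slice(0, limit)
    .forEach((doc) => map.call(doc)); // map runs once per document, as `this`
  // Each key's values are reduced to one value; a key with a single value
  // is passed through without calling reduce.
  return [...emitted].map(([key, values]) => ({
    _id: key,
    value: values.length > 1 ? reduce(key, values) : values[0],
  }));
}

// Invented sample data: total paid amount per customer.
const orders = [
  { customer: "a", amount: 10, paid: true },
  { customer: "b", amount: 5, paid: true },
  { customer: "a", amount: 7, paid: true },
  { customer: "b", amount: 3, paid: false },
];
const totals = runMapReduce(
  orders,
  function () { emit(this.customer, this.amount); },
  (key, values) => values.reduce((a, b) => a + b, 0),
  { query: { paid: true } }
);
console.log(totals); // [ { _id: 'a', value: 17 }, { _id: 'b', value: 5 } ]
```

Note that the map function is written with `function` rather than an arrow function, because it relies on `this` referring to the current document.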
Consider the following document structure, which stores users' articles. Each document stores the author's user_name and the article's status field:
>db.posts.insert({
   "post_text": "本教程,最全的技术文档。",
   "user_name": "mark",
   "status":"active"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "本教程,最全的技术文档。",
   "user_name": "mark",
   "status":"active"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "本教程,最全的技术文档。",
   "user_name": "mark",
   "status":"active"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "本教程,最全的技术文档。",
   "user_name": "mark",
   "status":"active"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "本教程,最全的技术文档。",
   "user_name": "mark",
   "status":"disabled"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "本教程,最全的技术文档。",
   "user_name": "w3big",
   "status":"disabled"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "本教程,最全的技术文档。",
   "user_name": "w3big",
   "status":"disabled"
})
WriteResult({ "nInserted" : 1 })
>db.posts.insert({
   "post_text": "本教程,最全的技术文档。",
   "user_name": "w3big",
   "status":"active"
})
WriteResult({ "nInserted" : 1 })
Now we will call mapReduce on the posts collection to select the published articles (status: "active"), group them by user_name, and count the number of posts for each user:
>db.posts.mapReduce(
   function() { emit(this.user_name, 1); },
   function(key, values) { return Array.sum(values) },
   {
      query: { status: "active" },
      out: "post_total"
   }
)
The mapReduce call above outputs:
{
   "result" : "post_total",
   "timeMillis" : 23,
   "counts" : {
      "input" : 5,
      "emit" : 5,
      "reduce" : 1,
      "output" : 2
   },
   "ok" : 1
}
The result shows that five documents matched the query (status: "active"). The map function emitted five key-value pairs from those documents, and the reduce function then combined the pairs that share a key, leaving two groups.
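Specific parameters:
result: the name of the collection that stores the results (here, post_total).
timeMillis: the time the operation took, in milliseconds.
input: the number of documents that matched the query and were sent to the map function.
emit: the number of times emit was called in the map function, that is, the total number of key-value pairs generated.
reduce: the number of times the reduce function was called.
output: the number of documents in the result collection.
ok: whether the operation succeeded; 1 means success.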
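The counts in the output above can be rechecked by hand. The following plain JavaScript sketch replays the example on an in-memory copy of the eight sample posts (post_text omitted) and tallies the same four counters. It mirrors the example only and is not MongoDB's own bookkeeping; in particular, it assumes reduce is invoked only for keys that collected more than one value, which is consistent with "reduce" : 1 here.

```javascript
// Sketch: recompute the "counts" section for the eight sample posts in plain
// JavaScript. This mirrors the example data; it is not MongoDB's own code.
const posts = [
  ...Array(4).fill({ user_name: "mark", status: "active" }),
  { user_name: "mark", status: "disabled" },
  { user_name: "w3big", status: "disabled" },
  { user_name: "w3big", status: "disabled" },
  { user_name: "w3big", status: "active" },
];

const counts = { input: 0, emit: 0, reduce: 0, output: 0 };
const groups = new Map();

for (const post of posts) {
  if (post.status !== "active") continue; // query: { status: "active" }
  counts.input += 1;                      // document handed to map
  // map: emit(this.user_name, 1)
  const values = groups.get(post.user_name) || [];
  values.push(1);
  groups.set(post.user_name, values);
  counts.emit += 1;
}

const postTotal = [];
for (const [key, values] of groups) {
  let value = values[0];
  if (values.length > 1) {
    // Assumption: reduce only runs when a key has more than one value.
    value = values.reduce((a, b) => a + b, 0);
    counts.reduce += 1;
  }
  postTotal.push({ _id: key, value });
  counts.output += 1;
}

console.log(counts);    // { input: 5, emit: 5, reduce: 1, output: 2 }
console.log(postTotal); // [ { _id: 'mark', value: 4 }, { _id: 'w3big', value: 1 } ]
```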
Use the find operator to view the mapReduce results:
>db.posts.mapReduce(
   function() { emit(this.user_name, 1); },
   function(key, values) { return Array.sum(values) },
   {
      query: { status: "active" },
      out: "post_total"
   }
).find()
The query returns the results shown below. Two users, mark and w3big, have published articles: mark has four and w3big has one:
{ "_id" : "mark", "value" : 4 }
{ "_id" : "w3big", "value" : 1 }
In a similar manner, MapReduce can be used to build large, complex aggregation queries.
The map and reduce functions are written in JavaScript, which makes MapReduce very flexible and powerful.