Problem description
I have multiple documents with this schema, one document per product per day:
{
    _id:{},
    app_id:'DHJFK67JDSJjdasj909',
    date:'2014-08-07',
    event_count:32423,
    event_count_per_type: {
        0:322,
        10:4234,
        20:653,
        30:7562
    }
}
I would like to get the sum of each event_type for a particular date range.
This is the output I am looking for, where each event type has been summed across all the documents. The keys of event_count_per_type can be anything, so I need something that can loop through each of them rather than having to name them explicitly.
{
    app_id:'DHJFK67JDSJjdasj909',
    event_count:324236456,
    event_count_per_type: {
        0:34234222,
        10:242354,
        20:456476,
        30:56756
    }
}
I have tried several queries. This is the best I have so far, but the sub-document values are not summed:
db.events.aggregate(
    {
        $match: {app_id:'DHJFK67JDSJjdasj909'}
    },
    {
        $group: {
            _id: {
                app_id:'$app_id',
            },
            event_count: {$sum:'$event_count'},
            event_count_per_type: {$sum:'$event_count_per_type'}
        }
    },
    {
        $project: {
            _id:0,
            app_id:'$_id.app_id',
            event_count:1,
            event_count_per_type:1
        }
    }
)
The output I am seeing is a value of 0 for the event_count_per_type key instead of an object. I could modify the schema so that the keys are at the top level of the document, but I would still need an entry in the group statement for each key, which I cannot do because I do not know what the key names will be.
Any help would be appreciated. I am willing to change my schema if need be, and also to try mapReduce (although from the documentation it seems like the performance is bad).
Recommended answer
As stated, processing documents like this is not possible with the aggregation framework unless you actually supply all of the keys, such as:
db.events.aggregate([
    { "$group": {
        "_id": "$app_id",
        "event_count": { "$sum": "$event_count" },
        "0": { "$sum": "$event_count_per_type.0" },
        "10": { "$sum": "$event_count_per_type.10" },
        "20": { "$sum": "$event_count_per_type.20" },
        "30": { "$sum": "$event_count_per_type.30" }
    }}
])
But you do of course have to explicitly specify every key you wish to work on. This is true of both the aggregation framework and general query operations in MongoDB: to access elements notated in this "sub-document" form, you need to specify the "exact path" to the element in order to do anything with it.
The aggregation framework and general queries have no concept of "traversal", which means they cannot process "each key" of a document. Doing that requires a language construct that these interfaces do not provide.
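If the set of keys can be listed by the client ahead of time (for example from application configuration), the explicit stage does not have to be typed by hand. The following is only a sketch under that assumption: `buildGroupStage` and the `typeKeys` list are hypothetical names, not part of MongoDB, and the server still receives one accumulator with an exact path per key.

```javascript
// Hypothetical helper: builds the explicit $group stage from a list of
// keys that the client already knows about.
function buildGroupStage(typeKeys) {
    var group = {
        "_id": "$app_id",
        "event_count": { "$sum": "$event_count" }
    };
    typeKeys.forEach(function (key) {
        // Each key still gets its own accumulator and its own exact path.
        group[key] = { "$sum": "$event_count_per_type." + key };
    });
    return { "$group": group };
}

// Assumption: the key list comes from the application, not from the server.
var stage = buildGroupStage(["0", "10", "20", "30"]);
console.log(JSON.stringify(stage, null, 2));
```

The resulting object could then be passed as a stage to db.events.aggregate([...]). This does not remove the limitation described above; it only moves the key list into client code.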
Generally speaking though, using a "key name" as a data point, where its name actually represents a "value", is a bit of an "anti-pattern". A better way to model this is to use an array and represent your "type" as a value in its own right:
{
    "app_id": "DHJFK67JDSJjdasj909",
    "date": ISODate("2014-08-07T00:00:00.000Z"),
    "event_count": 32423,
    "events": [
        { "type": 0, "value": 322 },
        { "type": 10, "value": 4234 },
        { "type": 20, "value": 653 },
        { "type": 30, "value": 7562 }
    ]
}
Also note that the "date" is now a proper date object rather than a string, which is good practice as well. This sort of data is easy to process with the aggregation framework:
db.events.aggregate([
    { "$unwind": "$events" },
    { "$group": {
        "_id": {
            "app_id": "$app_id",
            "type": "$events.type"
        },
        "event_count": { "$sum": "$event_count" },
        "value": { "$sum": "$events.value" }
    }},
    { "$group": {
        "_id": "$_id.app_id",
        // Caution: $unwind repeats each document's event_count once per array
        // element, so this total counts it once per type.
        "event_count": { "$sum": "$event_count" },
        "events": { "$push": { "type": "$_id.type", "value": "$value" } }
    }}
])
That shows a two-stage grouping that first gets the totals per "type" without specifying each "key", since you no longer have to, and then returns a single document per "app_id" with the results in an array, as they were originally stored. This data form is generally much more flexible for looking at certain "types" or even the "values" within a certain range.
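If you do restructure, existing documents have to be converted once. The sketch below shows one way the keyed sub-document could be turned into the array form in plain JavaScript. `toArrayForm` is a hypothetical helper, and it assumes that "date" is a plain YYYY-MM-DD string that should become UTC midnight and that numeric-looking keys should become numbers.

```javascript
// Hypothetical helper: converts one document from the keyed sub-document
// layout to the array layout.
function toArrayForm(doc) {
    var events = Object.keys(doc.event_count_per_type).map(function (key) {
        return {
            // Numeric-looking keys become numbers; anything else stays a string.
            "type": /^\d+$/.test(key) ? Number(key) : key,
            "value": doc.event_count_per_type[key]
        };
    });
    return {
        "app_id": doc.app_id,
        // Assumption: the stored string is a YYYY-MM-DD day, read as UTC midnight.
        "date": new Date(doc.date + "T00:00:00.000Z"),
        "event_count": doc.event_count,
        "events": events
    };
}

var converted = toArrayForm({
    app_id: "DHJFK67JDSJjdasj909",
    date: "2014-08-07",
    event_count: 32423,
    event_count_per_type: { 0: 322, 10: 4234, 20: 653, 30: 7562 }
});
console.log(JSON.stringify(converted, null, 2));
```

How the converted documents are written back (in place, or into a new collection) is left open here, since it depends on the deployment.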
Where you cannot change the structure, your only option is mapReduce. This allows you to "code" the traversal of the keys, but since it requires JavaScript interpretation and execution, it is not as fast as the aggregation framework:
db.events.mapReduce(
    function() {
        emit(
            this.app_id,
            {
                "event_count": this.event_count,
                "event_count_per_type": this.event_count_per_type
            }
        );
    },
    function(key, values) {
        var reduced = { "event_count": 0, "event_count_per_type": {} };
        values.forEach(function(value) {
            // Visit every key of the sub-document and add its count to the running total
            for ( var k in value.event_count_per_type ) {
                if ( !reduced.event_count_per_type.hasOwnProperty(k) )
                    reduced.event_count_per_type[k] = 0;
                reduced.event_count_per_type[k] += value.event_count_per_type[k];
            }
            reduced.event_count += value.event_count;
        });
        return reduced;
    },
    {
        "out": { "inline": 1 }
    }
)
That will essentially traverse and combine the "keys" and sum up the values for each one found.
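The key traversal can be checked without a server by running the same merging logic over a plain array. This is only a sketch: `sumEvents` and the sample values are made up for illustration. It also checks that feeding an earlier result back in gives the same totals, which matters because MongoDB may call a reduce function more than once for the same key.

```javascript
// Same merging logic as a reduce function, applied to an in-memory array.
function sumEvents(values) {
    var reduced = { "event_count": 0, "event_count_per_type": {} };
    values.forEach(function (value) {
        Object.keys(value.event_count_per_type).forEach(function (k) {
            if (!reduced.event_count_per_type.hasOwnProperty(k))
                reduced.event_count_per_type[k] = 0;
            reduced.event_count_per_type[k] += value.event_count_per_type[k];
        });
        reduced.event_count += value.event_count;
    });
    return reduced;
}

// Made-up sample values with partly overlapping keys.
var sample = [
    { "event_count": 10, "event_count_per_type": { "0": 4, "10": 6 } },
    { "event_count": 7, "event_count_per_type": { "10": 2, "20": 5 } }
];

var total = sumEvents(sample);
// Re-reducing a partial result together with the remaining value.
var rereduced = sumEvents([sumEvents([sample[0]]), sample[1]]);

console.log(JSON.stringify(total));
console.log(JSON.stringify(rereduced));
```

Both lines print the same totals, because the output of the function has the same shape as its inputs.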
So your options are:
- Change the structure and use standard queries and aggregation.
- Keep the structure as it is, which requires JavaScript processing and mapReduce.
It depends on your actual needs, but in most cases restructuring yields benefits.