Problem description
I have a system that holds a large amount of data. The database is SQL Server. One of the tables has around 300,000 rows, and there are quite a few tables of this size. This table is updated regularly; we call it the "transactional database", since it is where the transactions happen.
Now we need to implement reporting functionality. Some of the architects are proposing a separate database that is a copy of this one plus some additional tables for reporting. They propose this because they do not want to disrupt the transactional database. To make it work, data has to be moved to the reporting database frequently. My question is: is a second database really required for this purpose? Can we use the transactional database itself for reporting? Moving the data to a different database introduces latency, which would not be the case if the transactional database were used for reporting directly. I would appreciate some expert advice.
Recommended answer
You need to do some research into ETL, data warehousing and reporting databases, as I think your architects may be addressing this in a good way. Since you don't give details of the actual reports, I'll try to answer the general case.
(Disclaimer: I work in this field and we have products geared to this.)
Transactional databases are optimised for a good balance between reads, updates and inserts, and their indexes and table normalisation are designed with that balance in mind.
Reporting databases are optimised for read access above all else. This means that the "normal" normalisation rules one would apply to a transactional database don't necessarily apply. In fact, a high degree of denormalisation may be in place to make the report queries far more efficient and simpler to manage.
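To make the denormalisation point concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The schema (customers, orders, report_sales) and the sample rows are hypothetical and not taken from the question; the idea is only that the reporting table stores the join result flat, so the report query needs no joins.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalised transactional tables.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        order_date TEXT,
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Acme', 'North'), (2, 'Globex', 'South');
    INSERT INTO orders VALUES
        (1, 1, '2024-01-05', 100.0),
        (2, 1, '2024-01-20', 50.0),
        (3, 2, '2024-02-02', 75.0);

    -- Denormalised reporting table: customer attributes are copied onto
    -- every order row and the month is precomputed.
    CREATE TABLE report_sales AS
    SELECT o.id AS order_id,
           c.name AS customer_name,
           c.region AS region,
           substr(o.order_date, 1, 7) AS order_month,
           o.amount AS amount
    FROM orders o
    JOIN customers c ON c.id = o.customer_id;
""")

# The report reads one flat table; no joins at query time.
monthly_by_region = conn.execute("""
    SELECT region, order_month, SUM(amount)
    FROM report_sales
    GROUP BY region, order_month
    ORDER BY region, order_month
""").fetchall()
print(monthly_by_region)
```

The trade-off is duplicated data (the region is repeated on every order row), which is acceptable in a read-only reporting copy but would be awkward to keep consistent in the transactional tables.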
Running complex queries on a transactional database, especially aggregations over extended data ranges such as historical time frames, may degrade performance to the point where the key users of the database, the ones generating the transactions, are negatively affected.
Though a reporting database may not be strictly required in your situation, you may find that it's simpler to keep the two use cases separate.
Your concern about data latency is a real one. It can only be answered by the business users who will consume the reports. Often people say "we want real-time info" when in fact many, if not all, of their requirements can be met with non-real-time data. Only they can say what degree of data staleness is acceptable.
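The staleness window can be illustrated with a small sketch of an incremental copy job. This is an assumption-laden toy, not a recommendation of a specific mechanism: it uses two in-memory SQLite databases in place of the transactional and reporting servers, a hypothetical orders table, and a "highest id copied so far" watermark that only picks up new inserts (updates and deletes would need something more, such as a modified-at column or change tracking).

```python
import sqlite3

# Two in-memory databases stand in for the transactional and reporting
# servers. Table and function names are hypothetical.
oltp = sqlite3.connect(":memory:")
reporting = sqlite3.connect(":memory:")

oltp.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
reporting.execute("CREATE TABLE orders_copy (id INTEGER PRIMARY KEY, amount REAL)")


def sync_new_rows():
    """Copy rows whose id is above the highest id already in reporting.

    Returns the number of rows copied. Handles inserts only.
    """
    (watermark,) = reporting.execute(
        "SELECT COALESCE(MAX(id), 0) FROM orders_copy"
    ).fetchone()
    rows = oltp.execute(
        "SELECT id, amount FROM orders WHERE id > ? ORDER BY id", (watermark,)
    ).fetchall()
    reporting.executemany("INSERT INTO orders_copy VALUES (?, ?)", rows)
    reporting.commit()
    return len(rows)


def reporting_row_count():
    return reporting.execute("SELECT COUNT(*) FROM orders_copy").fetchone()[0]


oltp.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
first_batch = sync_new_rows()

# A new transaction arrives after the last sync: reports are stale until
# the next run of the job.
oltp.execute("INSERT INTO orders VALUES (3, 30.0)")
stale_count = reporting_row_count()

second_batch = sync_new_rows()
fresh_count = reporting_row_count()
print(first_batch, stale_count, second_batch, fresh_count)
```

How often such a job runs is exactly the "acceptable staleness" question for the business users: the interval between runs is the worst-case age of the data in a report.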
In fact, I'd suggest that you take your research slightly further and look at multidimensional cubes for your reporting needs, as opposed to just reporting databases. They are designed to abstract your reporting concerns to a whole new level.
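As a rough illustration of what a cube gives you, the sketch below precomputes a total for every combination of dimensions, so that a report becomes a lookup rather than a scan. The facts, the two dimensions (region, month) and the dictionary-based layout are all made up for this example; a real OLAP engine stores and queries cubes very differently.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows with two dimensions and one measure.
facts = [
    {"region": "North", "month": "2024-01", "amount": 100.0},
    {"region": "North", "month": "2024-02", "amount": 50.0},
    {"region": "South", "month": "2024-01", "amount": 75.0},
]
dimensions = ("region", "month")

# For each fact, add its amount to every subset of its dimension values:
# (), (region), (month), (region, month).
cube = defaultdict(float)
for row in facts:
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            key = tuple((d, row[d]) for d in dims)
            cube[key] += row["amount"]

grand_total = cube[()]
north_total = cube[(("region", "North"),)]
january_total = cube[(("month", "2024-01"),)]
north_january = cube[(("region", "North"), ("month", "2024-01"))]
print(grand_total, north_total, january_total, north_january)
```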