Question
Having at least one virtual method in a C++ class (or any of its parent classes) means that the class will have a virtual table, and every instance will have a virtual pointer.
So the memory cost is quite clear. The most important part is the per-instance cost, especially if the instances are small: if they are just meant to contain an integer, for example, having a virtual pointer in every instance might double the size of the instances. As for the memory used by the virtual tables themselves, I guess it is usually negligible compared to the space taken by the actual method code.
This brings me to my question: is there a measurable performance cost (i.e. speed impact) for making a method virtual? There will be a lookup in the virtual table at runtime, upon every method call, so if there are very frequent calls to this method, and if this method is very short, then there might be a measurable performance hit? I guess it depends on the platform, but has anyone run some benchmarks?
The reason I am asking is that I came across a bug that turned out to be caused by a programmer forgetting to declare a method virtual. This is not the first time I have seen this kind of mistake. And I thought: why do we add the virtual keyword when it is needed, instead of removing it when we are absolutely sure that it is not needed? If the performance cost is low, I think I will simply recommend the following in my team: make every method virtual by default, including the destructor, in every class, and only remove it when you need to. Does that sound crazy to you?
Recommended answer
I ran some timings on a 3 GHz in-order PowerPC processor. On that architecture, a virtual function call takes 7 nanoseconds longer than a direct (non-virtual) function call.
So the cost is not really worth worrying about unless the function is something like a trivial Get()/Set() accessor, for which anything other than inlining is kind of wasteful. A 7ns overhead on a function that inlines to 0.5ns is severe; a 7ns overhead on a function that takes 500ms to execute is meaningless.
The big cost of virtual functions isn't really the lookup of a function pointer in the vtable (that's usually just a single cycle), but that the indirect jump usually cannot be branch-predicted. This can cause a large pipeline bubble, because the processor cannot fetch any instructions until the indirect jump (the call through the function pointer) has retired and a new instruction pointer has been computed. So the cost of a virtual function call is much bigger than it might seem from looking at the assembly... but still only 7 nanoseconds.
Andrew, Not Sure, and others also raise the very good point that a virtual function call may cause an instruction cache miss: if you jump to a code address that is not in cache then the whole program comes to a dead halt while the instructions are fetched from main memory. This is always a significant stall: on Xenon, about 650 cycles (by my tests).
However, this isn't a problem specific to virtual functions, because even a direct function call will cause a miss if you jump to instructions that aren't in cache. What matters is whether the function has been run recently (making it more likely to be in cache), and whether your architecture can predict static (not virtual) branches and fetch those instructions into cache ahead of time. My PPC does not, but maybe Intel's most recent hardware does.
My timings control for the influence of icache misses on execution (deliberately, since I was trying to examine the CPU pipeline in isolation), so they discount that cost.