Saturday, May 26, 2007

GCC Visibility and Software Optimization

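In an earlier blog post, "Slimming Down the Qt Library", I mentioned GCC's C++ visibility support. The official wiki already describes it quite clearly, so I quote it here: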

"Why is the new C++ visibility support so useful?

Put simply, it hides most of the ELF symbols which would have previously (and unnecessarily) been public. This means:

  • It very substantially improves load times of your DSO (Dynamic Shared Object). For example, a huge C++ template-based library which was tested (the TnFOX Boost.Python bindings library) now loads in eight seconds rather than over six minutes!

  • It lets the optimiser produce better code. PLT indirections (when a function call or variable access must be looked up via the Global Offset Table such as in PIC code) can be completely avoided, thus substantially avoiding pipeline stalls on modern processors and thus much faster code. Furthermore when most of the symbols are bound locally, they can be safely elided (removed) completely through the entire DSO. This gives greater latitude especially to the inliner which no longer needs to keep an entry point around "just in case".

  • It reduces the size of your DSO by 5-20%. ELF's exported symbol table format is quite a space hog, giving the complete mangled symbol name which with heavy template usage can average around 1000 bytes. C++ templates spew out a huge amount of symbols and a typical C++ library can easily surpass 30,000 symbols which is around 5-6Mb! Therefore if you cut out the 60-80% of unnecessary symbols, your DSO can be megabytes smaller!

  • Much lower chance of symbol collision. The old woe of two libraries internally using the same symbol for different things is finally behind us with this patch. Hallelujah! "


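In fact, the same approach applies to many projects written in C, and the results are quite good. Embedded development often means keeping a tight rein on program size, so let's look at a concrete case. Many projects, including the Nokia 770/800 and OpenMoko, use the X Window System. The server-side implementation is KDrive. The client side only has to speak the X Protocol and is not tied to any particular language or implementation, but in practice we usually go through libX11. As I have mentioned in earlier talks and blog posts, X itself is actually quite efficient; the problem lies in the complex software built around it, which leaves plenty of room for improvement, and libX11 is one example.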

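I just did some hacking and found that simply using GCC's C++ visibility support to hide the non-public API/ELF symbols significantly reduces the library's size and speeds up application loading.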
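To make the idea concrete, here is a minimal sketch of how a C library can mark which symbols are exported. It is not taken from libX11-visibility.patch; the macro names `DEMO_EXPORT` and `DEMO_HIDDEN` and both functions are hypothetical, chosen only for illustration. It assumes GCC 4.0 or later (or a compatible compiler) on an ELF platform.

```c
/* Hypothetical example: these names do not come from libX11 or its patch. */

#if defined(__GNUC__) && __GNUC__ >= 4
#  define DEMO_EXPORT __attribute__((visibility("default")))
#  define DEMO_HIDDEN __attribute__((visibility("hidden")))
#else
/* Fallback for compilers without the attribute: everything stays public. */
#  define DEMO_EXPORT
#  define DEMO_HIDDEN
#endif

/*
 * Internal helper. Other source files inside the same shared object can
 * still call it, but it is left out of the dynamic symbol table, so
 * callers inside the DSO bind to it directly.
 */
DEMO_HIDDEN int demo_internal_scale(int value)
{
    return value * 2;
}

/* Public entry point: the only symbol this file is meant to export. */
DEMO_EXPORT int demo_public_api(int value)
{
    return demo_internal_scale(value) + 1;
}
```

Building with something like `gcc -shared -fPIC -fvisibility=hidden demo.c -o libdemo.so` makes hidden the default, so only functions explicitly marked `DEMO_EXPORT` are exported. Listing the dynamic symbols with `nm -D libdemo.so` should then show `demo_public_api` but not `demo_internal_scale`.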

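I tested on Ubuntu 7.04 with the same libX11 package: I applied my change, libX11-visibility.patch, rebuilt, and compared the two stripped libraries:
  • /usr/lib/libX11.so.6.2.0 (original) - 964K
  • dist/usr/lib/libX11.so.6.2.0 (patched) - 839K
That is a considerable difference, roughly 13% smaller. Next, compare what the size command reports for each: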
   text    data     bss     dec     hex filename
 966896   14496    1596  982988   effcc /usr/lib/libX11.so.6.2.0 (original)

   text    data     bss     dec     hex filename
 827962   11336    1224  840522   cd34a dist/usr/lib/libX11.so.6.2.0 (patched)
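Going by the wiki's description of DSOs, the significance here is not only the smaller size. Hiding symbols also lets the code optimizer be more aggressive and, most importantly, speeds up symbol resolution and avoids potential symbol collisions. This is a very simple change, yet its effect is quite significant, and it also shows that the Gtk+/GNOME stack as a whole still has room for optimization.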

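2 comments:

希望 阳光 幸福 said...

A newbie here, come to learn from you. Haha, nice to meet you.

I found your blog through the links on eyes' blogroll.
Nice to meet you.

I've already clicked your ads for you.

If you are willing and have the time, let's keep visiting each other's blogs every day.

Thanks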

jserv said...

Hello,

Feedback is welcome. Please also feel free to use the discussion group to get in touch with us:
http://groups.google.com/group/orzlab

Regards,
-jserv