Crash capture for C++ programs on Windows

Background

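We plan to upgrade our servers from Windows Server 2008 to Windows Server 2012, so I needed to verify how well a program can capture its own crashes on the new system.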

Testing

Previously I used the method described here:

link

I pay particular attention to the stack security checks (/GS), because with code like the following,

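#include <stdio.h>

void xxx()
{
    char szchar[8];
    // Deliberate bug: the source string is far longer than the 8-byte
    // buffer, so sprintf overruns the stack and trips the /GS check.
    sprintf(szchar, "%s", "1111111111122222");
}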

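the program crashes without producing a dump (a fair share of our crashes come from exactly this), which makes diagnosis much harder. That is why I watched this capability closely when switching operating systems. Unfortunately, VS2012 does not support the setup described above.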
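For comparison, here is a minimal sketch of the same copy with the write bounded by the destination size, so the stack cookie is never disturbed. The helper name copy_bounded is hypothetical and does not come from the original post; it only illustrates the standard snprintf truncation behaviour.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper (not from the original post): performs the same
// formatted copy as xxx(), but never writes past the 8-byte buffer.
// Returns the length the full string would have needed, so the caller
// can detect truncation by comparing it with the buffer size.
int copy_bounded(char (&dst)[8], const char* src)
{
    assert(src != nullptr);
    // snprintf writes at most sizeof(dst) - 1 characters plus the
    // terminating '\0', regardless of how long src is.
    return std::snprintf(dst, sizeof(dst), "%s", src);
}
```

A return value of 8 or more means the text was truncated. The buffer still holds a valid, terminated string, and no /GS failure is raised.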

At first I assumed that hooking the API was the way to go, so I googled a fair number of articles and eventually found a workable method:

link

The API hook itself turned out to work, but the exception triggered by the CRT /GS check still could not be caught.
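For context, the in-process capture that such a hook is meant to protect usually looks something like the sketch below: a top-level filter registered with SetUnhandledExceptionFilter that writes a minidump with MiniDumpWriteDump. This is only an illustration under assumptions. The names WriteDumpFilter, install_dump_filter and the dump path crash.dmp are hypothetical, and the code in the linked article may differ.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>
#pragma comment(lib, "dbghelp.lib")

// Hypothetical dump location chosen for this sketch.
static const wchar_t* kDumpPath = L"crash.dmp";

// Top-level filter: writes a minidump describing the faulting thread,
// then lets the process terminate.
static LONG WINAPI WriteDumpFilter(EXCEPTION_POINTERS* info)
{
    HANDLE file = CreateFileW(kDumpPath, GENERIC_WRITE, 0, nullptr,
                              CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file != INVALID_HANDLE_VALUE) {
        MINIDUMP_EXCEPTION_INFORMATION mei;
        mei.ThreadId = GetCurrentThreadId();
        mei.ExceptionPointers = info;
        mei.ClientPointers = FALSE;
        MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(), file,
                          MiniDumpNormal, &mei, nullptr, nullptr);
        CloseHandle(file);
    }
    return EXCEPTION_EXECUTE_HANDLER;
}
#endif

// Registers the filter. Returns true on Windows, where a filter was
// installed, and false on any other platform, where this is a no-op.
bool install_dump_filter()
{
#ifdef _WIN32
    SetUnhandledExceptionFilter(WriteDumpFilter);
    return true;
#else
    return false;
#endif
}
```

Calling install_dump_filter() once at startup is enough for ordinary unhandled exceptions. The hook discussed here exists because the CRT resets the filter with SetUnhandledExceptionFilter(NULL) before reporting a /GS failure, which would otherwise discard a filter installed this way.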

While searching, I also came across this link:
link

The approach in that article can catch all of the C++ exceptions, and it also explains how CRT exceptions behave:

Buffer Security Checks
By default, you have the /GS (Buffer Security Check) compiler flag enabled that forces the compiler to inject code that would check for buffer overruns. A buffer overrun is a situation when a large block of data is written to a small buffer.

Note: In Visual C++ .NET (CRT 7.1), you can use the _set_security_error_handler() function that CRT calls when a buffer overrun is detected. However, this function is deprecated in the later versions of CRT.

Since CRT 8.0, you can't intercept the buffer overrun errors in your code. When a buffer overrun is detected, CRT invokes Dr. Watson directly instead of calling the unhandled exception filter. This is done because of security reasons and Microsoft doesn't plan to change this behavior. For additional info, please see these links:

https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=101337
http://blog.kalmbachnet.de/?postid=75

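However, on Windows 8.1 and Windows 10 the stack exception described above still cannot be caught. While reading the CRT source code I also came across this comment: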

// Set up a fake exception, and report it via UnhandledExceptionFilter.
// We can't raise a true exception because the stack (and therefore
// exception handling) can't be trusted. The exception should appear as
// if it originated after the call to __report_securityfailure, so it
// is attributed to the function where the violation occurred.
//
// We assume that the immediate caller of __report_securityfailure is
// the function where the security violation occurred. Note that the
// compiler may elect to emit a jump to this routine instead of a call,
// in which case we will not be able to blame the correct function.
__declspec(noreturn) void __cdecl __raise_securityfailure(
    PEXCEPTION_POINTERS const exception_pointers
    )
{
#ifdef _VCRT_BUILD
    DebuggerWasPresent = IsDebuggerPresent();
    _CRT_DEBUGGER_HOOK(_CRT_DEBUGGER_GSFAILURE);
#endif // _VCRT_BUILD

    SetUnhandledExceptionFilter(NULL);
    UnhandledExceptionFilter(exception_pointers);

#ifdef _VCRT_BUILD
    // If we make it back from Watson, then the user may have asked to debug
    // the app. If we weren't under a debugger before invoking Watson,
    // re-signal the VS CRT debugger hook, so a newly attached debugger gets
    // a chance to break into the process.
    if (!DebuggerWasPresent)
    {
        _CRT_DEBUGGER_HOOK(_CRT_DEBUGGER_GSFAILURE);
    }
#endif // _VCRT_BUILD

    TerminateProcess(GetCurrentProcess(), STATUS_SECURITY_CHECK_FAILURE);
}

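Stepping through the code in a debugger, execution does reach SetUnhandledExceptionFilter, yet it never enters my hooked API, which is odd. I could not get any further with this myself. One thing to try is hooking KiUserExceptionDispatcher and UnhandledExceptionFilter.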
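One possible explanation, which the original investigation does not confirm, is the fast-fail mechanism introduced with Windows 8. As far as I know, the CRT versions shipped with VS2012 and later check IsProcessorFeaturePresent(PF_FASTFAIL_AVAILABLE) when a /GS failure is reported and, if it is available, terminate through __fastfail, which does not go through any user-mode exception filter. The sketch below only probes whether that feature is reported on the current machine. The function name fastfail_available is hypothetical.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#ifndef PF_FASTFAIL_AVAILABLE
// Defined by the Windows 8 SDK and later; older SDK headers lack it.
#define PF_FASTFAIL_AVAILABLE 23
#endif
#endif

// Returns true when the OS reports fast-fail support. On such systems a
// /GS failure is expected (under the assumption stated above) to bypass
// in-process handlers entirely. Always returns false off Windows.
bool fastfail_available()
{
#ifdef _WIN32
    return IsProcessorFeaturePresent(PF_FASTFAIL_AVAILABLE) != FALSE;
#else
    return false;
#endif
}
```

If this returns true on the Windows 8.1 and Windows 10 test machines but false on Windows 7, that would be consistent with the hook working only on the older system.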

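I also tried capturing exceptions with Google Breakpad.
It supports both in-process and out-of-process exception capture.

Breakpad's exception capture cannot catch the stack exception either.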

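I verified each approach against the exception types listed in the link from the second method:
link
The CCrashHandler from that link has the best capture coverage and is the one I recommend. Google Breakpad can capture a crashing process in client/server mode, and hooking the API can capture the CRT exceptions on Windows 7.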

Windows also has its own AutoDebug (AeDebug) feature. Edit the registry key "HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Windows NT\CurrentVersion\AeDebug" and set:
Auto=1

Debugger=
"C:\Program Files (x86)\Debugging Tools for Windows (x86)\ntsd.exe" -p %ld -e %ld -g -c ".dump /f /u d:\1.dmp ;q "

With this in place, ntsd automatically saves a dump whenever a process crashes.

Drawback: the dump does not tell you the name of the process that crashed.
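To make the Debugger value above easier to read, here is a small sketch that assembles it from its parts. The function name build_aedebug_command is hypothetical. As I understand the debugger documentation, -p attaches to the process ID that Windows substitutes for the first %ld, -e passes the event handle substituted for the second %ld, -g skips the initial breakpoint, and -c runs the quoted commands. The /u option of .dump appends the date, time and PID to the file name, so successive dumps stay distinct by PID even though the process name is not included.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the AeDebug "Debugger" string shown above
// from the path to ntsd.exe and the base path of the dump file.
// The two %ld placeholders are left untouched; Windows fills them in
// with the crashing process ID and an event handle at launch time.
std::string build_aedebug_command(const std::string& ntsd_path,
                                  const std::string& dump_path)
{
    return "\"" + ntsd_path + "\" -p %ld -e %ld -g -c \".dump /f /u " +
           dump_path + " ;q \"";
}
```

Passing the ntsd.exe path and d:\1.dmp from the example reproduces the same Debugger value, which can then be written to the registry by whatever deployment tooling is in use.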

Summary

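  1. For in-process capture, I recommend Exception-Handling + API hook (it captures everything on Windows 7 and earlier; on Windows 10 there is only one case it cannot capture).

  2. On Windows 8.1 and Windows 10, I recommend Exception-Handling + AutoDebug; see Exception Capture for reference.

  3. Google Breakpad is a viable alternative; its advantage is that it is cross-platform.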

Addendum

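Recently another crash occurred in production. This time it happened on Windows Server 2008 with the API hook already in place, and in testing the Exception-Handling approach could not produce a dump either. In the end it was AutoDebug that saved the dump automatically, and analysis showed that a double free had corrupted the heap.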

However, when I simulate a double free in a demo program, no exception is raised.

In the end, AutoDebug remains the most reliable way to capture dumps.