HDD Firmware Hacking Part 1
Original article: https://icode4.coffee/?p=1465

- HDD Firmware Hacking Part 1
- Background
- The Test Subjects
- Spinning Up
- Obtaining the Drive Firmware
- Western Digital
- Samsung PM871a
- Samsung HM020GI
- Flashing Modified Firmware
- DOWNLOAD MICROCODE Command
- Back Door Vendor Commands
- Physical Serial Interface
- The Western Digital SPI Flash
- Analyzing the Firmware
- You Ever Debug a Hard Drive Before?
- Vendor Specific Commands
- Into the Belly of the Beast
- Where the FSCK is this code?
- Patching the Firmware
- Task Failed Successfully
- Conclusion
Some time last year I was working on an exploit for the Xbox 360 console (which would later turn into the much anticipated softmod) and found myself in need of a way to modify the firmware for a HDD to try and exploit a race condition. This sent me down a rabbit hole of trying to modify the firmware for a few different brands of HDDs and SSDs I had on hand. In this series of blog posts I’ll cover all the work I did including: dumping and analyzing the firmware, live debugging a HDD via JTAG, modifying the drive firmware, and how I used AI to help with analysis and identifying an unknown MCU architecture.
This first post is going to focus on dumping, analyzing, and modifying HDD firmware. Everything in this post was done without the help of AI. In the next post I’ll cover how I used AI to do similar work on other HDDs/SSDs as well as using it to do black box reverse engineering on an unknown ISA, and giving Claude access to debug my hard drive.
Background
The bug I was trying to exploit was a race condition that occurs when the console is reading data from the HDD. I needed a certain amount of time between when the read request was issued and when the drive replied in order for my exploit to trigger successfully. At the time I didn’t quite understand all the variables at play and was having difficulty exploiting the race condition in the time it took the HDD to respond. One of my initial ideas was to modify the HDD firmware to introduce a delay of a few hundred milliseconds when a specific sector is read from the drive, which would give enough time for the exploit to trigger successfully.
Over the years I had read a few posts/articles about modifying HDD firmware but nothing I could pick up and run with. Regardless, I knew this concept wasn’t new and I just needed to find a drive that was easy to start messing with. At this point in time I just needed one HDD I could use to finish developing the Xbox 360 exploit and then I’d worry about trying to expand the firmware modifications to other makes and models. As it would later turn out I found other ways to dial in my race condition attack and ended up not needing to modify the HDD firmware at all.
The idea of modifying the firmware on a HDD/SSD is very interesting to me especially from attacker and pen-testing points of view. However, I’ve never cared to venture down this rabbit hole until now because embedded devices are typically very complex under the hood and massive time-sucks to reverse engineer. Do you know how a hard drive works? At a high level sure, discs spin at high speed, magnets pull data off them, but do you really understand how they work at a micro-controller level?
I had no idea how a hard drive worked internally but I knew I had found another bug where failure to exploit it was not an option I was willing to accept. If the only thing standing in my way of exploiting this bug was a hard drive then this hard drive was going down.
The Test Subjects
For this exploit I just needed any HDD or SSD I could easily obtain, modify, and reflash the firmware on. However, I was primarily focused on the brands of HDDs that were used for the Xbox 360 as anyone using the exploit would most likely already have one on hand. I also grabbed some Western Digital drives as I knew from some past endeavors that they have some backdoor vendor commands which could be used to get low level access to them. And lastly I grabbed a couple Samsung SSDs as I had a few of these on hand. Here are the brave test subjects that would (hopefully) survive whatever experiments I was about to do to them:

Here are the makes and models for anyone interested:
- Samsung HM020GI
- Hitachi HTS545032B9A300
- Western Digital WD3200BEVT
- Samsung PM871a
You may notice one of the drives has been thoroughly shamed and that’s due to some past grief where it was coincidentally used on a failing USB adapter and then on a computer with a failing SATA channel, and you can kinda see where this is going… The drive is in fact fully functional.
Spinning Up
The first thing I did was research these drive models online to see if I could find firmware dumps and any information from others who’ve been down this path before. I spent quite a bit of time reading through the HDD Guru forums and found a lot of information on Western Digital (WD) drives and the Hitachi drive. I also found a blog series by MalwareTech where he tries to modify the firmware on a HDD and one part in particular really resonated with me as I was doing this research:
Before i started hacking I’d decided to read other people’s research to get a good idea of where to start. Resourceful, right? Well it actually turns out that most of the research I’ve based mine on was either wrong or just doesn’t apply to this hard disk.
This was my exact experience: most of the information I found was in 15+ year old forum posts that were either incorrect or didn’t apply to the HDD models I had. However, there were lots of “pieces” of information I was gathering that together were starting to form a larger picture. My plan of attack for each drive was as follows:
- Obtain a firmware dump, either by finding the firmware image online or dumping it from the drive myself.
- Get the firmware loaded into IDA for analysis, which would also include working around any compression or encryption I encountered. If I couldn’t analyze the firmware there was basically no way I was gonna be able to make modifications to it.
- Find a way to flash back modified firmware. This could either be through manually programming a flash chip on the HDD logic board or using standard/backdoor commands. Not being able to write back modified firmware was an immediate show stopper for that drive.
- Analyze the firmware and try to find the code responsible for handling read requests. The command I’m interested in is the READ DMA EXT command, which is what the console uses for read requests. Somewhere in the firmware there’s likely some sort of command handler function table that would get used to handle the various ATA commands the drive supports. Finding this table would either lead me directly to the READ DMA EXT command handler, or give me a starting place to trace around and find it. This would likely be the hardest part of the process.
- Write some patches to introduce a delay of a few hundred milliseconds when a specific sector is read from the drive.
- Flash the now modified firmware back to the drive.
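One generic way to hunt for a command handler function table like the one mentioned in the plan is to scan the image for runs of consecutive 32-bit values that all point into the code region. The sketch below is only a heuristic under stated assumptions, not the method used later in this post: it assumes little-endian, 4-byte aligned pointers, and `FindPointerTables`, `PointerTable`, `codeBase`, `codeSize`, and `minEntries` are names made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: where a candidate table starts and how long it is.
struct PointerTable
{
    std::size_t offset;      // Byte offset of the first entry in the image.
    std::size_t entryCount;  // Number of consecutive in-range pointers.
};

// Scans for runs of at least minEntries consecutive 32-bit little-endian
// values that fall inside [codeBase, codeBase + codeSize).
std::vector<PointerTable> FindPointerTables(const std::vector<std::uint8_t>& image,
    std::uint32_t codeBase, std::uint32_t codeSize, std::size_t minEntries)
{
    std::vector<PointerTable> tables;
    std::size_t i = 0;

    while (i + 4 <= image.size())
    {
        std::size_t count = 0;
        std::size_t j = i;

        // Extend the run for as long as each value looks like a code pointer.
        while (j + 4 <= image.size())
        {
            const std::uint32_t value = static_cast<std::uint32_t>(image[j]) |
                (static_cast<std::uint32_t>(image[j + 1]) << 8) |
                (static_cast<std::uint32_t>(image[j + 2]) << 16) |
                (static_cast<std::uint32_t>(image[j + 3]) << 24);
            if (value < codeBase || value - codeBase >= codeSize)
                break;
            count++;
            j += 4;
        }

        if (count >= minEntries)
        {
            tables.push_back({ i, count });
            i = j;  // Skip past the table that was just recorded.
        }
        else
        {
            i += 4;
        }
    }

    return tables;
}
```

Any hits are only candidates: data tables and padding can look like pointers too, so each one still needs to be checked by hand in the disassembler.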
Obtaining the Drive Firmware
One of the things I found on the HDD Guru forums was a section where people uploaded firmware dumps of various HDDs obtained with a PC-3000. If you’re not familiar with PC-3000 it’s a professional grade data recovery tool that uses proprietary vendor commands to diagnose and repair HDDs, as well as dump firmware from them. I was able to find a firmware dump for the Western Digital (WD) drive on the forums, and while posting about this on twitter someone reached out and was able to use a PC-3000 they had access to and get a firmware dump of the Samsung HM020GI. I found the firmware for the Samsung PM871a SSD on the Lenovo website as part of a firmware update utility and this actually turned out to be a double whammy. Not only did I get the firmware image but by reverse engineering the update utility I could figure out the commands needed to flash new firmware to the drive. I never found the firmware for the Hitachi drive but I had enough to start with for now.
Western Digital
Starting with the WD drive, I found a few bits of information on the format of the firmware image on the HDD Guru forums, and after looking at it in a hex editor for a few minutes I was able to work out the following:
Structure of the firmware image
The format is very straightforward, essentially just a list of statically based executable/data sections in a flat file starting with the section headers. Additionally, each section header and data block has its own checksum (8-bit summation) to verify the data is valid/copied correctly. I wrote a quick IDA loader plugin so I could load the firmware image and start analyzing it, and that’s when I realized that all of the sections in this image except for the first one were compressed. One of the “pieces” of info I found on the forums was that the first section is a loader stub that’s used by the MCU bootloader to decompress and load the remaining sections into memory. However, the poster declined to mention what the compression algorithm was.
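For reference, an 8-bit summation checksum can be sketched as follows. This is a minimal illustration, not code lifted from the firmware or my loader: `Sum8` and `BlockChecksumMatches` are made-up names, and the comparison assumes the stored checksum byte is the plain sum of the block. Some formats instead store a value that makes the total wrap to zero, and which bytes are covered is also an assumption here.

```cpp
#include <cstddef>
#include <cstdint>

// Adds up every byte of a block, keeping only the low 8 bits of the total.
std::uint8_t Sum8(const std::uint8_t* data, std::size_t length)
{
    std::uint8_t sum = 0;
    for (std::size_t i = 0; i < length; i++)
        sum = static_cast<std::uint8_t>(sum + data[i]);
    return sum;
}

// ASSUMPTION: the stored checksum is the plain 8-bit sum of the block.
bool BlockChecksumMatches(const std::uint8_t* data, std::size_t length,
    std::uint8_t storedChecksum)
{
    return Sum8(data, length) == storedChecksum;
}
```

A checksum this weak only catches accidental corruption, so a patched section just needs its checksum byte recomputed before repacking.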
At first I ran one of the compressed blocks through some tools to try and identify what the compression algorithm was but this didn’t return any fruitful results. So I loaded the first section into IDA as ARM code and started reverse engineering how it worked. A lot of the MCUs powering HDDs/SSDs are ARM based, and many of them have multiple cores that are responsible for doing different things. Luckily this WD HDD only has one ARM core which makes it a bit easier to work with. After a few minutes of reverse engineering I was able to label most of the section loading loop and identify the function responsible for decompressing data:
Disassembly of section loading function
I spent a bit of time reverse engineering the decompression routine and eventually got a working reimplementation of it. The algorithm is LZHUF but there’s a couple changes made to it which is why I wasn’t able to detect it when I ran one of the compressed blocks through an identification utility. The N constant was changed from 2048 to 4096, and the run length calculation now subtracts THRESHOLD instead of adding it:
Example of modifications to LZHUF algorithm.
After updating my IDA loader script I was able to load the entire firmware image with all the sections at the correct base addresses. The WD firmware was now ready for analysis.
Samsung PM871a
Next up is the Samsung PM871a, for which I was able to find both the firmware and a firmware update utility on Lenovo’s website. This actually turned out to be a great strategy: search for firmware update utilities on OEM websites, which gets you the firmware plus an update utility that will:
- Decrypt/deobfuscate the firmware if it’s protected.
- Flash it to the drive.
The firmware for Samsung SSDs is typically encrypted/obfuscated in some way and the firmware update utilities will usually decrypt/deobfuscate the firmware before sending it to the drive for flashing. For the particular SSD I was working with, the firmware was obfuscated using a bit fiddling algorithm that I was able to reverse engineer out of the firmware update utility:
void DecodeFirmware(unsigned char* pBuffer, unsigned int Length)
{
    // Loop through the entire firmware buffer.
    for (unsigned int i = 0; i < Length; i++)
    {
        // Get the hi nibble for the current byte.
        unsigned char nibbleHi = (pBuffer[i] >> 4) & 0xF;

        // Do bit twiddling?
        if ((nibbleHi & 1) != 0)
            nibbleHi >>= 1;
        else
            nibbleHi = 0xF - (nibbleHi >> 1);

        // Mask in the new hi nibble value.
        pBuffer[i] = (pBuffer[i] & 0xF) | (nibbleHi << 4);
    }
}
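The decode step maps each odd hi nibble 2k+1 to k and each even hi nibble 2k to 0xF - k, so it is a one-to-one mapping and can be reversed. Here is a sketch of what the inverse might look like if a modified image ever had to be put back into the obfuscated file format. `EncodeFirmware` is a made-up name derived purely from the decoder above; it was not recovered from the update utility.

```cpp
// Hypothetical inverse of DecodeFirmware, derived from the decoder's mapping.
void EncodeFirmware(unsigned char* pBuffer, unsigned int Length)
{
    for (unsigned int i = 0; i < Length; i++)
    {
        // Get the decoded hi nibble for the current byte.
        unsigned char nibbleHi = (pBuffer[i] >> 4) & 0xF;

        if (nibbleHi < 8)
        {
            // Decoded values 0..7 came from odd nibbles (2k+1 -> k).
            nibbleHi = static_cast<unsigned char>((nibbleHi << 1) | 1);
        }
        else
        {
            // Decoded values 8..15 came from even nibbles (2k -> 0xF - k).
            nibbleHi = static_cast<unsigned char>((0xF - nibbleHi) << 1);
        }

        // Mask in the re-obfuscated hi nibble, leaving the low nibble alone.
        pBuffer[i] = static_cast<unsigned char>((pBuffer[i] & 0xF) | (nibbleHi << 4));
    }
}
```

Since the update utility deobfuscates the image before sending it to the drive, re-encoding would only matter when feeding a patched file back through that utility.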
This particular firmware update utility can be used for a little over 2 dozen different Samsung SSD models (as well as a few dozen different DVD drive models) so harvesting information from these utilities is extremely fruitful when it comes to scaling research endeavors outwards. Next I opened the deobfuscated firmware image in a hex editor to get an idea of what I was working with. The first thing I noticed was some suspicious data in the first few kilobytes of the file that looked like it could be some sort of cryptographic signature (meaning I might not be able to modify the firmware). However, after comparing two firmware files it appeared that other than a few small changes in data/code the only notable difference was some 28 byte run at the very beginning of the file:
Comparison of the two firmware files
While it wasn’t conclusive this was a good indication that the firmware file most likely wasn’t signed with some strong public-key crypto algorithm like RSA or ECDSA. It’s also worth mentioning that the two firmware files are actually the same version but for different form factors of the Samsung PM871a, one for the 2.5″ SATA model and one for the M.2 model. So despite there being a few changes throughout the file it’s still possible some portions of it are in fact signed. The 28 byte run is an odd length for a hash or signature, it could be SHA-224 or truncated SHA-256, but we can worry about that later.
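A comparison like this can also be done programmatically by listing every run of differing bytes between two images, which makes something like an isolated 28 byte run easy to spot. The sketch below is just one way to do it; `FindDiffRuns` and `DiffRun` are illustrative names, and it only compares the range both buffers have in common.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record describing one contiguous run of differing bytes.
struct DiffRun
{
    std::size_t offset;  // Offset of the first differing byte.
    std::size_t length;  // Number of consecutive differing bytes.
};

// Returns every run of bytes that differ between a and b, in file order.
std::vector<DiffRun> FindDiffRuns(const std::vector<std::uint8_t>& a,
    const std::vector<std::uint8_t>& b)
{
    std::vector<DiffRun> runs;
    const std::size_t common = std::min(a.size(), b.size());
    std::size_t i = 0;

    while (i < common)
    {
        // Skip over bytes that match.
        if (a[i] == b[i])
        {
            i++;
            continue;
        }

        // Found a difference, measure how long the run is.
        const std::size_t start = i;
        while (i < common && a[i] != b[i])
            i++;
        runs.push_back({ start, i - start });
    }

    return runs;
}
```

A handful of short runs scattered through the code plus one fixed-size run in the header is the pattern described above. A signature over the whole image would usually show up as a block that changes completely whenever anything else does.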
Next step was trying to identify where (if at all) the section headers were located. After loading the file into IDA as a binary file I could immediately tell the file was split into sections with different base addresses so we’ll need to find the section headers to get this loaded properly. The first 8 kilobytes of the file appear to be some meta data blocks and after that we can see this:
Hex dump of the suspected segment descriptors
The bytes in red are most definitely memory addresses for the code/data segments. How do I know? Because I’ve been staring into hex editors (the abyss) for over 20 years now and I’ve gotten incredibly good at picking apart binary file formats. Additionally, these addresses line up pretty nicely with the memory map for ARM Cortex-M3 cores:
ARM Cortex-M3 memory map
After a few more minutes of staring into the abyss I was able to work out that the offset and size of each section are in intervals of 16KB blocks, and after writing another IDA loader script I had the firmware fully loaded and ready for analysis.
Samsung HM020GI
Lastly was the Samsung HM020GI. Looking at the firmware dump for this drive in a hex editor I could clearly see plain text strings and what looked like machine code. However, despite my best attempts I could not find any architecture that this code would disassemble for. One thing that stuck out was that the entire file appears to be word flipped:
Byte flipped firmware data
This was starting to seem like it could be some extremely esoteric ISA or possibly even custom byte code that would get run through a VM baked into the MCU. For now I decided to put this HDD aside, but we’ll come back to it in part 2 so stay tuned!
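If “word flipped” here means the two bytes of every 16-bit word are swapped (an assumption on my part, based on how the plain text strings look in the dump), undoing it before trying different disassemblers is a one-liner. `SwapBytesIn16BitWords` is a made-up helper name for this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Swaps the two bytes of every 16-bit word in place.
// A trailing odd byte, if there is one, is left untouched.
void SwapBytesIn16BitWords(std::vector<std::uint8_t>& data)
{
    for (std::size_t i = 0; i + 1 < data.size(); i += 2)
        std::swap(data[i], data[i + 1]);
}
```

Running it twice restores the original buffer, so it is safe to experiment with on a copy of the dump.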
Flashing Modified Firmware
From this point on I’m only going to cover my work with the Western Digital HDD as this is the one I spent the most time on. It’s also not very interesting for you as a reader if I’m just repeating the same steps several times over for each drive I worked on. The other drives will make a comeback in part 2 where I’ll cover unique research I did on each one, but for now we’re going to set them aside.
There are 3 main ways to write new firmware to a drive:
- Using the DOWNLOAD MICROCODE ATA command. This is the most common way and is supported by most (if not all?) drives.
- Back door vendor commands, typically used as a means of repair/diagnostics or by drives that primarily use service area overlays as a means of patching/updating firmware.
- Through a serial interface exposed on the drive circuit board, typically used as a means of repair/diagnostics.
DOWNLOAD MICROCODE Command
All SATA HDDs and SSDs communicate with a host device using some version of the ATA specification, which outlines all the commands and responses used during communications. These include commands to read and write data, query and set drive information, etc. One such command is the DOWNLOAD MICROCODE command, which is used to upload new code/firmware to the drive. You’ll typically send the command along with some additional register values to indicate the firmware size, stream the new firmware to the drive (either in chunks or through a DMA transfer), and then the drive will potentially validate the firmware received and write it to non-volatile storage. If the firmware update was successful you can power cycle the drive and you’re done; if it failed then you may have a new paperweight. You can usually recover from this but it requires varying levels of hacking on the drive and is not something a normal user would be able to do.
Most (if not all) drives should support this command and this is how most modern drives do firmware updates. All those firmware update utilities you can find on OEM websites are just streaming the new firmware through this command to the drive.
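To make the register setup a bit more concrete, here is a rough sketch of how the task file for a single-shot DOWNLOAD MICROCODE transfer could be filled in. The opcode (0x92), the “download and save” subcommand (0x07), and the split of the 512-byte block count across the sector count and LBA low registers follow my reading of the ATA command set. `TaskFile` and `BuildDownloadMicrocode` are hypothetical names, and the 0xA0 device value is an assumption carried over from legacy ATA conventions. No vendor utility is known to do it exactly this way.

```cpp
#include <cstdint>

// Hypothetical, OS-independent stand-in for the ATA task file registers.
struct TaskFile
{
    std::uint8_t features;
    std::uint8_t sectorCount;
    std::uint8_t lbaLow;
    std::uint8_t lbaMid;
    std::uint8_t lbaHigh;
    std::uint8_t device;
    std::uint8_t command;
};

constexpr std::uint8_t kAtaCmdDownloadMicrocode = 0x92;
constexpr std::uint8_t kSubcmdDownloadAndSave = 0x07;

// Builds the registers for sending blockCount 512-byte blocks in one transfer.
TaskFile BuildDownloadMicrocode(std::uint16_t blockCount)
{
    TaskFile tf{};
    tf.command = kAtaCmdDownloadMicrocode;
    tf.features = kSubcmdDownloadAndSave;

    // Block count bits 7:0 go in sector count, bits 15:8 go in LBA low.
    tf.sectorCount = static_cast<std::uint8_t>(blockCount & 0xFF);
    tf.lbaLow = static_cast<std::uint8_t>(blockCount >> 8);

    // ASSUMPTION: legacy device register value with the obsolete bits set.
    tf.device = 0xA0;
    return tf;
}
```

On a real system these values would be placed into whatever passthrough structure the OS provides, with the firmware image following as the data-out payload.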
Back Door Vendor Commands
Many Western Digital HDDs never received full firmware updates and instead relied on writing new code to the overlay modules located in the service area of the platters. The service area is a special area on the platters that’s normally inaccessible and contains information such as drive configuration data (model/serial info, geometry data, etc), SMART status data, and updated bits of firmware code. These are typically referred to as “modules” or “overlays” and there’s dozens of them located in the service area. Western Digital drives in particular like to use some of these for updated bits of code that get loaded into memory when the drive boots up.
PC-3000 software showing service area modules for a WD HDD
To access these areas there are some back door commands you can issue to the drive that’ll let you read and sometimes write to these modules. Western Digital uses the SMART READ/WRITE LOG commands, which are normally used to read (and seldom write) SMART status values. These commands take a parameter called the log address, which is an 8-bit value that indicates which SMART data page you want to read/write. Many of these have predefined purposes per the ATA spec but there’s a range that’s “vendor defined” that can be used to access all sorts of repair and diagnostics commands as we’ll see later on:
ATA specification for log address values.
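As a small illustration of those ranges, the helper below sorts a log address into the buckets as I understand them from the ATA command set: 0x80 to 0x9F is host specific and 0xA0 to 0xDF is device vendor specific. The exact boundaries should be checked against the revision of the spec that applies to a given drive, and `ClassifyLogAddress` and `LogAddressClass` are made-up names.

```cpp
#include <cstdint>

// Hypothetical buckets for the 8-bit log address.
enum class LogAddressClass
{
    StandardOrReserved,    // Defined or reserved by the specification.
    HostSpecific,          // 0x80..0x9F: free for host use.
    DeviceVendorSpecific   // 0xA0..0xDF: meaning is up to the drive vendor.
};

// Sorts a log address into one of the buckets above.
LogAddressClass ClassifyLogAddress(std::uint8_t logAddress)
{
    if (logAddress >= 0x80 && logAddress <= 0x9F)
        return LogAddressClass::HostSpecific;
    if (logAddress >= 0xA0 && logAddress <= 0xDF)
        return LogAddressClass::DeviceVendorSpecific;
    return LogAddressClass::StandardOrReserved;
}
```

A log address such as 0xBE lands in the device vendor specific bucket, which is the kind of address a vendor can repurpose for its own repair and diagnostics commands.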
Physical Serial Interface
Many HDDs have a physical serial interface that’s exposed via 4 pins next to the SATA connector, which you’ve probably seen before. Those pins are for an RS232 serial port that you can connect to and issue repair/diagnostic commands to the drive. Each drive manufacturer/model will have different commands they support. Some of them have been documented online by other reverse engineers; for others you’ll need to actually dig into the firmware to figure them out. For now we’re going to ignore this serial port and come back to it in part 2.
HDD serial port connection
The Western Digital SPI Flash
My plan for reflashing the WD drive was to use the back door vendor commands to write my modified firmware image. I had already prepared a python script that could unpack, patch, and repack the firmware image, now I just needed to write a tool to write the image to the drive. However, my fear was that even if I can write new firmware successfully there’s a good chance I’m going to botch these patches on the first few attempts and could very well put the drive into a state where I couldn’t use the back door commands to reflash and recover. I was really going to need a robust recovery solution or else I could quickly brick a dozen or more drives before I make any real progress.
Luckily the WD drives have two locations where the primary firmware image is stored. The first is in internal MCU flash, which is what the WD drive I had was using. The second is a SPI flash chip that’s found on certain models, which overrides the internal MCU flash. Even though the model I had didn’t have the SPI flash chip populated, the pads for it still existed and could be used by soldering in the SPI flash chip and a couple resistors to signal to the MCU to boot from it instead of the internal flash.
Location of the SPI flash chip.
My plan was to use the SPI flash chip for testing modified firmware and if at any point I flashed something that prevented the drive from booting or being able to reflash new images, I could use an external programmer and reflash it in-circuit. I ordered a couple suitable SPI flash chips and then moved on to analyzing the firmware while I waited for them to arrive.
Analyzing the Firmware
The next step was going to be the hardest part of this endeavor, finding the code responsible for handling read requests. This firmware is very low level stuff, there’s no strings or useful hints I can derive information from, and it’s split between many different memory segments some of which exist outside of the firmware image. In order to find this bit of code I’m going to have to get very creative here.
You Ever Debug a Hard Drive Before?
One of the nice things about the WD drives is that most of them expose a JTAG connection in the form of an unpopulated 38-pin MICTOR connector on the board. This means I can easily solder up a few wires and get hardware level debugging access to the MCU running the HDD. Yes you heard me right, we’re going to debug a live HDD.
JTAG wires connected to the HDD circuit board.
Being able to debug the HDD will prove invaluable as I’ll be able to set breakpoints in code, inspect memory, register state, and step through execution while sending the drive commands from a PC. However, there’s a few pain points I’ll have to work through while doing this:
- The HDD needs to be connected to a PC so I can send it ATA commands and see if my breakpoints trigger. Unfortunately I don’t have a USB adapter that supports ATA passthrough so it has to be connected directly to the PC via SATA.
- If the drive doesn’t respond back within some timeout interval, Windows thinks the drive is MIA and all subsequent communications will fail until you reboot Windows. On some versions of Windows this will actually cause volmgr to BSOD the machine.
- The HDD is a bit temperamental when being debugged and sometimes you’ll put it into a state where it needs to be power cycled before it’ll function correctly.
This was the first time I ever messed with JTAG debugging so the entire process was a learning experience as I tried to find my way around. After a bit of trial and error getting OpenOCD set up with my FT232 I was able to get connected and break in:
OpenOCD connected to the HDD MCU.
FWIW I loosely based my TAP setup around what MalwareTech found in his experiments, but the MCU used on the HDD I’m working with is slightly different. I don’t know if it has multiple ARM cores as I was only ever able to get the first core identified. Regardless, it was time for the next step, which was writing a quick tool to send some commands to the drive.
Vendor Specific Commands
Earlier I briefly mentioned Western Digital drives had some backdoor commands you can access through the SMART READ/WRITE LOG ATA command. These are called vendor specific commands (VSCs) which can do things like read/write firmware, RAM, overlay modules, and other repair/diagnostics related things. The one that’s most interesting right now is the read RAM command. My plan of attack is to set a memory breakpoint on some address, say 0x41414141, then issue the read RAM command with address=0x41414141 which should trigger the breakpoint. From there I can see what function is handling this VSC command (and thus the SMART READ LOG ATA command), and walk up the call stack to hopefully find some common dispatcher that’ll lead me to the function that handles read requests.
To send one of these VSCs you simply set up an ATA passthrough request, which is a way to send a low level command directly to the hard drive. You fill out a structure with the values you want programmed into the ATA port registers and the HDD/SSD will attempt to perform that command. You can additionally send some extra data that’s some N sectors large. To issue a VSC we’ll set up the registers to perform a SMART WRITE LOG ATA command with the log page set to 0xBE (which is a vendor defined log page number and what WD drives use for the “backdoor”). Then we provide 1 sector worth of extra data that contains the VSC ID and any additional parameters it takes. For the read RAM command we’ll also provide the memory address and size we want read.
Illustration of an ATA passthrough command.
ATA passthrough 命令的插图
Here’s the C code for the function I wrote to send this VSC:
这是我为发送这个 VSC 编写的 C 代码函数:
bool SendVSCAccessKey(HANDLE hDrive, BYTE bVscCmd, bool bWriteAccess, DWORD dwAddress = 0, DWORD dwSize = 0)
{
    DWORD BytesRead = 0;
    DWORD BufferSize = sizeof(ATA_PASS_THROUGH_EX) + 512;
    BYTE abPassthroughData[sizeof(ATA_PASS_THROUGH_EX) + 512] = { 0 };

    // The passthrough header is followed directly by one sector of VSC data.
    ATA_PASS_THROUGH_EX* pAtaPassthrough = (ATA_PASS_THROUGH_EX*)abPassthroughData;
    VSC_COMMAND_DATA* pVscCommand = (VSC_COMMAND_DATA*)(pAtaPassthrough + 1);

    // Setup the passthrough data:
    pAtaPassthrough->Length = sizeof(ATA_PASS_THROUGH_EX);
    pAtaPassthrough->AtaFlags = ATA_FLAGS_DATA_OUT;
    pAtaPassthrough->TimeOutValue = 5;
    pAtaPassthrough->DataTransferLength = 512;
    pAtaPassthrough->DataBufferOffset = sizeof(ATA_PASS_THROUGH_EX);

    // Setup port registers for SMART WRITE LOG command:
    IDEREGS* pRegs = (IDEREGS*)pAtaPassthrough->CurrentTaskFile;
    pRegs->bCommandReg = ATA_OP_SMART;
    pRegs->bFeaturesReg = SMART_WRITE_LOG;
    pRegs->bSectorCountReg = 1;
    pRegs->bSectorNumberReg = 0xBE; // Special WD log address
    pRegs->bCylLowReg = 0x4F;
    pRegs->bCylHighReg = 0xC2;
    pRegs->bDriveHeadReg = 0xA0;

    // Setup the VSC command data:
    pVscCommand->CommandId = bVscCmd;
    pVscCommand->Mode = bWriteAccess == true ? VSC_MODE_WRITE : VSC_MODE_READ;
    pVscCommand->ReadWriteRam.Address = dwAddress;
    pVscCommand->ReadWriteRam.Length = dwSize;

    // Send the command to the drive.
    if (DeviceIoControl(hDrive, IOCTL_ATA_PASS_THROUGH, pAtaPassthrough, BufferSize, pAtaPassthrough, BufferSize, &BytesRead, nullptr) == FALSE)
    {
        wprintf(L"SendVSCAccessKey failed 0x%08x\n", GetLastError());
        return false;
    }

    // Check for any errors (bit 0 of the status register is the ATA error bit).
    OUT_REGS* pOutRegs = (OUT_REGS*)pAtaPassthrough->CurrentTaskFile;
    if ((pOutRegs->bStatusReg & 1) != 0)
    {
        wprintf(L"SendVSCAccessKey failed drive returned 0x%04x\n", WD_ERROR_CODE(pOutRegs));
        return false;
    }

    return true;
}
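The register values in the function above come straight from the ATA SMART command set: SMART is opcode 0xB0, WRITE LOG is feature 0xD6, and 0x4F/0xC2 is the SMART signature that goes in the cylinder registers. As a rough sketch, the same request can be laid out in Python. The task file byte order follows the Windows IDEREGS structure. The payload layout (two 16-bit fields followed by two 32-bit fields), the mode values, and the command ID 0x13 are assumptions made purely for illustration, since the post does not show the definition of VSC_COMMAND_DATA.

```python
import struct

SECTOR_SIZE = 512
ATA_CMD_SMART = 0xB0     # ATA SMART command opcode
SMART_WRITE_LOG = 0xD6   # SMART subcommand, goes in the Features register
WD_VSC_LOG_PAGE = 0xBE   # vendor log page named in the post

# Hypothetical mode values; the real VSC_MODE_* constants are not shown.
VSC_MODE_READ = 1
VSC_MODE_WRITE = 2


def build_task_file(feature, log_page, sector_count=1):
    """Return the 8 task-file bytes in IDEREGS order."""
    return bytes([
        feature,        # Features
        sector_count,   # Sector Count: one sector of payload
        log_page,       # Sector Number: log address
        0x4F,           # Cylinder Low: SMART signature
        0xC2,           # Cylinder High: SMART signature
        0xA0,           # Drive/Head
        ATA_CMD_SMART,  # Command
        0x00,           # Reserved
    ])


def build_vsc_payload(command_id, mode, address=0, length=0):
    """Pack a one-sector VSC payload (ASSUMED layout: u16, u16, u32, u32)."""
    header = struct.pack("<HHII", command_id, mode, address, length)
    return header.ljust(SECTOR_SIZE, b"\x00")


if __name__ == "__main__":
    regs = build_task_file(SMART_WRITE_LOG, WD_VSC_LOG_PAGE)
    # 0x13 is a made-up command ID, not a documented WD value.
    payload = build_vsc_payload(0x13, VSC_MODE_READ, 0x41414141, 4)
    print(regs.hex(), len(payload))
```

Building the bytes separately from the I/O call makes it easy to check the request in a hex dump before sending anything to a real drive.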
With everything setup I set a memory access breakpoint for the address 0x41414141 and then ran my test app to issue the read RAM VSC to the drive. The breakpoint triggered and the debugger broke in:
一切设置完毕后,我在地址 0x41414141 设置了一个内存访问断点,然后运行我的测试应用程序以向驱动器发出读取 RAM VSC。断点被触发,调试器中断在:
GDB output from the breakpoint.
来自断点的 GDB 输出。
Dumping the registers, I could see that the instruction reading from 0x41414141 was at 0xFFE1D780, which is inside the firmware code.
转储寄存器时,我能看到从 0x41414141 读取指令地址的地方位于 0xFFE1D780 ,而 0xFFE1D780 在固件代码内部。
Into the Belly of the Beast 深入猛兽之腹
After a few minutes of poking around the function that triggered the breakpoint I was able to clearly see how it was loading the parameters from the VSC buffer and performing the memory read operation:
经过几分钟的摸索,我弄清楚了这个触发断点的函数是如何从 VSC 缓冲区加载参数并执行内存读取操作的:
Disassembly for the read RAM command handler.
Tracing up the stack I was able to identify a lookup table for the VSCs which contained 67 entries in total, a bit more than I was expecting to see.
Function handler table for VSCs.
VSC 的函数处理表。
While I was debugging the drive I dumped the first 1 KB of stack data so I could use it to walk up the call stack, as there were a few indirect calls via function tables. For the next steps I modified my VSC test app and added a new option to read a specific sector from the drive. Then I placed a couple of breakpoints in the functions I identified in the call stack leading to the _vsc_read_write_memory function and ran the read sector test. However, none of my breakpoints were hitting. After a bunch of trial and error I could see the chain of functions leading up to the VSC handler was only triggering breakpoints for the SMART READ LOG, SMART WRITE LOG, and IDENTIFY commands. The DMA READ EXT command from the read sector test was not triggering anything, which meant it was being handled elsewhere.
调试驱动时,我转储了栈数据的前 1KB,以便使用它来遍历调用栈,因为有几个通过函数表进行的间接调用。对于下一步,我修改了我的 VSC 测试应用程序,并添加了一个新选项来从驱动读取特定扇区。然后我在调用栈中标识的函数中放置了一些断点,并运行了读取扇区测试。然而,我的任何断点都没有触发。经过一番反复试验后,我发现从 VSC 处理程序上溯的函数链只触发了 SMART READ LOG、SMART WRITE LOG 和 IDENTIFY 命令的断点。来自读取扇区测试的 DMA READ EXT 命令没有触发任何内容,这意味着它正在其他地方进行处理。
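One common way to walk a raw stack dump like this by hand is to scan it for words that point into known code regions and treat those as candidate return addresses. The sketch below shows that idea; it is not the author's tooling. The address ranges passed in are placeholders you would replace with the real firmware and overlay ranges, and the scan assumes a little-endian 32-bit stack. On Thumb code, saved return addresses have bit 0 set, which the optional filter uses to cut down false positives.

```python
import struct


def candidate_return_addresses(stack, code_ranges, thumb_only=False):
    """Return (stack_offset, word) pairs whose value points into code_ranges.

    stack       -- raw bytes dumped from the stack
    code_ranges -- list of (start, end) tuples, end exclusive (ASSUMED values)
    thumb_only  -- keep only words with bit 0 set (Thumb return addresses)
    """
    hits = []
    for offset in range(0, len(stack) - 3, 4):
        (word,) = struct.unpack_from("<I", stack, offset)
        if thumb_only and not (word & 1):
            continue
        address = word & ~1  # strip the Thumb bit before the range check
        if any(start <= address < end for start, end in code_ranges):
            hits.append((offset, word))
    return hits


if __name__ == "__main__":
    # Placeholder ranges and a tiny synthetic stack, for illustration only.
    ranges = [(0xFFE00000, 0xFFF00000), (0x00010000, 0x00020000)]
    dump = struct.pack("<4I", 0, 0xFFE1D781, 0x12345678, 0x00016A71)
    for offset, word in candidate_return_addresses(dump, ranges, thumb_only=True):
        print(f"sp+0x{offset:03X}: 0x{word:08X}")
```

Each hit still has to be checked in the disassembler, since data words can land inside a code range by coincidence.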
Looking around the functions leading up to the VSC handlers I could see a few memory addresses that were used quite often. They were arrays of data, and searching across the whole firmware image they were used all over the place. I decided to dump these regions of memory to see what was there. I was hoping I’d find data related to the port registers/ATA command being handled which I could use to trace down the code for handling read requests. One of these regions in particular is an array of 40 elements each 16 bytes in size. After poking around the code to see how the fields were laid out I wrote a quick python script to print out each element:
在 VSC 处理程序之前的函数中四处查看时,我发现了一些经常使用的内存地址。它们是数据数组,在整个固件映像中到处都在使用。我决定转储这些内存区域,看看里面有什么。我希望找到与端口寄存器/ATA 命令相关的数据,以便我可以追踪处理读请求的代码。其中一个区域特别是一个包含 40 个元素的数组,每个元素 16 字节大小。在研究了代码中字段是如何排列之后,我写了一个简单的 Python 脚本来打印出每个元素:
Unknown array data printed out.
未知数组数据被打印出来。
After a few minutes of analyzing this data (aka staring into the abyss) and poking at more memory on the drive I was able to determine that the first column is a pointer to some additional data I couldn’t make heads or tails of, and the second column was a function pointer. Looking at each of these function pointers in IDA I could see some of them were in the call stack that leads up to the VSC handler. It seemed like this array was a list of requests to be processed and entries with a valid address in the second column pointed to the function that likely handled the request.
分析这些数据(又名凝视深渊)几分钟,并检查驱动器上的更多内存后,我确定第一列是指向一些我无法理解的额外数据的指针,而第二列是一个函数指针。在 IDA 中查看这些函数指针时,我发现其中一些位于通往 VSC 处理器的调用栈中。似乎这个数组是一个待处理请求的列表,第二列中具有有效地址的条目指向了可能处理该请求的函数。
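The original script is not shown in the post, so the following is only a minimal sketch of what dumping such a table could look like. It takes the facts given above (40 entries of 16 bytes, a data pointer in the first column, a function pointer in the second) and assumes the two pointers are little-endian 32-bit values at offsets 0 and 4. The remaining 8 bytes are printed as raw hex because their meaning is not described.

```python
import struct

ENTRY_SIZE = 16
ENTRY_COUNT = 40


def parse_request_table(blob):
    """Split a RAM dump into (index, data_ptr, func_ptr, rest) tuples."""
    if len(blob) < ENTRY_SIZE * ENTRY_COUNT:
        raise ValueError("dump is smaller than 40 entries of 16 bytes")
    entries = []
    for index in range(ENTRY_COUNT):
        # ASSUMED layout: u32 data pointer, u32 function pointer, 8 unknown bytes.
        data_ptr, func_ptr, rest = struct.unpack_from(
            "<II8s", blob, index * ENTRY_SIZE)
        entries.append((index, data_ptr, func_ptr, rest))
    return entries


def format_entry(entry):
    """Render one entry as a single line of text."""
    index, data_ptr, func_ptr, rest = entry
    return (f"[{index:02d}] data=0x{data_ptr:08X} "
            f"func=0x{func_ptr:08X} rest={rest.hex()}")


if __name__ == "__main__":
    # Synthetic dump with a single populated entry, for illustration only.
    dump = bytearray(ENTRY_SIZE * ENTRY_COUNT)
    struct.pack_into("<II", dump, ENTRY_SIZE, 0x00001000, 0xFFE1D781)
    for entry in parse_request_table(bytes(dump)):
        print(format_entry(entry))
```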
The next step was to see if requests for sector reads used the function pointer field. I ran the read sector test about 2 dozen times to try and “poison” this request table in hopes it would make it obvious which entries were for the read requests, and sure enough we had a winner:
下一步是查看是否对扇区读取的请求使用了函数指针字段。我大约运行了 20 多次读取扇区测试,试图“毒化”这个请求表,希望这能让读取请求的条目变得明显,果然我们找到了一个目标:
Requests for DMA READ EXT command.
DMA READ EXT 命令的请求。
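The “poisoning” trick amounts to comparing the table before and after flooding the drive with one kind of request, then looking for function pointers that suddenly show up more often. A small sketch of that comparison, under the same assumed layout as before (16-byte entries with a little-endian function pointer at offset 4), could look like this:

```python
import struct
from collections import Counter


def func_ptr_histogram(blob, entry_size=16, func_offset=4):
    """Count how often each non-zero function pointer appears in a table dump."""
    counts = Counter()
    for offset in range(0, len(blob) - entry_size + 1, entry_size):
        (func_ptr,) = struct.unpack_from("<I", blob, offset + func_offset)
        if func_ptr:
            counts[func_ptr] += 1
    return counts


def new_or_more_frequent(before, after):
    """Return (func_ptr, increase) pairs sorted by largest increase first."""
    old = func_ptr_histogram(before)
    new = func_ptr_histogram(after)
    gained = {ptr: n - old.get(ptr, 0)
              for ptr, n in new.items() if n > old.get(ptr, 0)}
    return sorted(gained.items(), key=lambda item: -item[1])


if __name__ == "__main__":
    def entry(func_ptr):
        # Synthetic 16-byte entry: zero data pointer, given function pointer.
        return struct.pack("<II8x", 0, func_ptr)

    before = entry(0x100) * 2 + entry(0) * 2
    after = entry(0x100) * 2 + entry(0x16A5F) * 2
    for ptr, increase in new_or_more_frequent(before, after):
        print(f"0x{ptr:08X} appears {increase} more time(s)")
```

The pointer with the biggest increase after the flood is the first one worth setting a breakpoint on.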
I placed a breakpoint on this function and reran the read sector test and sure enough it was hitting! There was only one problem, this function wasn’t in any of the address ranges for the firmware image, it was somewhere else…
我在这个函数上设置了断点,然后重新运行了读取扇区测试,果然它被触发了!唯一的问题是,这个函数不在固件映像的任何地址范围内,它在其他地方……
Where the FSCK is this code? FSCK代码在哪里?
Earlier I mentioned there’s a special service area of the platters that’s used to store additional data known as overlay modules, and that Western Digital drives like to use them to store additional code. That’s where the code containing this function lives. When the drive boots up it starts in the MCU bootloader which is responsible for copying a special bootstrap section of the firmware into RAM and executing it. The firmware bootstrap then decompresses/copies all remaining sections in the firmware image into RAM at their respective base addresses. At some later point in the bootup process the overlay modules containing more executable code are loaded into memory.
之前我提到过,磁盘的某个特殊服务区域用于存储称为叠加模块的额外数据,而西部数据硬盘喜欢使用它们来存储附加代码。这就是包含该功能的代码所在的位置。当硬盘启动时,它会从 MCU 引导加载程序开始,该程序负责将固件的特殊引导部分复制到 RAM 中并执行它。然后固件引导程序会将固件镜像中剩余的所有部分解压缩/复制到它们各自的基地址处 RAM 中。在启动过程的某个稍后阶段,包含更多可执行代码的叠加模块被加载到内存中。
Memory map showing locations of firmware and overlay modules.
显示固件和覆盖模块位置的内存映射。
These overlay modules can be dumped using some more VSCs but rather than spend time trying to figure that out I decided to just dump the region of RAM containing the code to a file once the drive was fully booted. Later on I did find out that the overlay module containing the code is number 0x11 but that’s not really important. Now for the last step, writing a patch for this code to introduce a small delay when reading a specific sector.
这些覆盖模块可以使用一些更多的 VSCs 来转储,但我决定在驱动完全启动后,直接将包含代码的 RAM 区域转储到文件中,而不是花时间去弄清楚这一点。后来我发现包含代码的覆盖模块编号是 0x11,但这并不重要。现在,最后一步是为这段代码编写一个补丁,以在读取特定扇区时引入一个小的延迟。
Patching the Firmware修补固件
Since the code I need to patch lives in an overlay module this makes patching it slightly more complicated as it’s more difficult to recover from a bad patch than if it were in external SPI flash. However, now that I could just modify RAM contents using the VSCs my plan was to just hot patch the code in memory until I had working modifications and worry about writing them back to the platter later on.
由于我需要修补的代码位于一个覆盖模块中,这使得修补它稍微复杂一些,因为与代码位于外部 SPI 闪存中的情况相比,从错误的补丁中恢复要更困难。然而,既然我现在可以使用 VSC 直接修改 RAM 内容,我的计划是先在内存中热修补代码,直到修改能够正常工作,之后再考虑将其写回盘片。
With the read sector code dumped and loaded into IDA I began analyzing it and planning out how I was going to hook it. Eventually I’d need to figure out what sector(s) the code is trying to read, as I only want to introduce the delay for one specific sector and not every read operation. For now I decided to worry about that detail later and just get the hook written so I could make sure the PoC for the exploit actually worked, otherwise all of this was pointless. After analyzing the code a bit I found a suitable place to hook:
读取扇区代码已经转储并加载到 IDA 中,我开始分析它并计划如何挂钩它。最终我需要弄清楚代码试图读取哪些扇区,因为我只想对特定扇区引入延迟,而不是每个读取操作。目前我决定以后再考虑这个细节,只要先写好挂钩,确保漏洞的 PoC 实际上能工作,否则所有这些工作都是徒劳的。在分析代码后,我发现了一个合适的挂钩位置:
Disassembly of the read sector functions.
读取扇区函数的反汇编。
Here we can see the main handler function sub_16A5E performs a loop and each iteration calls sub_1671C which actually handles the read request. The SataRequestArray is the 40 element array from earlier that contained the function pointers for each request. This loop starts with the initial read request entry and will continue processing requests in the array so long as the Unk4 field is not 0xFF. I guess large requests can be split up into multiple smaller requests or something, I’m not entirely sure… As for placing a hook I decided to place it a few instructions into sub_1671C as it was a little easier than trying to fit it into the loop body. After a bit of trial and error I ended up with the following assembly code:
在这里我们可以看到主处理函数 sub_16A5E 执行一个循环,每次迭代都会调用 sub_1671C 来实际处理读请求。 SataRequestArray 是之前提到的包含每个请求函数指针的 40 元素数组。这个循环从初始的读请求条目开始,只要 Unk4 字段不是 0xFF,就会继续处理数组中的请求。我猜大请求可以被拆分成多个小请求,或者类似的东西,我不完全确定……至于放置钩子,我决定在 sub_1671C 中的几条指令之后放置,因为它比尝试将其放入循环体中要简单一些。经过一番尝试和错误,我最终得到了以下汇编代码:
.syntax unified

# Timing related variables to control the delay:
.set F_CPU, 10000000 # CPU frequency
.set MS_DELAY, 200 # 200ms delay

# ============================================================================
# Our code cave
# ============================================================================
.long 0xFFEAB600
.long (9f - 0f)
0:
    # Call our hook.
    push {r0-r7, lr}
    blx Hook_SataDmaRead
    pop {r0-r7, lr}

    # Replace instructions we overwrote.
    movs r7, r0
    lsls r0, r0, #4
    adds r5, r0, r1
    ldrb r1, [r5, #0xD]
    sub sp, sp, #0x1C
    str r1, [sp, #0xC]
    ldrb r0, [r5, #0xE]

    # Trampoline back.
    ldr lr, =0x0001672E+1
    bx lr

Hook_SataDmaRead:
    # Setup delay counter.
    ldr r3, =(MS_DELAY * F_CPU / 1000)

Hook_SataDmaRead_loop:
    # Spin until the counter is 0.
    sub r3, r3, #1
    bne Hook_SataDmaRead_loop
    bx lr

    .pool
9:

# ============================================================================
# Hook the sata DMA request handler
# ============================================================================
.long 0x00016720
.long (9f - 0f)
0:
    .thumb_func
    ldr r7, =0xFFEAB600
    bx r7

    .pool
9:

.long 0xFFFFFFFF
This assembly code inserts a small hook in sub_1671C that jumps to a code cave I placed in some unused section of RAM, and tries to stall for ~200ms. The delay calculation is based on some guess work math and isn’t scientifically accurate but it’s good enough. With this compiled I modified my test app to perform the following test:
这段汇编代码在 sub_1671C 中插入一个小钩子,跳转到一个我放置在 RAM 某些未使用区域中的代码洞穴,并试图延迟约 200ms。延迟计算是基于一些猜测的数学,并不科学准确,但足够好了。编译后,我修改了我的测试应用程序来执行以下测试:
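The `.long` directives in the listing suggest a simple patch format: each record is a target address, a byte count, then that many bytes of code, and a final 0xFFFFFFFF marks the end. Assuming the assembled output is a flat little-endian binary in exactly that shape (the post does not spell this out), a loader could split it into records like the sketch below and then write each one to RAM with the write RAM VSC.

```python
import struct

END_MARKER = 0xFFFFFFFF


def parse_patch_blob(blob):
    """Split a flat patch binary into a list of (address, data) records.

    ASSUMED format: u32 address, u32 size, size bytes of data, repeated,
    terminated by a u32 of 0xFFFFFFFF in the address position.
    """
    patches = []
    offset = 0
    while True:
        (address,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        if address == END_MARKER:
            return patches
        (size,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        data = blob[offset:offset + size]
        if len(data) != size:
            raise ValueError("patch record runs past the end of the blob")
        patches.append((address, data))
        offset += size


if __name__ == "__main__":
    # Synthetic blob with placeholder bytes instead of real machine code.
    blob = (struct.pack("<II", 0xFFEAB600, 4) + b"\x01\x02\x03\x04"
            + struct.pack("<II", 0x00016720, 2) + b"\xaa\xbb"
            + struct.pack("<I", END_MARKER))
    for address, data in parse_patch_blob(blob):
        print(f"write {len(data)} bytes to 0x{address:08X}")
```

Writing the code cave before the hook matters in practice: if the hook lands first, the drive can jump into memory that has not been filled in yet.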
- Loops 10 times and reads a specific sector, computes the average read time based on the times for each iteration.
循环 10 次并读取特定扇区,根据每次迭代的时间计算平均读取时间。
- Writes my patches to RAM thus introducing a ~200ms delay for each read operation.
将我的补丁写入 RAM,从而为每次读取操作引入约 200ms 的延迟。
- Repeat step 1 and compute new average read time.
重复步骤 1 并计算新的平均读取时间。
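The measurement in steps 1 and 3 can be sketched as a small timing loop. This is an illustration rather than the actual test app: `read_sector` is a hypothetical callable standing in for whatever issues the read to the drive, and the clock is injectable so the loop can be exercised without hardware.

```python
import time


def time_reads(read_sector, lba, iterations=10, clock=time.perf_counter):
    """Read one sector repeatedly; return (per-read times in ms, average ms).

    read_sector -- hypothetical callable taking an LBA and performing the read
    clock       -- function returning seconds; replaceable for testing
    """
    times_ms = []
    for _ in range(iterations):
        start = clock()
        read_sector(lba)
        times_ms.append((clock() - start) * 1000.0)
    return times_ms, sum(times_ms) / len(times_ms)


if __name__ == "__main__":
    # Stand-in for a real drive read: sleep for roughly 20 ms.
    def fake_read(lba):
        time.sleep(0.02)

    times, average = time_reads(fake_read, lba=0x1000, iterations=3)
    print([round(t) for t in times], round(average))
```

Keeping the per-iteration times alongside the average is what makes cache effects visible, such as a slow first read followed by near-zero ones.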
After running the app I got the following results:
运行应用程序后,我得到了以下结果:
Output from the test app.
测试应用的输出。
The first thing we notice is that the delay is not ~200ms but more like ~450, like I said the math was guess work and not scientifically accurate… The second thing we can see is all the read times except for the first in the clean test are 0ms. This is likely because we’re hitting cache and thus the drive didn’t need to seek and pull the data off the platter. For the delay test we can see the first iteration the time is also 0ms, I’m not entirely sure why but likely caching behavior as well. The rest of the read test times are all delayed which is exactly what I needed. Now it was time to test this top down with the exploit I was developing and see if the delayed read times would allow it to trigger successfully.
我们首先注意到延迟不是 ~200ms,而是更像是 ~450ms,就像我说的,计算只是估算,并不科学准确……其次我们可以看到,除了干净测试中的第一个读时间外,所有读时间都是 0ms。这很可能是因为我们命中了缓存,因此驱动不需要寻找并从盘片上读取数据。对于延迟测试,我们可以看到第一次迭代的时间也是 0ms,我不完全确定原因,但很可能也是缓存行为。其余的读测试时间都有延迟,这正好是我需要的。现在,是时候用我正在开发的后门程序从上到下进行测试了,看看延迟的读时间是否能够成功触发。
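The gap between the intended ~200ms and the measured ~450ms is what you would expect if one pass through the spin loop costs more than one tick of the assumed 10 MHz clock: the formula `MS_DELAY * F_CPU / 1000` gives 2,000,000 iterations, and 450/200 works out to about 2.25 assumed ticks per iteration. If a more precise delay were needed, one option is to rescale the counter from a measurement. The sketch below does that under the assumption that the delay grows linearly with the loop count; the actual clock rate and cycles per iteration are not known from the post.

```python
def loop_count(delay_ms, f_cpu_hz):
    """Counter value from the listing's formula: MS_DELAY * F_CPU / 1000."""
    return delay_ms * f_cpu_hz // 1000


def rescale_count(count, measured_ms, target_ms):
    """Scale a loop counter so a measured delay moves toward the target.

    ASSUMES the delay is proportional to the counter value.
    """
    return round(count * target_ms / measured_ms)


if __name__ == "__main__":
    original = loop_count(200, 10_000_000)
    corrected = rescale_count(original, measured_ms=450, target_ms=200)
    print(original, corrected)
```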
Task Failed Successfully 任务成功失败
After a few minutes of remembering how this exploit actually worked and prepping new files, my test setup was ready to go. The setup for this test involves a few pieces but is overall quite simple:
经过几分钟回忆这个漏洞实际上是如何工作的,并准备新文件后,我的测试设置准备就绪。这个测试的设置涉及几个部分,但总的来说相当简单:
- I have a completely unmodified HDD that I’m going to power externally and pre-patch the code in RAM with my delayed read patch. This makes it easier to test without having to worry about writing my patches back to the drive firmware/overlay modules just yet.
我有一个完全未经修改的硬盘,我将外部供电并预先在 RAM 中用我的延迟读取补丁修补代码。这使得测试更容易进行,而不必担心目前将我的补丁写回驱动程序固件/覆盖模块。
- The HDD will be connected to an Xbox 360 via SATA (data cable only). When the console boots up it’ll try to read a specific sector off the HDD. While this happens there’s some “hacking” that’s going on in the background (you’ll find out what this is in a future blog post).
硬盘将通过 SATA(仅数据线)连接到 Xbox 360。当主机启动时,它会尝试从硬盘读取特定扇区。在此期间,后台正在进行一些“黑客”活动(你将在未来的博客文章中了解到这是什么)。
- If the read operation takes long enough my exploit should trigger and the shell code that runs should light up the console’s ring of light in full orange to indicate it worked.
如果读取操作持续足够长的时间,我的漏洞应该会被触发,运行的 shell 代码应该会使主机的光环灯全部亮起橙色,以指示它成功工作了。
However, before I did this I wanted to do what any good fake scientist would do and run a control test. That is, boot the console without any modifications to the HDD code to make sure the exploit still failed to trigger. I’d probably run several iterations of booting with no HDD modifications, making sure the exploit failed, then booting with patched HDD code and making sure the exploit triggered. This would help build confidence in how effective the delayed read patches were.
然而,在我这样做之前,我想做任何好假科学家都会做的事情,运行一个控制测试。也就是说,在不修改 HDD 代码的情况下启动控制台,以确保漏洞仍然无法触发。我可能会运行几次没有 HDD 修改的启动迭代,确保漏洞失败,然后运行带有修复的 HDD 代码的启动,确保漏洞触发。这将有助于建立对延迟读取补丁有效性的信心。
Now imagine my surprise after working 20 hours a day for 7 days straight and having been awake for almost 30 hours at the moment of this test, when I booted the console with no HDD modifications and the exploit triggered successfully. I thought it must have been a fluke, the planets must be aligned right now or something. I power cycled the console several times and the exploit triggered almost every single time despite the HDD running with no modifications applied. I even power cycled the HDD a few times just to make sure. I decided this side quest was over, time to sleep, when I awoke I would research why the exploit suddenly decided to start working.
现在想象一下,我连续 7 天每天工作 20 个小时,在测试时刻几乎清醒了 30 个小时,当我启动没有硬盘修改的机箱时,漏洞却成功触发了。我以为这一定是巧合,一定是行星排列得刚刚好之类的。我多次重启机箱,尽管硬盘没有任何修改,漏洞几乎每次都能触发。我还多次重启了硬盘以确保。我决定这个支线任务到此结束,该睡觉了,醒来后我会研究为什么漏洞突然开始工作了。
Conclusion 结论
In the days following this work I’d come to better understand all the variables surrounding the Xbox 360 exploit and never ended up needing to modify the HDD firmware. I was able to get it to work just fine with every HDD I had on hand; the only exceptions were SSDs, which reply too fast for the exploit to trigger reliably. This was a fun adventure into the depths of embedded devices, I learned a lot and it helped me become more confident in reverse engineering low level embedded devices. Unfortunately I still have no idea how a hard drive works. There’s definitely more I would have liked to look into out of pure curiosity but without any real motivation to do so I shelved all this work. However, while messing around with AI I decided to pick some of it up again and see how well AI would do at black box analysis of embedded devices, so stay tuned for part 2!
在这项工作之后的日子里,我逐渐更好地理解了围绕 Xbox 360 漏洞的所有变量,并且最终没有必要修改硬盘固件。我能够用我手头的每个硬盘让它完美运行,唯一的例外是 SSD,它们的响应速度太快,无法可靠地触发漏洞。这是一次深入嵌入式设备内部的有趣冒险,我学到了很多,它帮助我更加自信地进行低级嵌入式设备的逆向工程。不幸的是,我仍然不知道硬盘是如何工作的。出于纯粹的好奇心,我肯定还有很多东西想要研究,但没有真正做这些的动力,我搁置了所有这些工作。然而,在玩 AI 的时候,我决定再次捡起其中一些东西,看看 AI 在嵌入式设备的黑盒分析方面表现如何,所以敬请期待第二部分!
Firmware analysis for hard drives is an interesting topic that I’ve only seen talked about once or twice despite many people having done exactly what I did before. There’s quite a bit of information available for older HDDs if you’re willing to scour through decades of vague and incorrect forum posts, but nothing that will paint a complete picture. In particular Travis Goodspeed and Sprite (Jeroen Domburg) have some interesting publications on the subject (I particularly like the anti-forensics work done by Travis). I think the reason you don’t see more information on this topic is because people were afraid it would help bad actors create malware. There’s some merit here but as you’ll see in part 2 when using AI to help with analysis this becomes a pretty moot point. Not to mention that HDD malware already exists (thanks NSA!).
硬盘固件分析是一个有趣的话题,尽管很多人之前做过和我完全一样的事情,但我只见过一两次相关的讨论。如果你愿意在几十年模糊且错误的论坛帖子中搜寻,确实能找到一些关于老式硬盘的信息,但没有任何资料能描绘出完整的图景。特别是 Travis Goodspeed 和 Sprite(Jeroen Domburg)在这个主题上有一些有趣的出版物(我个人特别喜欢 Travis 所做的反取证研究)。我认为这个话题上没有更多信息的原因是人们害怕它会被恶意行为者用来创建恶意软件。这里有几分道理,但正如你将在第二部分看到的那样,当使用 AI 来辅助分析时,这一点变得微不足道。更不用说硬盘恶意软件已经存在了(感谢 NSA!)。
Rather than keep this topic in obscurity I decided to open source the IDA and firmware related scripts I wrote to help others start looking into HDD firmware. I’m interested to see what other people find in these devices and would love to see things like backdoor commands documented, tools to try and fingerprint firmware, and maybe even decompilation of firmware (even though it’s completely pointless and I would never trust it anyway). You can find all my work on my GitHub which I’ll update again when part 2 is out.
与其让这个主题保持低调,我决定开源我编写的与 IDA 和固件相关的脚本,以帮助其他人开始研究硬盘固件。我很感兴趣地想看看其他人在这类设备中会发现什么,并希望能看到诸如后门命令的文档记录、尝试识别固件的工具,甚至固件的反编译(尽管这完全毫无意义,而且我无论如何也不会信任它)。你可以在我的 GitHub 上找到所有我的工作,当第二部分出来时我会再次更新。
