7-Zip: From Uninitialized Memory to Remote Code Execution

After my previous post on the 7-Zip bugs CVE-2017-17969 and CVE-2018-5996, I continued to spend time on analyzing antivirus software. As it happens, I found a new bug that (as the last two bugs) turned out to affect 7-Zip as well. Since the antivirus vendor has not yet published a patch, I will add the name of the affected product in an update to this post as soon as this happens.

UPDATE (2018-06-05): The antivirus vendor I was talking about was F-Secure.

Introduction

7-Zip’s RAR code is mostly based on a recent UnRAR version, but especially the higher-level parts of the code have been heavily modified. As we have seen in some of my earlier blog posts, the UnRAR code is very fragile. Therefore, it is hardly surprising that any changes to this code are likely to introduce new bugs.

Very abstractly, the bug can be described as follows: The initialization of some member data structures of the RAR decoder classes relies on the RAR handler to configure the decoder correctly before decoding something. Unfortunately, the RAR handler fails to sanitize its input data and passes the incorrect configuration into the decoder, causing usage of uninitialized memory.

Now you may think that this sounds harmless and boring. Admittedly, this is what I thought when I first discovered the bug. Surprisingly, it is anything but harmless.

In the following, I will outline the bug in more detail. Then, we will take a brief look at 7-Zip’s patch. Finally, we will see how the bug can be exploited for remote code execution.

The Bug (CVE-2018-10115)

This new bug arises in the context of handling solid compression. The idea of solid compression is simple: Given a set of files (e.g., from a folder), we can interpret them as the concatenation to one single data block, and then compress this whole block (as opposed to compressing every file for itself). This can yield a higher compression rate, in particular if there are many files that are somewhat similar.

In the RAR format (before version 5), solid compression can be used in a very flexible way: Each item (representing a file) of the archive can be marked as solid, independently from all other items. The idea is that if an item is decoded that has this solid bit set, the decoder would not reinitialize its state, essentially continuing from the state of the previous item.

Obviously, one needs to make sure that the decoder object initializes its state at the beginning (for the first item it is decoding). Let us have a look at how this is implemented in 7-Zip. The RAR handler has a method NArchive::NRar::CHandler::Extract¹ that contains a loop which iterates with a variable index over all items. In this loop, we can find the following code:

Byte isSolid = (Byte)((IsSolid(index) || item.IsSplitBefore()) ? 1: 0);
if (solidStart) {
  isSolid = 0;
  solidStart = false;
}

RINOK(compressSetDecoderProperties->SetDecoderProperties2(&isSolid, 1));

The basic idea is to have a boolean flag solidStart, which is initialized to true (before the loop), making sure that the decoder is configured with isSolid==false for the first item that is decoded. Furthermore, the decoder will (re)initialize its state (before starting to decode) whenever it is called with isSolid==false.

That seems to be correct, right? Well, the problem is that RAR supports three different encoding methods (excluding version 5), and each item can be encoded with a different method. In particular, for each of these three encoding methods there is a different decoder object. Interestingly, the constructors of these decoder objects leave a large part of their state uninitialized. This is because the state needs to be reinitialized for non-solid items anyway and the implicit assumption is that the caller of the decoder would make sure that the first call on the decoder is with isSolid==false. We can easily violate this assumption with a RAR archive that is constructed as follows²:

The first item uses encoding method v1.
The second item uses encoding method v2 (or v3), and has the solid bit set.

The first item will cause the solidStart flag to be set to false. Then, for the second item, a new Rar2 decoder object is created and (since the solid flag is set) the decoding is run with a large part of the decoder’s state being uninitialized.

At first sight, this may not look too bad. However, various parts of the uninitialized state can be used to cause memory corruptions:

Member variables holding the size of heap-based buffers. These variables may now hold a size that is larger than the actual buffer, allowing a heap-based buffer overflow.
Arrays with indices that are used to index into other arrays, for both reading and writing values.
The PPMd state discussed in my previous post. Recall that the code relies heavily on the soundness of the model’s state, which can now be violated easily.

Obviously, the list is not complete.

The Fix

In essence, the bug is that the decoder classes do not guarantee that their state is correctly initialized before they are used for the first time. Instead, they rely on the caller to configure the decoder with isSolid==false before the first item is decoded. As we have seen, this does not turn out very well.

There are two different approaches to resolve this bug:

Make the constructor of the decoder classes initialize the full state.
Add an additional boolean member solidAllowed (which is initialized to false) to each decoder class. If isSolid==true even though solidAllowed==false, the decoder can abort with a failure (or set isSolid=false).

UnRAR seems to implement the first option. Igor Pavlov, however, chose to go with a variant of the second option for 7-Zip.

In case you want to patch a fork of 7-Zip or you are just interested in the details of the fix, you might want to have a look at this file, which summarizes the changes.

On Exploitation Mitigation

In the previous post on the 7-Zip bugs CVE-2017-17969 and CVE-2018-5996, I mentioned the lack of DEP and ASLR in 7-Zip before version 18.00 (beta). Shortly after the release of that blog post, Igor Pavlov released 7-Zip 18.01 with the /NXCOMPAT flag, delivering on his promise to enable DEP on all platforms. Moreover, all dynamic libraries (7z.dll, 7-zip.dll, 7-zip32.dll) have the /DYNAMICBASE flag and a relocation table. Hence, most of the running code is subject to ASLR.

However, all main executables (7zFM.exe, 7zG.exe, 7z.exe) come without /DYNAMICBASE and have a stripped relocation table. This means that not only are they not subject to ASLR, but you cannot even enforce ASLR with a tool like EMET or its successor, the Windows Defender Exploit Guard.

Obviously, ASLR can only be effective if all modules are properly randomized. I discussed this with Igor and convinced him to ship the main executables of the new 7-Zip 18.05 with /DYNAMICBASE and relocation table. The 64-bit version still runs with the standard non-high entropy ASLR (presumably because the image base is smaller than 4GB), but this is a minor issue that can be addressed in a future release.

On an additional note, I would like to point out that 7-Zip never allocates or maps additional executable memory, making it a great candidate for Arbitrary Code Guard (ACG). In case you are using Windows 10, you can enable it for 7-Zip by adding the main executables 7z.exe, 7zFM.exe, and 7zG.exe in the Windows Defender Security Center (App & browser control -> Exploit Protection -> Program settings). This will essentially enforce a W^X policy and therefore make exploitation for code execution substantially more difficult.

Writing a Code Execution Exploit

Normally, I would not spend much time thinking about actual weaponized exploits. However, it can sometimes be instructive to write an exploit, if only to learn how much it actually takes to succeed in the given case.

The platform we target is a fully updated Windows 10 Redstone 4 (RS4, Build 17134.1) 64-bit, running 7-Zip 18.01 x64.

Picking an Adequate Exploitation Scenario

There are three basic ways to extract an archive using 7-Zip:

Open the archive with the GUI and either extract files separately (using drag and drop), or extract the whole archive using the Extract button.
Right-click the archive and select "7-Zip->Extract Here" or "7-Zip->Extract to subfolder" from the context menu.
Using the command-line version of 7-Zip.

Each of these three methods will invoke a different executable (7zFM.exe, 7zG.exe, 7z.exe). Since we want to exploit the lack of ASLR in these modules, we need to fix the extraction method.

The second method (extraction via context menu) seems to be the most attractive one, since it is a method that is probably used very often, and at the same time it should give us a quite predictable behavior (unlike the first method, where a user might decide to open the archive but then extract the “wrong” file). Hence, we go with the second method.

Exploitation Strategy

Using the bug from above, we can create a Rar decoder that operates on (mostly) uninitialized state. So let us see for which Rar decoder this may allow us to corrupt the heap in an attacker-controlled manner.

One possibility is to use the Rar1 decoder. The method NCompress::NRar1::CDecoder::HuffDecode³ contains the following code:

int bytePlace = DecodeNum(...);
// some code omitted
bytePlace &= 0xff;
// more code omitted
for (;;)
{
  curByte = ChSet[bytePlace];
  newBytePlace = NToPl[curByte++ & 0xff]++;
  if ((curByte & 0xff) > 0xa1)
    CorrHuff(ChSet, NToPl);
  else
    break;
}

ChSet[bytePlace] = ChSet[newBytePlace];
ChSet[newBytePlace] = curByte;
return S_OK;

This is very useful, because the uninitialized state of the Rar1 decoder includes the uint32_t arrays ChSet and NtoPl. Hence, newBytePlace is an attacker-controlled uint32_t, and so is curByte (with the restriction that the least significant byte cannot be larger than 0xa1). Moreover, bytePlace is determined by the input stream, so it is attacker-controlled as well (but cannot be larger than 0xff).

So this would give us a pretty good (though not perfect) read-write primitive. Note, however, that we are in a 64-bit address space, so we will not be able to reach the vtable pointer of the Rar1 decoder object with a 32-bit offset (even if multiplied by sizeof(uint32_t)) from ChSet. Therefore, we will target the vtable pointer of an object that is placed after the Rar1 decoder on the heap.

The idea is to use a Rar3 decoder object for this purpose, which we will use at the same time to hold our payload. In particular, we use the RW-primitive from above to swap the pointer _windows, which is a member variable of the Rar3 decoder, with the vtable pointer of the very same Rar3 decoder object. _window points to a 4MB-sized buffer which holds data that has been extracted with the decoder (i.e., it is fully attacker-controlled).

Naturally, we will fill the _window buffer with the address of a stack pivot (xchg rax, rsp), followed by a ROP chain to obtain executable memory and execute the shellcode (which we also put into the _window buffer).

Putting a Replacement Object on the Heap

In order to succeed with the outlined strategy, we need to have full control of the decoder’s uninitialized memory. Roughly speaking, we will do this by making an allocation of the size of the Rar1 decoder object, writing the desired data to it, and then freeing it at some point before the actual Rar1 decoder is allocated.

Obviously, we will need to make sure that the Rar1 decoder’s allocation actually reuses the same chunk of memory that we freed before. A straightforward way to achieve this is to activate Low Fragmentation Heap (LFH) on the corresponding allocation size, then spray the LFH with multiple of those replacement objects. This actually works, but because allocations on the LFH are randomized since Windows 8, this method will never be able to place the Rar1 decoder object in constant distance to any other object. Therefore, we try to avoid the LFH and place our object on the regular heap. Very roughly, the allocation strategy is as follows:

Create around 18 pending allocations of all (relevant) sizes smaller than the Rar1 decoder object. This will activate LFH for these allocation sizes and prevent such small allocations from destroying our clean heap structure.
Allocate the replacement object and free it, making sure it is surrounded by busy allocations (and hence not merged with other free chunks).
Rar3 decoder is allocated (the replacement object is not reused, because the Rar3 decoder is larger than the Rar1 decoder).
Rar1 decoder is allocated (reusing the replacement object).

Note that it is unavoidable to allocate some decoder before allocating that Rar1 decoder, because only this way the solidStart flag will be set to false and the next decoder will not be initialized correctly (see above).

If everything works as planned, the Rar1 decoder reuses our replacement object, and the Rar3 decoder object is placed with some constant offset after the Rar1 decoder object.

Allocating and Freeing on the Heap

Obviously, the above allocation strategy requires us to be able to make heap allocations in a reasonably controlled manner. Going through the whole code of the RAR handler, I could not find many good ways to make dynamic allocations on the default process heap that have attacker-controlled size and store attacker-controlled content. In fact, it seems that the only way to do such dynamic allocations is via the names of the archive’s items. Let us see how this works.

When an archive is opened, the method NArchive::NRar::CHandler::Open2¹ reads all items of the archive with the following code (simplified):

CItem item;

for (;;)
{
  // some code omitted
  bool filled;
  archive.GetNextItem(item, getTextPassword, filled, error);
  // some more code omitted
  if (!filled) {
    // some more code omitted
    break;
  }
  if (item.IgnoreItem()) { continue; }
  bool needAdd = true;
  // some more code omitted
  _items.Add(item);
  
}

The class CItem has a member variable Name of type AString, which stores the (ASCII) name of the corresponding item in a heap-allocated buffer.

Unfortunately, the name of an item is set as follows in NArchive::NRar::CInArchive::ReadName¹:

for (i = 0; i < nameSize && p[i] != 0; i++) {}
item.Name.SetFrom((const char *)p, i);

I say unfortunately, because this means that we cannot write completely arbitrary bytes to the buffer. In particular, it seems that we cannot write null bytes. This is bad, because the replacement object we want to put on the heap requires a few zero bytes. So what can we do? Well, let us look at AString::SetFrom⁴:

void AString::SetFrom(const char *s, unsigned len)
{
  if (len > _limit)
  {
    char *newBuf = new char[len + 1];
    delete []_chars;
    _chars = newBuf;
    _limit = len;
  }
  if (len != 0)
    memcpy(_chars, s, len);
  _chars[len] = 0;
  _len = len;
}

Okay, so this method will always terminate the string with a null byte. Moreover, we see that AString keeps the same underlying buffer, unless it is too small to hold the desired string. This gives rise to the following idea: Assume we want to write the hex-bytes DEAD00BEEF00BAAD00 to some heap-allocated buffer. Then we will just have an archive with items that have the following names (in the listed order):

DEAD55BEEF55BAAD
DEAD55BEEF
DEAD

Basically, we let the method SetFrom write all null bytes we need. Note that we have replaced all null bytes in our data with some arbitrary non-zero byte (0x55 in this example), ensuring that the full string is written to the buffer.

This works reasonably well, and we can use this to write arbitrary sequences of bytes, with two small limitations. First, we have to end our sequence with a null byte. Second, we cannot have too many null bytes in our byte sequence, because this will cause a quadratic blow-up of the archive size. Luckily, we can easily work with those restrictions in our specific case.

Finally, note that we can make essentially two types of allocations:

Allocations with items such that item.IgnoreItem()==true. Those items will not be added to the list _items, and are hence only temporary. These allocations have the property that they will be freed eventually, and they can (using the above technique) be filled with almost arbitrary sequences of bytes. Since these allocations are all made via the same stack-allocated object item and hence use the same AString object, the allocation sizes of this type need to be strictly increasing in their size. We will use this allocation type mainly to put the replacement object on the heap.
Allocations with items such that item.IgnoreItem()==false. Those items will be added to the list _items, causing a copy of the corresponding name. This is useful in particular to cause many pending allocations of certain sizes in order to activate LFH. Note that the copied string cannot contain any null bytes, which is fine for our purposes.

Combining the outlined methods carefully, we can construct an archive that implements the heap allocation strategy from the previous section.

ROP

We leverage the lack of ASLR on the main executable 7zG.exe to bypass DEP with a ROP chain. 7-Zip never calls VirtualProtect, so we read the addresses of VirtualAlloc, memcpy, and exit from the Import Address Table to write the following ROP chain:

// pivot stack: xchg rax, rsp;
exec_buffer = VirtualAlloc(NULL, 0x1000, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
memcpy(exec_buffer, rsp+shellcode_offset, 0x1000);
jmp exec_buffer;
exit(0);

Since we are running on x86_64 (where most instructions have a longer encoding than in x86) and the binary is not very large, for some of the operations we want to execute there are no neat gadgets. This is not really a problem, but it makes the ROP chain somewhat ugly. For example, in order to set the register R9 to PAGE_EXECUTE_READWRITE before calling VirtualAlloc, we use the following chain of gadgets:

0x40691e, #pop rcx; add eax, 0xfc08500; xchg eax, ebp; ret; 
PAGE_EXECUTE_READWRITE, #value that is popped into rcx
0x401f52, #xor eax, eax; ret; (setting ZF=1 for cmove)
0x4193ad, #cmove r9, rcx; imul rax, rdx; xor edx, edx; imul rax, rax, 0xf4240; div r8; xor edx, edx; div r9; ret;

Demo

The following demo video briefly presents the exploit running on a freshly installed and fully updated Windows 10 RS4 (Build 17134.1) 64-bit with 7-Zip 18.01 x64. As mentioned above, the targeted exploitation scenario is extraction via the context menu 7-Zip->Extract Here and 7-Zip->Extract to subfolder.

On Reliability

After some fine-tuning of the auxiliary heap allocation sizes, the exploit seems to work very reliably.

In order to obtain more information on reliability, I wrote a small script that repeatedly calls the binary 7zG.exe the same way it would be called when extracting the crafted archive via the context menu. Moreover, the script checks that calc.exe is actually started and the process 7zG.exe exits with code 0. Running the script on different Windows operating systems (all fully updated), the results are as follows:

Windows 10 RS4 (Build 17134.1) 64-bit: the exploit failed⁵ 17 out of 100 000 times.
Windows 8.1 64-bit: the exploit failed 12 out of 100 000 times.
Windows 7 SP1 64-bit: the exploit failed 90 out of 100 000 times.

Note that across all operating systems, the very same crafted archive is used. This works well, presumably because most changes between the Windows 7 and Windows 10 heap implementation affect the Low Fragmentation Heap, whereas the rest has not changed too much. Moreover, the LFH is still triggered for the same number of pending allocations.

Admittedly, it is not really possible to determine the reliability of an exploit empirically. Still, I believe this to be better than “I ran it a few times, and it seems to be reliable”.

Conclusion

In my opinion, this bug is a consequence of the design (partially) inherited from UnRAR. If a class depends on its clients to use it correctly in order to prevent usage of uninitialized class members, you are doomed for failure.

We have seen how this (at first glance) innocent looking bug can be turned into a reliable weaponized code execution exploit. Due to the lack of ASLR on the main executables, the only difficult part of the exploit was to carry out the heap massaging within the restricted context of RAR extraction.

Fortunately, the new 7-Zip 18.05 not only resolves the bug, but also comes with enabled ASLR on all the main executables.

Do you have any comments, feedback, doubts, or complaints? I would love to hear them. You can find my e-mail address on the about page.

Alternatively, you are invited to join the discussion on HackerNews or on /r/netsec.

Timeline of Disclosure

2018-03-06 - Discovery
2018-03-06 - Report
2018-04-14 - MITRE assigned CVE-2018-10115
2018-04-30 - 7-Zip 18.05 released, fixing CVE-2018-10115 and enabling ASLR on the executables.

Thanks & Acknowledgements

I would like to thank Igor Pavlov for fixing the bug and for enabling further exploitation mitigations in 7-Zip.

See CPP/7zip/Archive/Rar/RarHandler.cpp. ↩︎
Obviously, the versions (v1, v2, v3) can be permuted arbitrarily. ↩︎
See CPP/7zip/Compress/Rar1Decoder.cpp. ↩︎
See CPP/Common/MyString.cpp. ↩︎
failed means that the process segfaulted or an early extraction error occurred. ↩︎

May 1, 2018