A integer overflow in the big array serialization module leads to arbitrary code execution #7765

vicuna · 2018-04-06T15:41:06Z

Original bug ID: 7765
Reporter: maxdos
Status: resolved (set by @xavierleroy on 2018-04-12T07:37:40Z)
Resolution: fixed
Priority: urgent
Severity: major
Version: 4.06.0
Target version: 4.07.0+dev/beta2/rc1/rc2
Fixed in version: 4.07.0+dev/beta2/rc1/rc2
Category: standard library
Monitored by: @nojb "Richard Jones"

Bug description

The bigarray module in all recent ocaml versions is capable of reading in serialized (marshalled) objects from a external source which is often used for network operations and interprocess communication.

ocaml/byterun/bigarray.c

Line 458 in ea60609

b->data = malloc(elt_size * num_elts);

A integer overflow vulnerability allows under certain circumstances an remote attacker to input a corrupt object and edit arbitrary addresses in the processes memory which leads over common techniques (in this take libc one gadget but any rop gadget or got table vector would work) to arbitrary memory access (!) and in the course also arbitrary code execution

Please check the writeup.pdf for more details !!

In variations of this the exploit is usable in all ocaml versions we tested no matter if compiled to binary or intepreted (!)

Steps to reproduce

Check out the attached archive and run the makefile (or use the precompile executable)
the exploit can be run by executing exploit.py (python2.7 and pwntools are required)
The last step of code execution is using a one-gadget attack technique and is specific to the glibc version in use (e.g. tested is debian glibc 2.24-11+deb9u1) but can with minimal amount of work be ported to any libc version on any patch level to our knowledge

Of course you can also use the exploit.py to write your own version of exploit

Additional information

This Vulnerability in ocaml was discovered by me (maximilian.tschirschnitz@gmx.de) and the exploit collaboratly developed with (philipp.hagenlocher@tum.de)
For illustration purposes it was packaged as security challenge to demonstrate the impact of the issue (see writeup.pdf)

vicuna · 2018-04-06T15:56:11Z

Comment author: @alainfrisch

I think it is a well-known fact that a program should not use the generic demarshaller on untrusted data. This should perhaps be made a bit more explicit, but the document of Marshal already says:

   Anything can happen at run-time if the object in the file does not
   belong to the given type.

which is often used for network operations and interprocess communication.

Are you aware of actual uses of Marshal for such operations between agents that don't trust each other?

vicuna · 2018-04-06T16:00:31Z

Comment author: maxdos

Well sending corrupt data over the network should not lead to a memory corruption of my bigarray buffer ?
Of course anything can happen in a sense of wrong behaviour (segfault, exception)
But in a broader sense following this argumentation would mean ocaml is not safe for any usage over network in general
Which is not the case right ?

Yes the one i created in the example, and of course this error could allow any ocaml implementation based on this library cause to be exploitable through it

vicuna · 2018-04-06T16:03:48Z

Comment author: maxdos

I have to point out that you guys did a great job in all the other cases
The malloc and not guarded integer multiplication at this point clearly seems to be a glitch
check out this line in the same file

ocaml/byterun/bigarray.c

Line 108 in ea60609

if (caml_umul_overflow(num_elts, dimcopy[i], &num_elts))

Where it is implemented safe

vicuna · 2018-04-06T16:18:28Z

Comment author: @alainfrisch

I agree it is worth protecting against this specific source of overflow.

But let me insist: Marshal is known to not be safe, and should really not be used for communication if you don't trust the other end.

this argumentation would mean ocaml is not safe for any usage over network in general

How do you draw such a conclusion? If you read an integer from an unstrusted source and call Obj.magic on it to cast it to a bytes values, you can obviously let the "attacker" read/write arbitrary memory. It only means that you should not do that. And as far as I know, nobody demarshals data received from untrusted counterparties.

vicuna · 2018-04-06T16:24:36Z

Comment author: maxdos

I agree it is worth protecting against this specific source of overflow.

Yes that is my main point, you guys did great work in protecting the rest of it so why not fix this with the protection you already implemented and make ocamls stdlib more secure :)

Also i agree that serializing over network is always dangerous

But imagine scenarios of man-in-the-middle where a package you are serializing can be corrupted or maybe you unserialize a corrupted file you got through
someone or some interface
Also we are not casting anything to byte values here but just using "black box" functions of bigarray (namely input_value which seems pretty innocent)
which is dangerous even if you warn the users
And another (admitably picky) point: im not sending a wrong type but a corrupt object of the correct type :D

and overall its a small patch which adds great security to your already pretty secure marshaling interface :)

vicuna · 2018-04-06T17:06:24Z

Comment author: @alainfrisch

Even when this will be fixed, demarshaling will not really become more safe against corruption or attacks. If you expect to unmarshal a value of type "bytes" but the attacker sends a marshaled integer, you will easily end up writing at arbitrary memory location controlled by the attacker.

vicuna · 2018-04-06T17:56:44Z

Comment author: @xavierleroy

Well spotted, thanks! We'll fix this overflow issue by the next release, of course.

@Frisch is correct that there are other ways to abuse unmarshaling at the level of the OCaml application. But those can often be avoided by post-unmarshaling data validation, for example like in Gregoire Henry's PhD work. In contrast, unsafety in the run-time system is fatal and cannot be worked around. So we'll fix the unsafety ASAP.

vicuna · 2018-04-10T07:11:17Z

Comment author: Richard Jones

Just reviewing this issue for Fedora & RHEL. It looks like there isn't yet an upstream patch, is that right?

vicuna · 2018-04-10T07:32:42Z

Comment author: @alainfrisch

In contrast, unsafety in the run-time system is fatal and cannot be worked around.

Which safety properties do we expect from the demarshaler, applied to a block crafted by the attacker?

I don't see any protection against arbitrary memory reads in the demarshaler. For instance, CODE_STRING64 will get an arbitrary 64-bit integer length, and happily read (with readblock) that much memory from the input buffer without bound checks. If arbitrary memory reads are not bad enough, the same code also do:

   Field(v, size - 1) = 0;

or things like:

   *intern_dest = Make_header_allocated_here(Double_wosize, Double_tag,
                                              intern_color);

which could write at an attacker-controlled memory location (nothing prevents intern_dest from overflowing the allocated target buffer).

Also, the attacker could trivially force the whole program to stop (just emit CODE_SHARED8 followed by \0).

vicuna · 2018-04-10T10:09:07Z

Comment author: maxdos

@Frisch: but there we are talking about custom object correct ?
I think this is a different since if you allow custom objects you are risking more.

vicuna · 2018-04-10T12:14:00Z

Comment author: @alainfrisch

CODE_STRING64 is not related to custom objects. The problem you reported (with bigarrays) is indeed, but I think this is irrelevant, since support for custom objects cannot be disabled.

vicuna · 2018-04-11T11:29:25Z

Comment author: @xavierleroy

Proposed fix: #1718

vicuna · 2018-04-12T07:37:40Z

Comment author: @xavierleroy

Fix reviewed and merged, will be in release 4.07.0.

@Frisch is right: other sanity checks are missing in the unmarshaler and should be investigated soon.

vicuna · 2018-04-13T08:44:27Z

Comment author: @alainfrisch

@Frisch is right: other sanity checks are missing in the unmarshaler and should be investigated soon.

I'm not against adding more check, but this will add quite a bit of weight to the code (I expect the impact on performance to remain small, though).

But I'm dubious at the idea of relying on post hoc checks of the unmarshal data to ensure security. I can see some use of such techniques to reduce the risk of versioning or corruption problems, but advertising this as a secure approach seems dangerous to me.

Validation of unmarshaled data requires a runtime representation of the data structure definition, and involve some non negligible runtime cost. At this point, why would one want to rely on the generic marshaling rather than a type-driven serialization (which could be more compact, btw)? The difficult part would be to deal with sharing data, but this is also quite difficult for the validator, as it needs to reject sharing of mutable data with different types.

I'd feel much more comfortable if we advised to "never unmarshal untrusted data" (and similarly, "never dynlink untrusted code", if this is not obvious).

vicuna · 2018-04-13T12:19:06Z

Comment author: maxdos

I understand your argument regarding runtime
But the point is that the way the marshal module is used it implies security
Since the user relies on that you check for example array sizes and so on
(not talking about custom objects here)

Your suggestions sounds like: Why dont we say ocaml is not safe to use over network and then we are done ?

vicuna · 2018-04-13T12:36:50Z

Comment author: @alainfrisch

But the point is that the way the marshal module is used it implies security

Absolutely not, and that's because you were somehow lead to believe that that I believe it's important to be clear in the documentation.

Marshal should not used in contexts where an attacker can control the data. I don't believe it is, at least in any project I'm aware of, and if it were, it's unlikely that those project perform enough check on the result of Marshal to make the use safe anyway.

Why dont we say ocaml is not safe to use over network and then we are done ?

This has nothing to do with the use of ocaml over the network, it's about using a library documented to be unsafe on untrusted data. Of course, it's a bad idea, but this is not breaking news.

vicuna · 2018-04-13T12:39:38Z

Comment author: @alainfrisch

Again: are you aware of any use of Marshal to pass data over the network, in a context where you don't trust the other side ?

Malicious or corrupted marshaled data can result in a bigarray with impossibly large dimensions that cause overflow when computing the in-memory size of the bigarray. Disaster ensues when the data is read in a too small memory area. This commit checks for overflows when computing the in-memory size of the bigarray. This patch is based on one by Xavier Leroy and has been modified to use caml_ba_multov instead of caml_umul_overflow which is unavailable in OCaml 4.05.0. Origin: ocaml#1718 Bug: ocaml#7765 Bug-Debian: https://bugs.debian.org/895472 Bug-CVE: CVE-2018-9838

Malicious or corrupted marshaled data can result in a bigarray with impossibly large dimensions that cause overflow when computing the in-memory size of the bigarray. Disaster ensues when the data is read in a too small memory area. This commit checks for overflows when computing the in-memory size of the bigarray. This patch is based on one by Xavier Leroy and has been modified to use caml_ba_multov instead of caml_umul_overflow which is unavailable in OCaml 4.05.0. The original commit hash is 85162ee. Origin: ocaml#1718 Bug: ocaml#7765 Bug-Debian: https://bugs.debian.org/895472 Bug-CVE: CVE-2018-9838

vicuna closed this as completed Apr 12, 2018

vicuna added high-priority stdlib labels Mar 14, 2019

vicuna added this to the 4.07.0 milestone Mar 14, 2019

vicuna added the bug label Mar 20, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A integer overflow in the big array serialization module leads to arbitrary code execution #7765

A integer overflow in the big array serialization module leads to arbitrary code execution #7765

vicuna commented Apr 6, 2018

vicuna commented Apr 6, 2018

vicuna commented Apr 6, 2018

vicuna commented Apr 6, 2018

vicuna commented Apr 6, 2018

vicuna commented Apr 6, 2018

vicuna commented Apr 6, 2018

vicuna commented Apr 6, 2018

vicuna commented Apr 10, 2018

vicuna commented Apr 10, 2018

vicuna commented Apr 10, 2018

vicuna commented Apr 10, 2018

vicuna commented Apr 11, 2018

vicuna commented Apr 12, 2018

vicuna commented Apr 13, 2018

vicuna commented Apr 13, 2018

vicuna commented Apr 13, 2018

vicuna commented Apr 13, 2018

A integer overflow in the big array serialization module leads to arbitrary code execution #7765

A integer overflow in the big array serialization module leads to arbitrary code execution #7765

Comments

vicuna commented Apr 6, 2018

Bug description

Steps to reproduce

Additional information

vicuna commented Apr 6, 2018

vicuna commented Apr 6, 2018

vicuna commented Apr 6, 2018

vicuna commented Apr 6, 2018

vicuna commented Apr 6, 2018

vicuna commented Apr 6, 2018

vicuna commented Apr 6, 2018

vicuna commented Apr 10, 2018

vicuna commented Apr 10, 2018

vicuna commented Apr 10, 2018

vicuna commented Apr 10, 2018

vicuna commented Apr 11, 2018

vicuna commented Apr 12, 2018

vicuna commented Apr 13, 2018

vicuna commented Apr 13, 2018

vicuna commented Apr 13, 2018

vicuna commented Apr 13, 2018