diff options
Diffstat (limited to 'content/articles')
-rw-r--r-- | content/articles/2022-08-29-blog-start.md | 9 | ||||
-rw-r--r-- | content/articles/2022-08-29-blog-test.md | 41 | ||||
-rw-r--r-- | content/articles/2022-08-30-blog-tools.md | 104 | ||||
-rw-r--r-- | content/articles/2022-09-19-gctf-readysetaction.md | 84 | ||||
-rw-r--r-- | content/articles/2022-09-25-ductf-file-magic.md | 255 | ||||
-rw-r--r-- | content/articles/2022-11-07-programming-language-design.md | 70 | ||||
-rw-r--r-- | content/articles/2022-11-10-artist-correlation.md | 78 |
7 files changed, 0 insertions, 641 deletions
diff --git a/content/articles/2022-08-29-blog-start.md b/content/articles/2022-08-29-blog-start.md deleted file mode 100644 index 2f17e6c..0000000 --- a/content/articles/2022-08-29-blog-start.md +++ /dev/null @@ -1,9 +0,0 @@ -# Starting my blog - -This will be the place where I will post short articles about my thoughts and projects in the future, whenever I feel like it. - -A list of topics of which I already have an idea in mind are: - -- The tools I used to create this blog -- My dream operating system (maybe even logs about me trying (aka failing) to implement it) -- In search of a perfect text editor diff --git a/content/articles/2022-08-29-blog-test.md b/content/articles/2022-08-29-blog-test.md deleted file mode 100644 index 1c4a28a..0000000 --- a/content/articles/2022-08-29-blog-test.md +++ /dev/null @@ -1,41 +0,0 @@ -# Blog test - -Hello world! - -## heading - -_italic_ **bold** `code` [Link](https://metamuffin.org/) - -### sub heading - -Look at this code here. - -```rs -#[derive(Default)] -struct Impossible<T> { - something: HashMap<usize, T>, - // ^- look here, a keyword! --the grammar regex - itself: Impossible, -} - -pub async fn useless(s: Box<dyn Write>) -> impl Future<usize> { - todo!() // this panics -} -fn main() { - 1 + "blub" - for (x, y) in [(1,2),(3,4)] { - println!("{x} and {y}"); - Impossible::default(); - } -} -``` - -As you can see, its pointless i.e. no points - -- list -- list -- list - -1. first this -2. then that -3. done diff --git a/content/articles/2022-08-30-blog-tools.md b/content/articles/2022-08-30-blog-tools.md deleted file mode 100644 index 81d1ff9..0000000 --- a/content/articles/2022-08-30-blog-tools.md +++ /dev/null @@ -1,104 +0,0 @@ -# Tools I use to create this blog - -As you might expect, this blog uses some overengineered tools, after all, it -wouldn't be so much fun otherwise. This article is probably not too interesting -on its own. It's just "documentation". - -// TODO upload source code and link it here. - -## "Build system" - -```tree -. -├── ... (readme 'n stuff) -├── code -│ ├── Cargo.toml -│ └── src -│ └── ... (rust source) -└── content - ├── articles - │ ├── 2022-08-29-blog-start.md - │ ├── 2022-08-29-blog-test.md - │ └── 2022-08-30-blog-tools.md - ├── makefile - ├── out - │ ├── ... (generated files) - └── style.css -``` - -The entry point to this "build system" is `content/makefile`, it has rules for -generating HTM from the markdown sources, the index and the atom feed. The -"compiler" here is a small rust program. (`code`). - -```makefile -# oversimplifed, doesnt work -TOOL := blog-tool - -out/index: $(ALL_ARTICLES) $(TOOL) - $(TOOL) render-index > $@ -out/feed.atom: $(ALL_ARTICLES) $(TOOL) - $(TOOL) generate-atom > $@ -out/%: articles/%.md $(TOOL) - $(TOOL) render-article $< > $@ -``` - -A small trick here is, to make everything depend on the compiler (`$(TOOL)`) -too, so that when it changes, re-generation of all articles if triggered. - -## Rust tools - -I use the [laby](https://crates.io/crates/laby) crate for templating the HTM. - -// TODO what is important here?! - -### File server - -I wanted to serve all files without the file name extension (`.html`), but my -previous http server ([http-server](https://github.com/http-party/http-server)) -exclusively inferred the MIME type from the extension, which makes it impossible -to do that. The obvious solution was to reinvent the wheel. - -This had the great side-effect of making my website blazing fast 🚀🚀! :) - -#### http-server (node.js) - -``` -Running 10s test @ http://127.0.0.1:8080/test-file - 2 threads and 100 connections -Requests/sec: 2314.00 -Transfer/sec: 725.38KB -``` - -#### fileserver (rust, no caching) - -``` -Running 10s test @ http://127.0.0.1:8080/test-file - 2 threads and 100 connections -Requests/sec: 24464.69 -Transfer/sec: 3.10MB -``` - -// TODO also upload source code and link it - -### Syntax highlighing - -For that, I chose the crate [synoptic](https://crates.io/crates/synoptic). It -provides a some functions for defining tokens with a regex, then returning -start- and end-points of those for further processing. For example, this is how -i defined comments and types: - -```rs -let rust_grammar = &[ - (&["(?m)(//.*)$"], "comment"), - (&["[A-Z][a-z]*", "bool", "usize", /* simplified */ ], "type"), - /* more rules */ -] -``` - -The library finds all the tokens and lets me serialize them to HTM. - -## End - -I still need to learn, how to write this blog well and what is the most -interesting to read. Please give me some advice or commentary via matrix, mail -or fedi. (see [contact](https://metamuffin.org/contact)) diff --git a/content/articles/2022-09-19-gctf-readysetaction.md b/content/articles/2022-09-19-gctf-readysetaction.md deleted file mode 100644 index 0d5b719..0000000 --- a/content/articles/2022-09-19-gctf-readysetaction.md +++ /dev/null @@ -1,84 +0,0 @@ -# Google Beginner CTF: ReadySetAction - -**SPOILER WARNING: Go play the CTF at -[capturetheflag.withgoogle.com](https://capturetheflag.withgoogle.com/) first** - -The challenge was a short python script with the following code: - -```py -from Crypto.Util.number import * - -flag = b"REDACTED" - -p = getPrime(1024) -q = getPrime(1024) -n = p*q - -m = bytes_to_long(flag) - - -c = pow(m,3,n) - -print(c) -print(n) -#154780 ... 6709 -#210348 ... 4477 -#(i removed both 617 digit long numbers here) -``` - -It was immediatly obvious to me that this has to do with RSA cryptography -because it multiplies large primes and then does some powmod magic. Because I -didn't remember exactly how RSA worked, I quickly read the Wikipedia article -again - specifically section -[4.1 Attacks against plain RSA](https://en.wikipedia.org/wiki/RSA_%29cryptosystem%29#Attacks_against_plain_RSA) - -The first item already seems like the solution: - -> When encrypting with low encryption exponents (e.g., e = 3) and small values -> of the m (i.e., m < n1/e), the result of me is strictly less than the modulus -> n. In this case, ciphertexts can be decrypted easily by taking the eth root of -> the ciphertext over the integers. - -Which would mean that $\sqrt[3]{c}$ is a solution _if_ $m$ is small enough. -However - -$c = m^3 \mod n \\$ - -so the exponentiation could yield a much higher result that we dont get because -it is wrapped at $n$. So we can try out how often it wrapped: - -$m = \sqrt[3]{c + x*n} \quad x \in \N\\$ - -where $x$ can be brute-forced if $m$ is sufficiently small. Back in python it -looks like this: - -```py -c = ... -n = ... - -# simple but bad binary search for n-th root -def nth_root(c,e): - high = 1 - while high**e < c: - high *= 2 - low = 0 - mid = 0 - last_mid = 0 - while mid**e != c: - mid = (low + high) // 2 - if last_mid == mid: - return 0 # i use 0 to indicate a failure - if mid**e > c: - high = mid - else: - low = mid - last_mid = mid - return mid - -for x in range(100000): - m = nth_root(c + n*x, e) - # the probability of finding a perfect cube number is low, so any result is what we want - if m != 0: print(long_to_bytes(m)) -``` - -Thats it! With $x=1831$ we get our flag: `CTF{34sy_RS4_1s_e4sy_us3}` diff --git a/content/articles/2022-09-25-ductf-file-magic.md b/content/articles/2022-09-25-ductf-file-magic.md deleted file mode 100644 index f4b55c9..0000000 --- a/content/articles/2022-09-25-ductf-file-magic.md +++ /dev/null @@ -1,255 +0,0 @@ -# DownUnderCTF 2022: File Magic - -A short writeup about my favorite challenge from DUCTF. It took me approximatly -12h to solve. I found it was the most creative and challenging one that i -solved. - -## Task - -The challenge consists of a python script and an ip-port pair which appears to -be running that script. Also the path of the flag is given: `./flag.txt` - -```py -#!/usr/bin/env python3 - -from Crypto.Cipher import AES -from PIL import Image -from tempfile import NamedTemporaryFile -from io import BytesIO -import subprocess, os - -KEY = b'downunderctf2022' - -iv = bytes.fromhex(input('iv (hex): ')) -assert len(iv) == 16 and b'DUCTF' in iv, 'Invalid IV' - -data = bytes.fromhex(input('file (hex): ')) -assert len(data) % 16 == 0, 'Misaligned file length' -assert len(data) < 1337, 'Oversized file length' - -data_buf = BytesIO(data) -img = Image.open(data_buf, formats=['jpeg']) -assert img.width == 13 and img.height == 37, 'Invalid image size' -assert img.getpixel((7, 7)) == (7, 7, 7), 'Invalid image contents' - -aes = AES.new(KEY, iv=iv, mode=AES.MODE_CBC) -dec = aes.decrypt(data) -assert dec.startswith(b'\x7fELF'), 'Not an ELF' - -f = NamedTemporaryFile(delete=False) -f.write(dec) -f.close() - -os.chmod(f.name, 0o777) -pipes = subprocess.Popen([f.name], stdout=subprocess.PIPE) -stdout, _ = pipes.communicate() -print(stdout.decode()) - -os.remove(f.name) -``` - -So, for anything to make it past these checks and be executed it must: - -1. be a valid 13x37 JPEG image with the pixel at 7,7 set to `#070707` -2. be a valid ELF binary that reads `./flag.txt` after decrypting with AES CBC, - fixed key (`downunderctf2022`) and the provided IV -3. The IV must contain `DUCTF` - -During the competition I discovered the information in the next three headings -in parallel but internally in-order. - -## 1. AES CBC "flaw" - -We need to generate a file that is a sort-of polyglot with JPEG and ELF, -converted with AES CBC (Cipher block chaining). - -AES itself operates on 16-byte (for 128-bit AES) blocks, so bigger files are -split and then encrypted separately. To ensure that identical blocks don't -result in identical blocks in ciphertext, each block is first xor'd with -something that won't be identical. In the case of CBC, the last ciphertext block -or the initialisation vector (IV) is used. Here is a diagram for encryption - -``` - ___plaintext____|___plaintext____|___plaintext____|... - v v v -IV--->XOR ,---------->XOR ,--------->XOR ,---- ... - v | v | v | - AES | AES | AES | - v---' v---' v---' - ___ciphertext___|___ciphertext___|___ciphertext___|... -``` - -For decryption we can just flip the diagram and replace AES with reverse AES. - -``` - ___ciphertext___|___ciphertext___|___ciphertext___|... - v---, v---, v---, - ∀EZ | ∀EZ | ∀EZ | - v | v | v | -IV--->XOR '---------->XOR '--------->XOR '---- ... - v v v - ___plaintext____|___plaintext____|___plaintext____|... -``` - -This does make sense, however i noticed that all but the first block do not -depend on IV. This turns out useful since we can turn any block into any other -block by applying a chosen value with XOR. So we can control the ciphertext with -the IV directly as follows: - -- $m$: first plaintext block -- $c$: first ciphertext block - -$$ c = AES(m \oplus IV) \\ - -AES^{-1}(c) = m \oplus IV \\ - -AES^{-1}(c) \oplus m = IV $$ - -All blocks in ciphertext after the first are now "uncontrollable" because IV and -plaintext are set. - -## 2. JPEG - -JPEG consists of a list of _segments_. Each starts with a marker byte (`ff`) -followed by a identifier and the length of the segment (if non-zero). - -| Identifier | Name | -| ---------- | ----------------------------------------------- | -| `d8` | Start of Image | -| `fe` | Comment | -| `d9` | End of Image | -| ... | _a bunch more that you dont need to know about_ | - -The comment segment is perfect for embedding our ELF binary into JPEG. We can -first generate a JPEG image, then insert a _comment_ somewhere containing any -data we want. - -## 3. ELF Payload - -The binary needs to be super small so creating it "manually" was required. I -followed the guide -[Creating a minimal ELF-64 file by tchajed](https://github.com/tchajed/minimal-elf/) -and recreated it for my needs. Like in the guide i also wrote the assembly with -a rust EDSL. - -```rs -let mut str_end = a.create_label(); -let mut filename = a.create_label(); -a.jmp(str_end)?; // jump over the string -a.set_label(&mut filename)?; -a.db(b"flag.txt\0")?; -a.set_label(&mut str_end)?; - -// open("flag.txt", O_RDONLY) -a.mov(eax, 2)?; -a.lea(rdi, ptr(filename))?; -a.mov(rsi, 0u64)?; -a.syscall()?; // fd -> rax - -// sendfile(1, rax, 0, 0xff) -a.mov(rsi, rax)?; -a.mov(eax, 40)?; -a.mov(rdi, 1u64)?; -a.mov(rdx, 0u64)?; -a.mov(r10, 0xffu64)?; -a.syscall()?; -``` - -I was able to produce a 207 byte long executable. - -## 4. _magic!_ - -Here is how we align the files now (single quotes `'` indicate ASCII -representation for clarity, question marks `?` represent data that is implicitly -defined): - -``` -plaintext: 7f 'E 'L 'F 02 01 01 00 00 00 00 00 00 00 00 00 -iv: ?? ?? ?? ?? ?? ?? 'D 'U 'C 'T 'F 00 00 00 00 00 -ciphertext: ff d8 ff fe {len} ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? -``` - -This scenario is a little more complicated because in some places we define -ciphertext and plaintext and in some we define ciphertext and IV. This is not a -problem because XOR operates on every byte individually. - -All-in-all, the file looks like this now: - -- First block: - - overlapping headers (6 bytes) - - `DUCTF` string (5 bytes) - - padding (5 bytes) -- Rest of the ELF binary -- _end of the jpeg comment_ -- Rest of the JPEG image - -All this information should be enough to solve this challenge! - -I here attach the implementation that i created _during_ the CTF here. I kept it -as messy as it was, just removed not-so-interesting code and added comments. - -```rs -let mut i = image::RgbImage::new(13, 37); -// jpeg is lossy so filling guarantees the pixel actually has the *exact* color -i.pixels_mut().for_each(|p| *p = Rgb([7, 7, 7])); - -let mut imgbuf = Cursor::new(vec![]); -image::codecs::jpeg::JpegEncoder::new(&mut imgbuf) - .encode_image(&i) - .unwrap(); - -let key = Aes128::new(&GenericArray::from(*b"downunderctf2022")); -let mut payload = create_elf_payload(); -let mut buf = BytesMut::new(); - -// pad payload so AES works -while payload.len() % 16 != 0 { - payload.put_u8(0); -} -assert!(payload.len() % 16 == 0); - -let prefix_len = 6; -// write the JPEG headers to start a comment -buf.put_u16(0xffd8); -buf.put_u16(0xfffe); -buf.put_u16(payload.len() as u16 + 2 /*seg len*/ - prefix_len as u16); - -// find a good IV -let iv = { - let mut fblock_target = vec![0u8; 16]; - fblock_target[0..prefix_len].copy_from_slice(&buf[0..prefix_len]); - key.decrypt_block(GenericArray::from_mut_slice(&mut fblock_target)); - let mut iv = xor_slice(&fblock_target, &payload[0..16]); - iv[prefix_len..prefix_len + 5].copy_from_slice(b"DUCTF"); - iv -}; - -// fill the first block up with zeroes -for _ in prefix_len..16 { - buf.put_u8(0); -} - -// encrypt starting with the 2nd block -buf.put(&payload[16..]); -let block_range = |n: usize| (n) * 16..(n + 1) * 16; -for i in 1..buf.len() / 16 { - let tb = Vec::from(&buf[block_range(i - 1)]); - xor_assign(&mut buf[block_range(i)], &tb); - key.encrypt_block(GenericArray::from_mut_slice(&mut buf[block_range(1)])); -} - -// append the rest of the image, stripping the first segment -buf.put(&imgbuf.into_inner()[2..]); - -// pad the final buffer again because the image might not be aligned -while buf.len() % 16 != 0 { - buf.put_u8(0); -} -assert!(buf.len() < 1337); - -File::create("image").unwrap().write_all(&buf).unwrap(); -File::create("iv").unwrap().write_all(&iv).unwrap(); -``` - -I am also still looking for team mates for upcoming CTF events and would be -happy to hack together with you! Just [contact](https://metamuffin.org/contact) me. diff --git a/content/articles/2022-11-07-programming-language-design.md b/content/articles/2022-11-07-programming-language-design.md deleted file mode 100644 index 674e696..0000000 --- a/content/articles/2022-11-07-programming-language-design.md +++ /dev/null @@ -1,70 +0,0 @@ -# Some Thoughts on Programming Language Design - -This is a collection of ideas to look at when inventing new langauges. - -## Other Ideas - -- The Language pushes abstraction to the limit by not noting any - hardware-related issues like memory-allocations, parallelism, heterogenous - computer architecture (CPU, GPU, …) - - requires a very "intellegent" compiler and a way to describe unknowns like - possible inputs, etc. in the language itself -- Start with assembly but add a very flexible macro system - -## Type System - -```diff -# Haskell -data LinkedList a = Nil | Cons a (Box (LinkedList a)) -data Test = Empty | Blub Int | State { x :: Int, y :: Int } -# Rust -enum LinkedList<T> { Nil, Cons(T, LinkedList<T>) } -``` - -## Memory Management - -- **Drop when out-of-scope** -- Garbage collections -- Reference counting - -## Compile-time logic - -- Annotation when calling function to be run as-far-as-possible at comptime - -```diff -fn format(template: String, args: [String]) -> String { - template.replace("@", (match, i) => args[i]) -} - -fun add(x, y) x + y - -fun main() print(format!("@ ist @; @", ["1+1", 1+1, x])) -# should expand to -fun main() print("1+1 ist 2; " ~ x)) -``` - -## Examples - -### Fizz-Buzz - -```diff -for (n in 0..100) { - if (n % (3*5) == 0) print("FizzBuzz") - else if (n % 3 == 0) print("Fizz") - else if (n % 5 == 0) print("Buzz") - else print(n) -} - - -if (true) x = 1 -if (true) { x = 1 } -``` - -``` -f(x) = 10 + g(x) -f x = 10 + g x - -main = { - -} -``` diff --git a/content/articles/2022-11-10-artist-correlation.md b/content/articles/2022-11-10-artist-correlation.md deleted file mode 100644 index d52e613..0000000 --- a/content/articles/2022-11-10-artist-correlation.md +++ /dev/null @@ -1,78 +0,0 @@ -# Correlating music artists - -I listen to a lot of music and so every few months my music collection gets -boring again. So far I have asked friends to recommend me music but I am running -out of friend too now. Therefore I came up with a new solution during a few -days. - -I want to find new music that i might like too. After some research I found that -there is [Musicbrainz](https://musicbrainz.org/) (a database of all artists and -recordings ever made) and [Listenbrainz](https://listenbrainz.org/) (a service -to which you can submit what you are listening to). Both databases are useful -for this project. The high-level goal is to know, what people that have a lot of -music in common with me, like to listen to. For that the shared number of -listeners for each artist is relevant. I use the word 'a listen', to refer to -one playthrough of a track. - -## The Procedure - -### Parse data & drop unnecessary detail - -All of the JSON files of listenbrainz are parsed and only information about how -many listens each user has submitted for what artist are kept. The result is -stored in a B-tree map on my disk (the -[sled library](https://crates.io/crates/sledg) is great for that). - -- First mapping created: `(user, artist) -> shared listens`. -- (Also created a name lookup: `artist -> artist name`) - -The B-Tree stores values ordered, such that i can iterate through all artists of -a user, by scanning the prefix `(user, …`. - -### Create a graph - -Next an undirected graph with weighted edges is generated where nodes are -artists and edges are shared listens. For each user, each edge connecting -artists they listen to, the weight is incremented by the sum of the logarhythms -of either one's playthrough count for that user. This means that artists that -share listeners are connected and because of the logarhythms, users that listen -to an artist _a lot_ won't be weighted proportionally. - -Mapping: `(artist, artist) -> weight`. (Every key `(x, y)` is identical with -`(y, x)` so that edges are undirectional.) - -### Query artists - -The graph tree can now be queried by scanning with a prefix of one artist -(`("The Beatles", …`) and all correlated artists are returned with a weight. The -top-weighted results are kept and saved. - -### Notes - -Two issues appeared during this project that lead to the following fixes: - -- Limit one identity to 32 artists at most because the edge count grows - quadratically (100 artists -> 10000 edges) -- When parsing data the user id is made dependent of the time to seperate arists - when music tastes changing over time. Every 10Ms (~4 months) the user ids - change. - -## Results - -In a couple of minutes I rendered about 2.2 million HTML documents with my -results. They are available at `https://metamuffin.org/artist-correl/{name}`. -Some example links: - -- [The Beatles](https://metamuffin.org/artist-correl/The%20Beatles) -- [Aimer](https://metamuffin.org/artist-correl/Aimer) -- [Rammstein](https://metamuffin.org/artist-correl/Rammstein) -- [Mitski](https://metamuffin.org/artist-correl/Mitski) - -## Numbers - -- Musicbrainz: 15GB -- Listenbrainz: 350GB -- Extracted listening data: 23GB -- Graph: 56GB -- Rendered HTML: 2.3GB -- Compressed HTML (squashfs with zstd): 172MB |