init astro

Michael Zhang 2023-08-30 19:30:45 -05:00
parent 98d3dddc41
commit 4b853c6c86
234 changed files with 7394 additions and 58487 deletions

.gitignore vendored

@ -1,12 +1,21 @@
# build output
# generated types
# dependencies
# logs
# environment variables
# macOS-specific files

.vscode/extensions.json vendored Normal file

@ -0,0 +1,4 @
{
  "recommendations": ["astro-build.astro-vscode"],
  "unwantedRecommendations": []
}

.vscode/launch.json vendored Normal file

@ -0,0 +1,11 @
{
  "version": "0.2.0",
  "configurations": [
    {
      "command": "./node_modules/.bin/astro dev",
      "name": "Development server",
      "request": "launch",
      "type": "node-terminal"
    }
  ]
}


@ -1,5 +0,0 @@
hugo serve --bind --port 8313 --buildDrafts
wget --spider -r -nd -nv -H -l 1 http://localhost:1313


@ -1,14 +1,68 @@
# Blog
# Astro Starter Kit: Blog
![Build status](
npm create astro@latest -- --template blog
Powers [][1]. Public replies at [~mzhang/public-inbox][2].
[![Open in StackBlitz](](
[![Open with CodeSandbox](](
[![Open in GitHub Codespaces](](
Standard hugo site.
> 🧑‍🚀 **Seasoned astronaut?** Delete this file. Have fun!
Code License: GPL3
Content License: CC BY-SA 4.0
- ✅ Minimal styling (make it your own!)
- ✅ 100/100 Lighthouse performance
- ✅ SEO-friendly with canonical URLs and OpenGraph data
- ✅ Sitemap support
- ✅ RSS Feed support
- ✅ Markdown & MDX support
## 🚀 Project Structure
Inside of your Astro project, you'll see the following folders and files:
├── public/
├── src/
│   ├── components/
│   ├── content/
│   ├── layouts/
│   └── pages/
├── astro.config.mjs
├── package.json
└── tsconfig.json
Astro looks for `.astro` or `.md` files in the `src/pages/` directory. Each page is exposed as a route based on its file name.
There's nothing special about `src/components/`, but that's where we like to put any Astro/React/Vue/Svelte/Preact components.
The `src/content/` directory contains "collections" of related Markdown and MDX documents. Use `getCollection()` to retrieve posts from `src/content/blog/`, and type-check your frontmatter using an optional schema. See [Astro's Content Collections docs]( to learn more.
Any static assets, like images, can be placed in the `public/` directory.
## 🧞 Commands
All commands are run from the root of the project, from a terminal:
| Command | Action |
| :------------------------ | :----------------------------------------------- |
| `npm install` | Installs dependencies |
| `npm run dev` | Starts local dev server at `localhost:4321` |
| `npm run build` | Build your production site to `./dist/` |
| `npm run preview` | Preview your build locally, before deploying |
| `npm run astro ...` | Run CLI commands like `astro add`, `astro check` |
| `npm run astro -- --help` | Get help using the Astro CLI |
## 👀 Want to learn more?
Check out [our documentation]( or jump into our [Discord server](
## Credit
This theme is based off of the lovely [Bear Blog](

Fork Awesome 1.2.0
License -
Copyright 2018 Dave Gandy & Fork Awesome
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
@import "variables";
@import "mixins";
@import "functions";
@import "path";
@import "core";
@import "larger";
@import "fixed-width";
@import "list";
@import "bordered-pulled";
@import "animated";
@import "rotated-flipped";
@import "stacked";
@import "icons";
@import "screen-reader";


$fa-font-path: "/fork-awesome";
@import "./fork-awesome/fork-awesome.scss";
@import "grid";
@import "agda";
@import "mixins";
@import "fonts";
$sansfont: "Inter", "Helvetica", "Arial", "Liberation Sans", sans-serif;
$monofont: "PragmataPro Mono Liga", "Roboto Mono", "Roboto Mono for Powerline", "Inconsolata", "Consolas", monospace;
// colors
@media (prefers-color-scheme: light) {
    $background-color: white;
    $faded-background-color: darken($background-color, 10%);
    $shadow-color: darken($background-color, 10%);
    $heading-color: darken(royalblue, 10%);
    $text-color: #15202B;
    $small-text-color: #6e707f;
    $smaller-text-color: lighten($text-color, 30%);
    $faded: lightgray;
    $hr-color: lightgray;
    $link-color: royalblue;
    $code-color: firebrick;
    $tag-color: lighten($link-color, 35%);
    @import "syntax";
    @import "content";
}
@media (prefers-color-scheme: dark) {
    $background-color: #202030;
    $faded-background-color: lighten($background-color, 10%);
    $shadow-color: lighten($background-color, 10%);
    $heading-color: lighten(lightskyblue, 20%);
    $text-color: #CDCDCD;
    $small-text-color: darken($text-color, 8%);
    $smaller-text-color: darken($text-color, 12%);
    $faded: #666;
    $hr-color: gray;
    $link-color: lightskyblue;
    $code-color: lighten(firebrick, 25%);
    $tag-color: darken($link-color, 55%);
    @import "syntax";
    @import "content";
}

import { defineConfig } from 'astro/config';
import mdx from '@astrojs/mdx';
import sitemap from '@astrojs/sitemap';
export default defineConfig({
  site: '',
  integrations: [mdx(), sitemap()],
});

@ -1,3 +0,0 @@
name: blog
depend: standard-library cubical
include: content/posts

languageCode = "en-us"
title = "Michael's Blog"
enableGitInfo = true
ignoreFiles = ["logseq"]
enableEmoji = true
tag = "tags"
language = "languages"
endLevel = 4
ordered = false
startLevel = 2
unsafe = true
anchorLineNos = true
codeFences = true
guessSyntax = false
hl_Lines = ''
lineAnchors = ''
lineNoStart = 1
lineNos = true
lineNumbersInTable = true
noClasses = false
style = 'monokai'
tabWidth = 4

@ -1,80 +0,0 @@
title = "About"
weight = 2
type = "generic"
layout = "single"
# About Me
{{< left-nav >}}
<!-- more -->
### Research
Currently, I'm learning about [cubical type theory][cubical]. I'll probably
write some blog posts as I learn more. My advisor is [Favonia].
### University Involvement
During my time at the University of Minnesota, I like to be actively involved in
computing related student groups.
- **[GopherHack]**. I'm one of the founding officers at the GopherHack
organization, hoping to grow a CTF community at the University. I prepare
material for club activities.
- **[PL Seminar]**. A group focused on reading and discussing programming
languages related papers.
- **[UMN Kernel Object]**. A group dedicated to studying operating system
development, created in the aftermath of the UMN Linux kernel controversy.
Previously, I was also involved with:
- **[ACM]**. I was webmaster and wrote the current website, as well as helping
out with other events such as CTF.
- **[SASE]**. I was webmaster and was involved in organizing student group
events as well.
### Open-source Projects
Some of the projects I've been working on in my free time include:
- **[Wisesplit]**. A tool for easily splitting the bill with friends.
- **[Garbage]**. A CLI interface to the trash can API.
- **[Leanshot]**. A Linux screen capture tool.
More can be found on [this page][12] or my public [Gitea][2].
I've also started making an increased effort at using and supporting [FOSS], and
other software that isn't predatory towards users. As a part of this effort,
I'm also self-hosting and rewriting some of the services and software that I use
regularly. Find out what I'm using [here][9].
### Hobbies
I'm also an avid rhythm game player and beatmap creator, mostly involved with
the free-to-play game [osu!]. Check out some of my beatmaps on my osu!
I also enjoy playing badminton &#x1F3F8; at the rec.
[9]: setup
[10]: pgp.txt
[12]: ../projects
[pl seminar]:
[umn kernel object]:

@ -1,78 +0,0 @@


@ -1,12 +0,0 @@
title = "random scripts"
### convert a bunch of `flac`s to `mp3`
function flac2mp3() { ffmpeg -y -i "$1" -acodec libmp3lame "$(basename "$1" .flac).mp3"; }
export -f flac2mp3 # only works in bash
fd "\.flac$" | parallel flac2mp3


@ -1,48 +0,0 @@
title = "Setup"
# Setup
List of software and services I use and endorse, mostly FOSS.
## Desktop
- [**Arch Linux**]( OS with rolling releases.
- [**Home manager** (MIT)]( Dotfile manager.
- [**Firefox** (MPL-2.0)]( Browser.
- [**Thunderbird** (MPL-2.0)]( Email + calendar client.
## Development
- [**Neovim** (Apache-2.0/Vim)]( Text editor.
## Server
- [**NixOS** (MIT)]( Declarative and reproducible operating system.
- [**Hugo** (Apache-2.0)]( Static site generator that powers this site.
- [**Gitea** (MIT)]( Self-hosted git.
## Mobile
- [**DAVx5** (GPL-3.0)]( CalDAV and CardDAV sync for Android.
- [**Gadgetbridge** (AGPL-3.0)]( Smartwatch client.
- [**K-9 Mail** (Apache-2.0)]( Mail client.
- [**Feeder** (GPL-3.0)]( RSS aggregator.
## Music
- [**Navidrome** (GPL-3.0)]( Self-hosted Subsonic-compatible streaming server.
- [**Sublime Music** (GPL-3.0)]( GTK Subsonic-compatible music client.
- [**Subtracks** (GPL-3.0)]( Android Subsonic-compatible music client.
## Services
- [**SourceHut** (AGPL-3.0)]( Git, mailing list, IRC bouncer, etc. hosting.
- [**Element**]( Federated chat provider.
- [**ProtonMail** (Proprietary)]( Encrypted email.
- [**Signal** (GPL-3.0/AGPL-3.0)]( Encrypted chat.
## Games
Mostly from Steam.

@ -1,9 +0,0 @@
title = "Drafts"
weight = 1
hidden = true
type = "drafts"

@ -1,56 +0,0 @@
title = "My new life stack"
date = 2018-02-01
tags = ["arch", "linux", "setup", "computers"]
This is my first post on my new blog! <!--more--> I used to put a CTF challenge writeup here but decided to change it up a bit. Recently, I've been changing a lot of the technology that I use day to day. Here are some of the changes that I've made!
## Operating System
I've run regular Ubuntu on my laptop for a while, then switched to Elementary OS, which I found a lot more pleasing to use. After using Elementary OS for about 6 months, some of the software on my computer started behaving strangely, and I decided it was time for some change.
# michael @ arch in ~ [3:20:09]
$ screenfetch
.o+` michael@arch
`ooo/ OS: Arch Linux
`+oooo: Kernel: x86_64 Linux 4.14.15-1-ARCH
`+oooooo: Uptime: 6h 3m
-+oooooo+: Packages: 546
`/:-:++oooo+: Shell: zsh 5.4.2
`/++++/+++++++: Resolution: 1920x1080
`/++++++++++++++: WM: i3
`/+++ooooooooooooo/` CPU: Intel Core i7-6500U @ 4x 3.1GHz [37.0°C]
./ooosssso++osssssso+` GPU: intel
.oossssso-````/ossssss+` RAM: 2963MiB / 7872MiB
-osssssso. :ssssssso.
:osssssss/ osssso+++.
/ossssssss/ +ssssooo/-
`/ossssso+/:- -:/+osssso+-
`+sso+:-` `.-/+oso:
`++:. `-/+/
.` `/
I installed Arch Linux on my laptop the day before yesterday. I've used Arch Linux before, about a year ago, so setup was relatively familiar. On top of Arch Linux, I'm using the very widely recommended i3 tiling window manager, and urxvt terminal emulator.
## Code Editor
I usually use Sublime Text and Visual Studio Code (VSCode) about equally. Both editors are extremely customizable (and VSCode seems to be heavily inspired by Sublime), but when it comes down to it, VSCode doesn't outperform Sublime. There are many occasions when VSCode just takes forever (for example, when trying to open large codebases, and then automatically running static analyzers over the entire thing).
Since I started using Arch Linux, I've been trying out neovim. I'm packing my configuration with plugins, and seeing how well it works out as my main code editor. If I get really comfortable with it, I'll share my init file on a Git repo, probably.
## Browser... ??
I used to use Chromium, and... I still do. I've tried several alternatives, like Firefox or even Vivaldi, but all of them seem to be missing something. I haven't tried the new Firefox Quantum yet, but unless there's a really big reason for me to change my browser, I'm probably going to stick to Chrome for a while. Chrome's DevTools are by far the best I've used, and its general ease of use makes it my favorite browser.
### cVim
[cVim]( is a nice Chrome extension that provides vim-like keyboard bindings to Chrome. I'm going to have to admit that there are a lot of quirks around using it on pages that have heavy key bindings, but ever since I started using it, I can't help but use j/k scrolling and H/L for back and forth navigation!
I got a droplet off DigitalOcean for hosting things that I regularly depend on. In fact, this blog (running Ghost) is hosted there now! I'm also hosting a Git server over at []( It's running Gitea, a Go-based GitHub alternative. This doesn't mean I'm completely ditching GitHub, I just have things that I _really_ want to keep private, private.


@ -1,12 +0,0 @@
title = "Cleaning up your shell"
date = 2018-02-25
tags = ["computers", "linux", "terminal"]
languages = ["bash"]
Is your shell loading slower than it used to? Maybe you've been sticking a bit more into your `.bashrc`/`.zshrc` than you thought. <!--more-->
It's only been a couple weeks since I set up my computer, and already my shell has been starting to lag. Since there's not that much I've put into my `.zshrc` file, I knew who the main culprits were. Namely, oh-my-zsh's "git" plugin and the nvm (node version manager) trying to load itself on startup. I'm not exactly in a situation where I need nvm most of the time I open my shell, so getting rid of that made my shell load a lot faster. It also means that every time I want to use node or npm, I'd have to manually call nvm, but that's not as important to me as a faster shell load time, especially since I don't really touch node that much.
One trick you can use to see what scripts are being called at startup is the `-x` option (short for xtrace) that popular shells like `bash` and `zsh` support. If you start a new shell with it (`bash -x` or `zsh -x`), or put `set -o xtrace` at the top of your rc file, you'll see it spit out each command as it runs; this is the list of everything that is being run when your shell starts. You might find that some apps take a ridiculous amount of time to start up. These are some of the things you'd want to eliminate.

@ -1,13 +0,0 @@
title = "Fixing tmux colors"
date = 2018-04-23
tags = ["computers"]
Put this in your `~/.tmux.conf`.
set -g default-terminal "screen-256color"
If this isn't set properly, tmux usually assumes 16-color mode, so colors probably won't look like what you're used to.

@ -1,49 +0,0 @@
title = "Web apps"
date = 2018-05-28
tags = ["computers", "web", "things-that-are-bad"]
languages = ["javascript"]
The other day, I just turned off JavaScript from my browser. <!--more--> "fucking neckbeard", "you'll turn it back in 2 weeks", "living without JavaScript is like living without electricity" were some of the responses I got. And they might be right. But let's see why things are the way they are and what we can do about it.
## What is the purpose of the web?
Well, the answer's pretty obvious, right? So you can surf it. But what does that even mean anymore? In the past, surfing the web meant viewing websites. You'd open something like your favorite news website, and it'd show you some of the latest updates. Or maybe you'd open the website for some company to find out their telephone so you can contact them. In other words, it was a channel from which you could receive information.
If you wanted to do anything more complicated or that required more interaction, like sending an email, you'd probably pop open a dedicated client to do it. Things like Microsoft Outlook, Mozilla Thunderbird serve as great email clients. For chat, you could use an IRC client. Hell, even the browser was a client, just for viewing webpages. If you didn't have a client for a service that you wanted to use, you'd download a client, enter in the details of the server you want to connect to, and then you would be off.
Things aren't that way anymore. For some reason, the web browser has become the all-in-one client for every service. Instead of simply acting as an HTTP client, your browser is now also capable of running full-blown 3D games, chat rooms, real-time word processors, and [full x86 emulators, apparently]( What the hell happened?
## Spoiler alert: JavaScript happened
JavaScript happened. That little _scripting_ language invented to, you know, make some hover animation on your page or have dropdowns on your menu bar. Thanks to the introduction of JavaScript (and jQuery especially), developers stopped viewing webpages as Word documents that you can share, and more like canvases. Hover animations are cute and dropdowns are useful. Sure. But when this _scripting_ language starts turning into a _systems_ language (for lack of a better term), you have a problem. When's the last time you used Perl to write an operating system?
Look at the things we do today with JavaScript. We have _full blown frameworks_ that we _compile_ into bundles of _executable code_ in people's browsers. We're basically talking about the equivalent of downloading a binary and executing it on your computer every time you open a _webpage_. Except for a few minor differences. Firstly, it's not really a binary, it's a huge blob of script, which means it must be executed inside some virtual interpreter. For each tab that you're running. Secondly, now you're downloading random scripts from any website that you open, and then [trusting it]( and [running it][1]. You wouldn't hesitate to click a link, but you'd definitely think twice before installing something from an unknown source into your computer, right?
On top of that, look at these huge frameworks that almost every company is hiring developers for: React, Angular, Vue. These frameworks help JavaScript developers develop "web apps", meaning your JavaScript is now responsible for things the browser should actually be doing for you: two-way data binding, template rendering, and more. Except now, you're downloading a script and running it inside of a virtual interpreter. And because of technologies like Webpack that bundle all your separate code files together (read: static linking), our browsers can't even use the same framework code from site to site.
Look at Facebook's home page. Just from regular use, that webpage itself can use over 4 gigabytes of RAM. It makes large amounts of network calls for data that's all just being stored in memory. And everyone who opens the Facebook website (for the first time) must download _all_ of that JavaScript. The website has its own tabs (within the page, yes) for chat windows, games, advertisements, embedded video players, and much more I probably didn't even know about. Why are we running full-blown apps in our browser?
## Ok but what can i do
There are a number of things that can be done to turn this state of the web down a different path. Here are some ideas for users:
- Disable JavaScript in your browser. Grant websites permission to use JavaScript only if they need it. You'd be surprised how many sites work with JavaScript disabled.
- If you're not ready to do that yet or don't want to, consider [uBlock Origin]( It's an extension that can block scripts by source.
For developers:
- [svelte]( is a cool alternative to frameworks like Angular or React.
- Consider the impact of every library you include. Can you do without it? What if you just wrote something from scratch instead of importing a full framework to do it?
- Write more non-JavaScript software/libraries. Developers have only turned towards sticking JavaScript everywhere because it's easy to use, and libraries are readily available through npm.
## Ok but what can you do
I'm helping with a project called flubber, which originated as an IRC bouncer, but is slowly turning into a general messaging protocol. All-in-one messengers exist (and a particular one exists by that name exactly), but they all work by opening a browser view and just loading the page within it, so it's no different from just opening tabs in a browser. Flubber will communicate with these services through APIs, and then expose a uniform interface to clients which makes it easy to bring them all into a single view. Check out my progress [here]( Other than that, I'm also working on making my websites as light as possible in general, including this one which has no required JavaScript (some pages use Katex for displaying math elements but are still readable without).
And of course, I've disabled JavaScript in my browser.
\</rant\> <small>thanks for reading!</small>

@ -1,86 +0,0 @@
title = "Setting up IRC with Weechat"
date = 2018-10-18
tags = ["irc"]
I've just recently discovered that weechat has a "relay" mode, which means it can act as a relay server to other clients (for example, my phone). If I leave an instance of weechat running on, say, my server that's always running, it can act as a bouncer and my phone can receive notifications for highlights as well.
The Android app I'm using is called [Weechat-Android][2]. On my laptop I'm using [Glowing Bear][5].
## Step 1: tmux
To achieve this setup, first I install [tmux][1], which separates the terminal from the session. This means I can leave the weechat instance running in the background and detach my current session from it. The command for this is:
$ tmux new-session -s weechat
where the `-s` option just names the tmux session so it's not assigned some number.
## Step 2: Add relay
Now add a relay through weechat:
/relay add <name> <port>
where `<name>` is the protocol name, optionally prefixed with any of:
- ipv4: force use of IPv4
- ipv6: force use of IPv6
- ssl: enable SSL
according to the [documentation][3].
## Step 2.5: SSL
I'm using SSL on my relay endpoint, and I'd recommend that anyone else use it too. You could follow what the documentation says and generate a self-signed certificate, but getting a trusted certificate with [LetsEncrypt][4] is so easy there's almost no excuse not to do it.
To start, install certbot, which is LetsEncrypt's handy bot that does everything for you. Once you're ready, run:
$ sudo certbot certonly -d <domain>
We want the `certonly` subcommand because by default, certbot will try to install the certificate into an existing HTTP server, but we're not using it for HTTP. This command should dump some files into `/etc/letsencrypt/live/<domain>`.
Finally, just concatenate the important files, `privkey.pem` and `fullchain.pem` in that order, into `~/.weechat/ssl/relay.pem` (you can change that path with `/set`). The file should look like:
If your private key file starts with `BEGIN CERTIFICATE`, just change that to `BEGIN PRIVATE KEY` (change the END one too) and it should be fine.
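The concatenation step can be sketched in Python as follows. This is a minimal illustration, not part of the original setup: the function name `build_relay_pem` and its parameters are hypothetical, and it assumes the two files sit in the certbot `live` directory described above.

```python
from pathlib import Path


def build_relay_pem(live_dir: Path, out_path: Path) -> None:
    """Write privkey.pem followed by fullchain.pem into a single file."""
    parts = [(live_dir / name).read_text() for name in ("privkey.pem", "fullchain.pem")]
    # Make sure each part ends with a newline so the PEM blocks don't run together.
    combined = "".join(p if p.endswith("\n") else p + "\n" for p in parts)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(combined)
    out_path.chmod(0o600)  # the file contains a private key
```

It would be called with the `/etc/letsencrypt/live/<domain>` directory as `live_dir` and `~/.weechat/ssl/relay.pem` as `out_path`; reading the certbot directory typically requires root.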
## Step 3: Set password
Since weechat 1.6, the option to not use a password has been removed. So in order for clients to be able to connect to the server, you must set one using:
/set relay.network.password <password>
The password should appear in asterisks in the weechat prompt box.
## Step 4: Connect
This depends on your setup, but you must make sure that your setup is reachable from the outside. Make sure the port that you chose for the relay is accessible through firewalls.
That's it! If you're also using the Android app to connect, just type in your host's address and password and you should be all good to go.

@ -1,242 +0,0 @@
title = "Twenty years of attacks on rsa with examples"
date = 2018-10-26
toc = true
tags = ["ctf", "crypto", "rsa"]
languages = ["python"]
math = true
There's [a great paper][1] I found by Dan Boneh from 1998 highlighting the
weaknesses of the RSA cryptosystem. I found this paper to be a particularly
enlightening read (and interestingly enough, it's been 20 years since that
paper!), so here I'm going to reiterate some of the attacks described in the
paper, but using examples with numbers in them. <!--more-->
That being said, I _am_ going to skip over the primer of how the RSA
cryptosystem works, since there's already a great number of resources on that.
## Factoring large integers
Obviously this is a pretty bruteforce-ish way to crack the cryptosystem, and
probably won't work in time for you to see the result, but can still be
considered an attack vector. This trick works by just factoring the modulus,
$N$. Once you have the factors of $N$, finding the private exponent $d$ from
the public exponent $e$ is a piece of cake.
Let's choose some small numbers to demonstrate this one (you can follow along in
a Python REPL if you want):
>>> N = 881653369
>>> e = 17
>>> c = 875978376
$N$ is clearly factorable in this case, and we can use resources like
[msieve][7] or [factordb][2] to find its prime factors. Since we now
know that $N = 20717 \times 42557$, we can find the totient of $N$:
>>> p = 20717
>>> q = 42557
>>> tot = (p - 1) * (q - 1)
Now all that's left is to discover the private exponent and solve for the
original message! (you can find the modular inverse function I used [here][3])
>>> d = modinv(e, tot)
>>> pow(c, d, N)
And that's it! Now let's look at some more sophisticated attacks...
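Before moving on, the factoring attack above can be replayed end to end as one script. This is a sketch under one assumption: instead of the linked `modinv` helper, it uses Python's built-in three-argument `pow` with exponent `-1` (available from Python 3.8), which computes the same modular inverse.

```python
# Toy numbers from the factoring example.
N = 881653369
e = 17
c = 875978376
p, q = 20717, 42557          # factors of N
assert p * q == N

tot = (p - 1) * (q - 1)      # Euler's totient of N
d = pow(e, -1, tot)          # modular inverse of e, stands in for modinv(e, tot)
m = pow(c, d, N)             # recovered plaintext as an integer

# Sanity check: re-encrypting the recovered message gives back the ciphertext.
assert pow(m, e, N) == c
```

The final assertion checks the recovery without needing the plaintext in advance: encrypting `m` again must reproduce `c`.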
## Elementary attacks
These attacks are related to the _misuse_ of the RSA system. (if you can't tell,
I'm mirroring the document structure of the original paper)
### Common modulus
My cryptography professor gave this example as well. Suppose there was a setup
in which the modulus was reused, maybe for convenience (although I suppose with
libraries today, it'd actually be more _inconvenient_ to reuse the key). Key
pairs would be issued to different users and they would share public keys with
each other and keep private keys to themselves.
The problem here is if you have a key pair, and you got someone else's public
key, you could easily derive the private key by just factoring the modulus.
Let's see how this works with a real example now.
Since this is a big problem if you were to really use this cryptosystem, I'll be
using actual keys from an actual crypto library instead of the small numbers
like in the first example to show that this works on 2048-bit RSA. The library
is called [PyCrypto][4], and if you're planning on doing anything related to
crypto with Python, it's a good tool to have with you. For now, I'm going to
generate a 2048-bit key (by the way, in practice you probably shouldn't be using
2048-bit keys anymore, I'm just trying to spare my computer here).
>>> from Crypto.PublicKey import RSA
>>> k1 = RSA.generate(2048)
<_RSAobj @0x7f3d3226dfd0 n(2048),e,d,p,q,u,private>
Now, normally when you generate a new key, it'd generate a new modulus. For the
sake of this common modulus attack, we'll force the new key to use the same
modulus. This also means we'll have to choose an exponent $e$ other than the
default choice of 65537 (see [this link][5] for documentation):
>>> N = k1.p * k1.q
>>> e = k1.e
>>> d = k1.d
>>> e2 = 65539
>>> d2 = modinv(e2, (k1.p - 1) * (k1.q - 1))
>>> k2 = RSA.construct((N, e2, d2))
<_RSAobj @0x7f3d31c7c5f8 n(2048),e,d,p,q,u,private>
Ok, now we have two keys, $k_1$ and $k_2$. Now I'll show how using only the public
and private key of $k_1$ (assuming this is the pair that we got legitimately from
the crypto operator), and the public key of $k_2$, which is tied to the same
modulus, we can find the private key of $k_2$.
To do this, we'll try to find the roots of the equation:
$$ f(x) = x^2 - (p + q)x + pq $$
You'll find that both $p$ and $q$ are roots: $f(p) = p^2 - p^2 - qp + pq = 0$,
and $f(q) = q^2 - pq - q^2 + pq = 0$. We know that $N = pq$. How can we
find $p + q$? Since $\phi(N) = (p - 1)(q - 1) = pq - p - q + 1$, we can find
that $\phi(N) = N - (p + q) + 1$, so $p + q = N - \phi(N) + 1$.
Now we need to use $e$ and $d$ to estimate $\phi(N)$. Recall that
$ed \equiv 1 \pmod{\phi(N)}$. This is equivalent to saying $ed = 1 + k\phi(N)$
for some integer $k$. Then $\frac{ed - 1}{\phi(N)} = k$.
It turns out that $k$ is extremely close to $\frac{ed}{N}$:
$$ \frac{ed}{N} = \frac{1 + k\phi(N)}{N} = \frac{1}{N} + \frac{k\phi(N)}{N} $$
$\frac{1}{N}$ is basically 0, and $\phi(N)$ is very close to $N$, so it
shouldn't change the value of $k$ by very much. We now use $\frac{ed}{N}$ to
estimate $k$:
$$ \phi(N) = \frac{ed - 1}{k}, \qquad k \approx \frac{ed}{N} \text{ (rounded to the nearest integer)} $$
>>> from decimal import Decimal, getcontext
>>> getcontext().prec = 1000
>>> k = round(Decimal(e) * Decimal(d) / Decimal(N))
>>> phi = (Decimal(e) * Decimal(d) - 1) / Decimal(k)
Then we can get $p + q$ through the formula mentioned above:
>>> B = Decimal(N) - phi + 1
>>> C = Decimal(N)
Check to make sure $B$ and $C$ are integers. If they're not, try using a higher
precision in `getcontext().prec`. Now solve the quadratic equation:
>>> p = (B + (B * B - 4 * C).sqrt()) / Decimal(2)
>>> q = (B - (B * B - 4 * C).sqrt()) / Decimal(2)
>>> p * q == N
We've successfully recovered $p$ and $q$ from just $N$, $e$, and $d$!
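The same recovery can be sketched with plain integers, which avoids having to tune the `Decimal` precision. This is an illustrative variant rather than the code used above: `recover_factors` is a hypothetical name, and the rounding trick for $k$ assumes $e$ is small compared to $\sqrt{N}$, as it is for 17 or 65537.

```python
from math import isqrt


def recover_factors(N: int, e: int, d: int) -> tuple:
    """Recover (p, q) from N and one key pair (e, d), assuming e is small."""
    k = (e * d + N // 2) // N        # e*d / N rounded to the nearest integer
    assert (e * d - 1) % k == 0, "estimate of k is off"
    phi = (e * d - 1) // k           # phi(N) = (ed - 1) / k
    s = N - phi + 1                  # p + q
    disc = s * s - 4 * N             # (p - q)^2, discriminant of x^2 - s*x + N
    r = isqrt(disc)
    assert r * r == disc, "discriminant is not a perfect square"
    return (s + r) // 2, (s - r) // 2


# Toy key reusing the small primes from the factoring example.
N = 20717 * 42557
e = 17
d = pow(e, -1, 20716 * 42556)        # modular inverse, Python 3.8+
p, q = recover_factors(N, e, d)
assert p * q == N
```

Working in integers means the "are $B$ and $C$ integers?" check becomes an exact perfect-square test on the discriminant.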
### Blinding
This attack is actually about RSA _signatures_ (which use the opposite keys from
encryption: private for signing and public for verifying), and shows how you can
compute the signature of a message $M$ using the signature of a derived message
$M'$.
Suppose Marvin wants Bob to sign the following message: `"I (Bob) owes Marvin
$100,000 USD"`. Marvin hands this to Bob saying something like, "I'll just need
you to sign this with your private key." Let's generate Bob's private key:
>>> from Crypto.Util.number import bytes_to_long, long_to_bytes
>>> from Crypto.PublicKey import RSA
>>> bob = RSA.generate(2048)
<_RSAobj @0x7f4309521128 n(2048),e,d,p,q,u,private>
>>> M = b"I (Bob) owes Marvin $100,000 USD"
Obviously, Bob, an intellectual, will refuse to sign the message. However,
suppose Marvin now transforms his message into a more innocent looking one. He
does this by turning $M$ into $M' = r^eM \mod N$ where $r$ is an integer that's
coprime to $N$:
>>> from random import randint
>>> N = bob.p * bob.q # this is publicly available knowledge
>>> r = 19
>>> Mp = long_to_bytes((pow(r, bob.e, N) * bytes_to_long(M)) % N)
Now he asks Bob to sign this more... innocent-looking message. Without
questioning, Bob, an intellectual, signs his life away. Let's say he produces a
signature $S'$:
$$ \begin{aligned}
S' &= (M')^d \\
&= (r^e M)^d \\
&= r^{ed} M^d \\
&= r M^d \mod N
\end{aligned} $$
>>> Sp, = bob.sign(Mp, 0)
Now, all Marvin has to do is multiply by the modular inverse of $r$, to obtain
$M^d$, the signature of the original message:
>>> S = (Sp * modinv(r, N)) % N
Sure enough, if you try to verify the "original" signature against the original
message, it checks out.
>>> bob.verify(M, (S,))
Marvin has now successfully tricked Bob into signing his life away.
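The whole blinding trick can also be checked with textbook RSA on plain integers, without any crypto library. This is a toy sketch, not the PyCrypto session above: the key reuses the small primes from the factoring example, and the three-byte message `b"IOU"` is a made-up stand-in chosen so that it fits below $N$.

```python
# Bob's toy key (far too small for real use).
p, q = 20717, 42557
N = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))   # Bob's private exponent (Python 3.8+)

M = int.from_bytes(b"IOU", "big")    # message Bob would refuse to sign
r = 19                               # Marvin's blinding factor, coprime to N

M_blind = (pow(r, e, N) * M) % N     # M' = r^e * M mod N
S_blind = pow(M_blind, d, N)         # Bob signs M' with his private key
S = (S_blind * pow(r, -1, N)) % N    # Marvin unblinds: S = S' * r^-1 mod N

assert S == pow(M, d, N)             # identical to a direct signature on M
assert pow(S, e, N) == M             # and it verifies under Bob's public key
```

Both assertions hold because $r^{ed} \equiv r \pmod{N}$, so multiplying by $r^{-1}$ leaves exactly $M^d$.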
This post is a work in progress.. I'll update it as I add more.

@ -1,90 +0,0 @@
title = "Magic forms with proc macros: Ideas"
date = 2019-02-01
tags = ["computers", "web"]
languages = ["rust"]
Procedural macros (proc macros for short) in Rust are incredible because they allow arbitrary pre-compile source transformation, which leads to endless possibilities (and hazards!). But if we take careful advantage of this feature, we can use it to make clean abstractions for messy boilerplate, especially in the case of web forms. <!--more-->
In fact, proc macros are incredibly pervasive around Rust's ecosystem. For example, using the [`serde`][1] serialization/deserialization crate, you can simply write:
#[derive(Serialize, Deserialize)]
struct Foo {
    bar: String,
}
and code will be generated to serialize and deserialize to a multitude of formats including JSON, YAML, CBOR, etc.
It occurred to me that this feature can also be useful for generating code for rendering and validating forms (as in a place where you fill out info). **wtforms** is one of the nicest Python packages for handling form behavior in web applications, and with the power of proc macros, this functionality can be easily achieved in Rust as well.
In this post I'm going to outline some of the ideas I have for a wtforms-ish library for handling forms in Rust.
## Code generation
Ideally, we should be able to use this library like this:
struct RegisterForm {
    #[validators(email, custom("not_taken"))]
    id: Email,
    #[validators(required, length(4, 12))]
    name: String,
    #[validators(required, length(8, 128))]
    pass: Password,
}
What this would do is add a couple more functions to our form class. Firstly, I'd like to render an HTML version of the above form. Calling something like `RegisterForm::html()` should produce the following HTML (prettified here for convenience):
<input type="email" name="id" />
<input type="text" name="name" />
<input type="password" name="pass" />
<input type="submit" />
If we were to want to customize our form in any way, for example, adding more attributes to the elements, we would just attach that as a separate attribute onto the field:
#[validators(required, length(4, 12))]
#[attrs = "autocomplete=off"]
name: String,
This should generate the following HTML:
<input type="text" name="name" autocomplete=off />
I realize this is probably not very flexible, since you'd really only be able to use this form in a specific context. But in reality, how much do you really lose by redefining that form?
## Validation
You've already seen the `validators` attribute used above. This defines a set of validators that we'd like to verify the form against. Suppose you receive an instance of the form that looks like (in pseudo-y Rust):
let instance = RegisterForm {
    id: Email(""),
    name: "michael",
    pass: Password("pass"),
};
then calling something like `instance.verify()` should run all those validators we've defined on the fields and return a list of errors that go along with each of the fields. For this instance, for example, we should at least get an error that states that the password provided was way too short.
## Other interesting features
- If a form fails during validation, the user is presented with the errors and a chance to retry the form. At this point, the HTML generated should fill in the values for the fields that passed the validation so the user doesn't have to fill it out again. You see this behavior on web forms sometimes.
## Conclusion

@ -1,20 +0,0 @@
title = "Accept server analogy"
date = 2019-03-04
tags = ["computers"]
This is just a stupid analogy I thought of recently, but decided to write about it anyway.
If you think about it, a server waiting for clients is kind of like the host at the front of a restaurant leading guests to tables. The host doesn't actually take orders or serve food; they just stand at the front and wait for new guests to arrive. Then there's another waiter that's specifically assigned to take that table's orders.
When a server binds to, for example, `localhost:3000`, what it's really working with is a file descriptor: `socket` creates the descriptor and `bind` attaches the address to it. This is what's meant by:
int socket(int domain, int type, int protocol);
int bind(int socket, const struct sockaddr *address, socklen_t address_len);
The `listen` library call then marks this socket as one that's open for connections, similar to marking the restaurant staff as a host rather than a waiter.
According to the manpage for `accept`, when a connection-mode socket accepts a connection, it'll "extract the first connection on the queue of pending connections, create a new socket with the same socket type protocol and address family as the specified socket, and allocate a new file descriptor for that socket." This new socket would be the waiter who actually takes your orders.
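To make the host/waiter split concrete, here's a minimal sketch in Python, whose `socket` module wraps the same calls as the C prototypes above. The loopback address, the ephemeral port `0`, and the helper name `accept_one` are my own assumptions for illustration, not part of any real server.

```python
import socket
import threading

def accept_one():
    # The "host": a listening socket bound to an ephemeral loopback port.
    host = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    host.bind(("127.0.0.1", 0))
    host.listen(1)
    port = host.getsockname()[1]

    # A guest arrives: connect from a client socket in another thread.
    def guest():
        with socket.create_connection(("127.0.0.1", port)) as c:
            c.sendall(b"one burrito please")

    t = threading.Thread(target=guest)
    t.start()

    # accept() hands back a *new* socket (the "waiter") for this guest;
    # the host socket keeps its own file descriptor and keeps listening.
    waiter, _addr = host.accept()
    order = waiter.recv(1024)
    fds = (host.fileno(), waiter.fileno())
    t.join()
    waiter.close()
    host.close()
    return fds, order

fds, order = accept_one()
print(fds, order)
```

The two file descriptors printed are different numbers: one belongs to the host that only greets, the other to the waiter that actually talks to the guest.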

@ -1,46 +0,0 @@
title = "Password managers"
date = 2020-04-01
tags = ["computers", "things-that-are-good", "privacy"]
Password managers are programs that store passwords for you. With the number of accounts you keep on the web, you generally don't want to store all of them in your head. If you want to see articles on why you should use a password manager NOW, search "reasons to use a password manager" online and any of the articles you find should explain it. Here I'll add some more commentary on top of the traditional arguments.
<!-- more -->
Don't tick the "Remember master password" box no matter what
How well you remember a password depends on how much you use it. If you open an account, make a password, and stay signed in for a year without ever having to re-login, you'll naturally forget the password. Same deal with password managers; the problem has just been moved another step.
The power of a password manager comes from you continually entering in the same password over and over in order to unlock your other accounts.
Password managers are good for a lot more than passwords
If you're willing to put sensitive passwords into your password manager, it should be a perfect place to put information that you'd want to avoid writing down in plaintext but want to access easily. This might include:
- Backup / recovery codes
- Your bank account number
- Your car's license plate number
- Answers to security questions, which leads into the next point:
Treat your security questions as passwords
Save these in your password manager! "Security" questions are probably the worst idea for security and are more likely to weaken the security of your account than strengthen it. They have multiple fatal flaws (assuming you use security questions truthfully):
- People can find out simple information about you through social engineering (favorite color, mother's maiden name, schools, etc.)
- The answers to these questions aren't likely to change, and some can't be changed at will (in the case of a security problem, for example)
- You probably won't even remember the exact format you typed in the answer, so if there's any fuzzy matching, it means the answers aren't hashed and salted to the same degree as passwords.
Instead, just treat them as another password! Go into your password manager, generate the longest possible random password that fits into the box, and save it. Since you can give a name to the password, there's no worry of forgetting it or losing it, since it'll be stored among the vault of other passwords that you're hopefully using every day.
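If your password manager doesn't have a generator handy, something like the following sketch does the same job. The 32-character length, the alphanumeric alphabet, and the name `random_answer` are assumptions on my part; pick whatever the answer box actually allows.

```python
import secrets
import string

def random_answer(length: int = 32) -> str:
    # Draw every character from a CSPRNG; the `random` module is not meant for secrets.
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

answer = random_answer(32)
print(answer)
```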
Don't trust extensions that fill in your password automatically
Some password managers, like LastPass, have browser extensions that automatically fill in password boxes when you open the page.
**Always turn this off, if possible. Prefer to look up the password and copy it in.**
Once the extension copies the password into the page, it's fair game for any other JavaScript running on the page to grab your password. Not only that, there have been multiple reported vulnerabilities related to the LastPass extension mistakenly copying in a password because it couldn't correctly match the domain of the page to the domain of the password. Additionally, it doesn't work well if you have multiple passwords saved to the page, like if you have security questions saved to the page.

@ -1,101 +0,0 @@
title = "Tracking links in email"
date = 2021-06-17
tags = ["email", "computers", "things-that-are-bad", "privacy"]
You probably get emails every day, and spend a lot of time reading them. And
whenever someone performs an action or does something in vast quantities, you
_bet_ the data giants have figured out a way to capitalize on it. For many
years consumer privacy has basically gone unnoticed, and invasive tracking has
grown [viral][1]. <!--more-->
Arguably, if you are someone who runs a business off of writing periodic
newsletters that are distributed via email, you might want some statistics on
how your newsletter is doing. Traditionally, this is achieved **actively**
through some kind of survey with some kind of incentive, like "tell us how
we're doing for a chance to win a water bottle".
Now emails are typically imbued with **passive** trackers in the form of
[tracking pixels][3] (which inform the sender when the recipient opens the
email) and [tracking links][4] (which inform the sender when AND what links
recipients click). Tracking pixels are usually less relevant these days since
many web-based email clients will ask before loading images, and clients run by
mail servers with an enormous number of users like Gmail ([and soon iOS][5])
may proxy the pixels ahead of time so the senders only see the IPs and metadata
of the server.
Tracking links, on the other hand, have become much more invasive, to the point
where it's impossible to avoid being tracked. You see it all over the web:
whenever you open a link, there's almost always some kind of `?ref=xxxxx` code
stuck onto the end that identifies _your_ particular instance of it. This way,
if you share the link with a friend, they use the same code, and your
connection to your friend is traced by the website owner.
> If this creeps you out, consider using a browser extension like
> [ClearURLs][6], which recognizes these URL parameters that do nothing but
> feed information to the website owners and removes them for you.
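
To give a rough idea of what stripping these parameters involves, here's a small sketch using only Python's standard library. The blocklist, the function name `strip_tracking`, and the example URL are made up for illustration; this is not how ClearURLs itself is implemented, and real tools ship much longer, curated rule sets.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical blocklist of parameter names that only exist for tracking.
TRACKING_PARAMS = {"ref", "utm_source", "utm_medium", "utm_campaign"}

def strip_tracking(url: str) -> str:
    parts = urlsplit(url)
    # Keep every query parameter whose name is not on the blocklist.
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in TRACKING_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(strip_tracking("https://example.com/post?id=7&ref=abc123"))
```

Parameters that the page actually needs (like `id` here) survive, while the referral code is dropped.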
But email tracking links are even worse: they abuse redirects to obfuscate the
original URL entirely. For instance, you'd get links in your email that look
Where does it go? Wikipedia? Piratebay? There's only one way to find out: by
making a request to that server, giving up information about the time, place,
client, OS, and all sorts of other information that greedy data collection
companies are waiting to snatch up.
Of course, regular users notice nothing: these links are usually hidden behind
buttons, text, or even the original URL itself. Once they click it, the website
silently logs all the data it receives about the user, and then redirects the
user to the original destination.
The senders usually aren't at fault either. Sending email is tricky, with all
the infrastructure set up to block out spam, so the majority of people who send
bulk mail (newsletters, websites that need to confirm your email, etc.) all go
through companies that handle this for them. Of course, being the middlemen who
actually get the mail out the door, they're free to replace the links with
whatever they want, and many of these companies advertise it as a feature to
get more "insight" into how your emails are doing.
Even worse, the original senders aren't the only ones getting the info, either.
These middlemen could hold on to the data and there's no saying they can't use
it for other purposes or sell it.
Unfortunately, sending email isn't really going to get any easier, partly
because of the way email fundamentally works: without all of the security
infrastructure in place, running your own email server could easily lead to
abuse. Most people (justifiably) would not go through all that effort.
Another possible avenue of thinking is to do what large mail companies did to
oppose tracking pixels, where they would act as a mass-proxy for the links,
opening them when they receive it, and transparently replace the unfiltered
link back into the email so the user's device and location aren't revealed. But
this raises its own issues: for example, what if the act of opening the
original link performs some kind of action (e.g. click to subscribe, click to
register, etc.)? Also, this solution only works for email that is not
end-to-end encrypted. For end-to-end encrypted mail providers, there is no way
to do this.
The only real solution here is regulation via either advancement in
privacy-related open standards or legislation. It's clear that without any kind
of regulation, companies will continue to act in the interests of profit rather
than the protection of their customers.
> Devil's advocate afterthought: should this problem even be solved? Maybe
> there's a benefit to this whole tracking thing. My opinion on this is if you
> _really_ want to develop a community of readers, offer an easy way to give
> feedback (or even go back to the incentive surveys), and if people aren't
> giving feedback, then that itself is a reflection of the state of your
> readers.


@ -1,410 +0,0 @@
title = "Sending https requests from scratch"
date = 2021-07-05
draft = true
toc = true
tags = ["computers", "web", "crypto"]
languages = ["python"]
Every now and then, I return to this age-old question of _exactly_ how hard would it be to write a web browser from scratch? I hear some interviewers ask their candidates to describe the process your browser takes to actually put a webpage on your screen, but no doubt that's a simplification of a process from 20 years ago. <!--more-->
Today, the specifications describing your browser's behavior [far exceed 100 million words][4], and there's no sign of slowing. We are no longer just opening TCP sockets and sending `GET /path HTTP/1.0`. That's why I decided to take some time and do some digging to see exactly how much it would take to send an HTTPS request from scratch, just like what the browser does, using as little existing tooling as I can.
> **Disclaimer:** This is an experiment for demonstration purposes. Do **NOT** use this code for any real software.
I'll be using Python for this since it's just for fun, the code will be pretty concise, and I don't have to write boilerplate outside of this post in order to make the code in it work. I'll try to stick to only using the Python 3 standard library as well, so not bringing in any external cryptography algorithms (the standard library provides `hashlib` tho). The downside here is the struct serialization and deserialization (using the [Python struct library][5]) gets a bit messy if you don't know how it works, but that information is all in the RFC anyway.
**&#x1f4a1; This is a literate document.** I wrote a [small utility][3] to extract the code blocks out of markdown files, and it should produce a working example for this file. If you have the utility, then running the following should get you a copy of all the Python code extracted from this blog post:
curl -o -s {{< docUrl >}}
markout -l py >
Otherwise, you can follow along and extract the code yourself as you read.
With that out of the way, let's jump in!
## URL Parsing
This part is basically just a chore. URLs are defined in [RFC 3986][1], but we'll cheat a bit and just get the important parts we want for sending a request. First, I'll write out a regex for actually matching the parts we want:
import re
URL_PAT = re.compile(r"""
(?P<scheme>[A-Za-z]+) # scheme (http, https,...)
:// # divider
(?P<host>[A-Za-z0-9\-\.]+) # hostname (letters, digits, dashes, dots)
(:(?P<port>[0-9]+))? # port
(/ # divider
(?P<path>[^?]*))? # path
""", flags = re.VERBOSE)
We'll say if a string doesn't match this regex, then we won't count it as a URL. The rest of this part is just writing some glue code turning this regex into a dictionary:
def parse_url(s: str):
    m = URL_PAT.match(s)
    if m is None:
        raise ValueError("bad url")
    return m.groupdict()
u = parse_url("")
# {'scheme': 'https', 'host': '', 'port': None, 'path': None}
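As a sanity check on the hand-rolled regex, the standard library's `urllib.parse.urlsplit` pulls out the same pieces. This is only a comparison sketch; the URL below is a made-up example, and the `fields` dictionary is my own naming. Note two differences from the regex version: the port comes back as an `int`, and the path keeps its leading slash.

```python
from urllib.parse import urlsplit

parts = urlsplit("https://example.com:8443/some/path?q=1")
fields = {
    "scheme": parts.scheme,
    "host": parts.hostname,
    "port": parts.port,   # int, or None when the URL has no explicit port
    "path": parts.path,   # includes the leading slash
}
print(fields)
```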
## TLS
OK, now that we know where we're going to send the request, we should actually open a socket and talk to it. But before we want to send any data, we should _encrypt_ our communications. TLS is a protocol that conducts a brief handshake, then creates a tunnel where we can send data freely and it will be transparently encrypted before it goes over the wire. I haven't seen many example implementations of TLS out there (probably for a good reason), but without looking at actual code that works, it's hard to say I fully understand the protocol. So here I'll implement TLS 1.3 (defined in [RFC 8446][2]).
- Worth noting here that TLS uses big-endian format for numbers.
> **Second disclaimer:** hope I made it clear above but **THIS IS A TOY PROGRAM**. I'm about to roll my own crypto so do _not_ shove any of this code directly into a program if you value your safety. If you do plan on using this as a reference please get your code audited.
### Record Layer
TLS messages are sent in records, on top of TCP packets. This middle layer has its own header, described in section 5.1 of the RFC.
Not a big deal, it just means we'll want a helper function to actually send our packets through this record over the socket. The implementation is short, and looks pretty much exactly like the definition:
import struct
def wrap_tls_record(ctype, rdata):
    data = bytes()
    data += struct.pack(">B", ctype) # content type encoded as a single byte
    data += b"\x03\x03" # legacy_record_version, should just be 0x0303
    data += struct.pack(">H", len(rdata)) # length of the data
    data += rdata # finally, the record data itself
    return data
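We'll eventually need the reverse direction too, for reading what the server sends back. Here's a sketch of what that could look like, assuming the same 5-byte header layout; the name `unwrap_tls_record` and the choice to return the leftover bytes are mine, not something the RFC prescribes.

```python
import struct

def unwrap_tls_record(buf: bytes):
    # 5-byte header: content type (1), legacy version (2), length (2), all big-endian.
    if len(buf) < 5:
        raise ValueError("need at least 5 bytes for a record header")
    ctype, version, length = struct.unpack(">BHH", buf[:5])
    if len(buf) < 5 + length:
        raise ValueError("record body is truncated")
    rdata = buf[5:5 + length]
    rest = buf[5 + length:]  # whatever follows, possibly the next record
    return ctype, version, rdata, rest

record = bytes([22]) + b"\x03\x03" + struct.pack(">H", 3) + b"abc" + b"tail"
print(unwrap_tls_record(record))
```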
### Handshake Layer
But before we can send the first message, we also have to write some glue code for the handshake layer! This layer describes all handshake messages, and can be found in appendix B.3 of the RFC.
Again, not too much code, just needs to be there. The annoying part of this is that the length is actually described with a `uint24`, which means it takes 3 bytes. Python's `struct` module doesn't actually have anything for this, so I'm just going to use the 4-byte unsigned option and chop off the first byte (remember, we are using big-endian encoding, so the MSB is the extra one).
import struct
def wrap_handshake(htype, hdata):
    data = bytes()
    data += struct.pack(">B", htype) # handshake type encoded as a byte
    data += struct.pack(">I", len(hdata))[1:] # length, encoded as 3 bytes!
    data += hdata # and then the handshake data
    return data
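An alternative to the pack-and-chop trick is `int.to_bytes`, which can produce the 3 bytes directly. The sketch below (the helper name `uint24` is mine) shows the two approaches agree for values that fit in 24 bits.

```python
import struct

def uint24(n: int) -> bytes:
    # int.to_bytes raises OverflowError if n does not fit in 3 bytes.
    return n.to_bytes(3, "big")

# Same bytes as packing 4 bytes big-endian and dropping the most significant one.
assert uint24(0x012345) == struct.pack(">I", 0x012345)[1:]
print(uint24(0x012345).hex())
```

One difference worth knowing: for a length of 2^24 or more, the slicing approach silently drops the top byte, while `to_bytes` raises an error.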
### Client Hello
TLS starts with the client sending a `ClientHello` message (defined in section 4.1.2 of the RFC), which basically starts the handshake off with some basic details about what the client can do. Now's probably a good time to decide on some basics, like which ciphers we'll be using to communicate.
#### Cipher Suite
In reality, encryption is mostly done at the hardware level, so browsers choose this based on what algorithms your hardware is fastest at. I pointed Firefox at Wikipedia and peeked into the connection details and it looks like I'm using AES-256-GCM with SHA-384, so I'll go with that. Let's see what byte sequence corresponds to these ciphers.
This specification defines the following cipher suites for use with
TLS 1.3.
| Description | Value |
| TLS_AES_128_GCM_SHA256 | {0x13,0x01} |
| TLS_AES_256_GCM_SHA384 | {0x13,0x02} | <-- this one
| TLS_CHACHA20_POLY1305_SHA256 | {0x13,0x03} |
| TLS_AES_128_CCM_SHA256 | {0x13,0x04} |
| TLS_AES_128_CCM_8_SHA256 | {0x13,0x05} |
Cool, this means the two numbers `0x13` and `0x02` correspond to the cipher suite we want to use.
#### Extensions
Ridiculously enough, it seems that TLS 1.3 keeps a lot of pre-1.3 fields in there, renaming them `legacy_`, and then putting new features in extensions. This may help forward compatibility, but also means that some extensions end up not being extensions at all, but required components of the protocol. (I suppose this helps them phase out certain headers in later updates without changing the general layout)
The extensions we'll need to support are listed in section 9.2 of the RFC. We'll only be sending the ones required during a `ClientHello`:
- supported_versions (required)
- signature_algorithms (required)
- key_share (required)
- server_name (required)
- application_layer_protocol_negotiation
What this means for our implementation is that for each of these we'll have to send a bit of information in the `ClientHello`. That's not too big of a deal; let's go through them one-by-one.
(Before I start, I have to warn you; there are a LOT of length-wrappers. Most of these seem unnecessary since we're using Python, but I expect these were designed with generalization in mind)
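Since the same wrap-it-in-a-length pattern shows up over and over, it could be factored into a pair of helpers like the sketch below. The names `vec` and `extension` are my own invention, not anything from the RFC; the functions in the rest of this post spell the packing out by hand instead.

```python
import struct

def vec(data: bytes, prefix_bytes: int) -> bytes:
    # Prepend a big-endian length that is `prefix_bytes` bytes wide.
    return len(data).to_bytes(prefix_bytes, "big") + data

def extension(ext_type: int, body: bytes) -> bytes:
    # Every extension is: type (2 bytes), then a 2-byte-length-prefixed body.
    return struct.pack(">H", ext_type) + vec(body, 2)

# Extension 43 carrying a 1-byte-length-prefixed list with one 2-byte entry.
print(extension(43, vec(b"\x03\x04", 1)).hex())
```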
Supported versions is just what TLS 1.3 replaced the version header with; rather than saying up front that I want TLS 1.2, we have a general TLS framework for specifying extensions and then if I want to let the server know I can speak both TLS 1.2 and TLS 1.3, I'd put both versions into this extension.
def ext_supported_versions():
    versions = [b"\x03\x04"] # code number for TLS 1.3
    versions = b"".join(versions)
    ext = struct.pack(">B", len(versions)) + versions # one length byte for the whole list
    return (struct.pack(">H", 43) # code number for supported_versions
        + struct.pack(">H", len(ext))
        + ext)
In TLS, clients have a pre-defined set of root authorities that they trust, distributed by some trusted party like the OS or browser developers. These root authorities can then sign certificates for individual sites to prove to clients that they hold ownership over that domain. Clients can verify this proof cryptographically, using one of the signature algorithms we're going to negotiate.
For this I looked at some of the ciphers my browser supports, and just picked one that seems to have wide support: `ecdsa_secp256r1_sha256`.
def ext_signature_algorithms():
    algos = [b"\x04\x03"] # ecdsa_secp256r1_sha256
    sig_algos = b"".join(algos)
    ext = struct.pack(">H", len(sig_algos)) + sig_algos # yeah...
    return (struct.pack(">H", 13) # code number for signature_algorithms
        + struct.pack(">H", len(ext))
        + ext)
Key negotiation is an important step, letting us establish a shared secret between the client and server without explicitly sending it over the network. Typically for this step, a form of Diffie-Hellman Exchange is performed, but pre-sharing a symmetric key is also used.
Here we'll need to step a bit into the crypto. I'm going to choose elliptic-curve Diffie-Hellman ephemeral (ECDHE), which uses the elliptic curve operation to obscure keys as opposed to the original Diffie-Hellman which uses modular exponentiation. Cloudflare's blog has a [good introduction to elliptic curves][6].
What this means for us is we need to pick parameters for initiating this exchange. First we'll pick a named group in the `supported_groups` extension, then we'll have to send the parameters for that particular group in the `key_share` extension. I'm going to pick secp256r1, the same curve as the one in the signature algorithm above, so I only need to implement one curve.
def ext_supported_groups():
    groups = [23] # secp256r1
    groups = b"".join(map(lambda g: struct.pack(">H", g), groups))
    ext = struct.pack(">H", len(groups)) + groups # yeah...
    return (struct.pack(">H", 10) # code number for supported_groups
        + struct.pack(">H", len(ext))
        + ext)
def ext_key_share():
    # hardcoding a fixed value here for now, we'll generate it later!
    x = b"\xf5\xddoi\xc8\x8c/#\x99\x8a\xaef\x8aWx\xacW,\xbad\x8d\x04\xac\x10\x05\xc2\x8f\x9bJ\x18\xf8."
    y = b"\xfc}\x7f\xe0\x89\xb2YF\x0b\xc6\xb7\x00@\x04\xf6\x17Vl)V+\x18\xae\x157:o\xcc\x91\xf9\xaa#"
    kex = b"\x04" + x + y # 0x04 marks an uncompressed point
    key_share = struct.pack(">H", 23) + struct.pack(">H", len(kex)) + kex
    ext = struct.pack(">H", len(key_share)) + key_share
    return (struct.pack(">H", 51) # code number for key_share
        + struct.pack(">H", len(ext))
        + ext)
Server name just lets the client tell the server what hostname it's expecting to connect to. The actual struct definitions here seem a bit over-the-top, but it's all in the name of future-proofing, right...?
def ext_server_name(hostname: str):
    name = hostname.encode("utf-8")
    sname = b"\x00" # code for hostname
    sname += struct.pack(">H", len(name)) # length of hostname in bytes
    sname += name
    ext = struct.pack(">H", len(sname)) + sname # yeah...
    return (struct.pack(">H", 0) # code number for server_name
        + struct.pack(">H", len(ext))
        + ext)
Application layer protocol negotiation (ALPN) isn't technically required, but we'll put it there to ask the server to speak HTTP2 with us. The extension contents are just the list of protocol names concatenated together, each prefixed with its length.
def ext_alpn():
    protocols = [b"h2"] # http2, could also add http/1.1
    alpn = b"".join(map(lambda p: struct.pack(">B", len(p)) + p, protocols))
    ext = struct.pack(">H", len(alpn)) + alpn # yeah...
    return (struct.pack(">H", 16) # code number for alpn
        + struct.pack(">H", len(ext))
        + ext)
Finally, let's combine all the functions above.
def client_hello_extensions(hostname: str):
return b"".join([
Extensions is the last piece of information we need to create the entire `ClientHello` message. Soon we'll be able to get the server to respond to us!
#### Putting the ClientHello message together
import os
def client_hello(hostname: str):
    data = bytes()
    data += struct.pack(">H", 0x0303) # legacy version
    data += os.urandom(32) # 32 bytes nonce generated from /dev/urandom
    data += b"\x00" # won't be using legacy_session_id, so send a zero
    data += (b"\x00\x02" # cipher_suites is 2 bytes long, i.e. one suite
        + b"\x13\x02") # the number for TLS_AES_256_GCM_SHA384
    data += b"\x01\x00" # legacy_compression_methods: one method, null
    ext = client_hello_extensions(hostname)
    data += struct.pack(">H", len(ext))
    data += ext
    return data
Let's send something to a server and see if that's what we want!
import socket
def test_client_hello():
    hostname = ""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((hostname, 443))
    # send client hello
    ch = wrap_tls_record(22, wrap_handshake(1, client_hello(hostname)))
    s.sendall(ch)
    return s.recv(1024)
server_response = test_client_hello()
# b'\x16\x03\x03...'
assert server_response[0] == 0x16
If all went well, you should've received something with `\x16` as the first byte. That means the server sent a record with the content type `handshake(22)`. If you got `\x15`, it means you got an alert. In the next section we'll see how to interpret the server's response.
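If you do get an alert back, it helps to know which one. Here's a rough sketch for turning the first record into something readable. The helper name `describe_record` is mine, and the table only covers a handful of the alert descriptions from section 6 of the RFC, so treat it as a debugging aid rather than a complete decoder.

```python
import struct

# A few alert descriptions from RFC 8446 section 6; not the full list.
ALERTS = {0: "close_notify", 10: "unexpected_message", 40: "handshake_failure",
          47: "illegal_parameter", 50: "decode_error", 70: "protocol_version",
          109: "missing_extension"}

def describe_record(buf: bytes) -> str:
    if len(buf) < 5:
        return "incomplete record header"
    ctype, _version, length = struct.unpack(">BHH", buf[:5])
    if ctype == 21 and len(buf) >= 7:
        # An alert body is two bytes: level, then description.
        level, desc = buf[5], buf[6]
        kind = {1: "warning", 2: "fatal"}.get(level, "unknown")
        return f"alert: {kind} {ALERTS.get(desc, desc)}"
    if ctype == 22:
        return f"handshake record, {length} bytes"
    return f"content type {ctype}"

print(describe_record(b"\x15\x03\x03\x00\x02\x02\x32"))
```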
### Server Hello
The server hello is the response, where it tells us which ciphers and algorithms it chose, out of the ones we suggested. The code will look backwards from what we did above; instead of encoding a bunch of values, we'll read from what the server gives to us and interpret it instead.
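As a rough sketch of what that reading could look like, here's a parser for the `ServerHello` body following the field layout in section 4.1.3 of the RFC. It assumes the record and handshake headers have already been stripped off, does no validation at all, and the function name and the dictionary it returns are my own choices. The sample bytes at the bottom are hand-built, not a real server response.

```python
import struct

def parse_server_hello(body: bytes):
    pos = 0
    (legacy_version,) = struct.unpack_from(">H", body, pos)
    pos += 2
    random = body[pos:pos + 32]
    pos += 32
    sid_len = body[pos]  # legacy_session_id_echo has a 1-byte length
    pos += 1
    session_id = body[pos:pos + sid_len]
    pos += sid_len
    cipher_suite = body[pos:pos + 2]
    pos += 2
    pos += 1  # skip legacy_compression_method
    (ext_total,) = struct.unpack_from(">H", body, pos)
    pos += 2
    extensions = {}
    end = pos + ext_total
    while pos < end:
        # Each extension: type (2 bytes), length (2 bytes), then the data.
        ext_type, ext_len = struct.unpack_from(">HH", body, pos)
        pos += 4
        extensions[ext_type] = body[pos:pos + ext_len]
        pos += ext_len
    return {"version": legacy_version, "random": random, "session_id": session_id,
            "cipher_suite": cipher_suite, "extensions": extensions}

# Hand-built sample: TLS_AES_256_GCM_SHA384 plus supported_versions (43) = TLS 1.3.
sample = (b"\x03\x03" + bytes(32) + b"\x00" + b"\x13\x02" + b"\x00"
          + struct.pack(">H", 6) + struct.pack(">HH", 43, 2) + b"\x03\x04")
print(parse_server_hello(sample))
```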
#### Change Cipher Spec
Along with the Server Hello we'll also get a Change Cipher Spec record. According to the RFC, this is only in there for compatibility purposes, so we can safely ignore the one sent to us, but we'll also have to send a dummy Change Cipher Spec record as well.
def change_cipher_spec():
    return (b"\x14" # code number for change cipher spec
        + b"\x03\x03" # legacy protocol version
        + struct.pack(">H", 1) # length of change cipher spec message
        + b"\x01")
Piece of cake.
### Crypto
> This is the deep-dive into the cryptographic portions of the protocol. If you're not too interested by this part, just continue on to the HTTP section.
Let's walk through each of the ciphers and algorithms we're going to need one more time:
- `ecdsa_secp256r1_sha256`
+ ECDSA is the elliptic-curve signature algorithm; basically it can sign some information using the elliptic-curve private key, and anyone can verify using the corresponding public key that the person who owns the key has created that signature.
+ secp256r1 just gives the name of a set of established parameters for a curve.
+ SHA256 is a hashing algorithm, which creates a fixed-size fingerprint of a piece of information that is effectively unique and can't be reversed back to the original. Python's `hashlib` library provides this function for us, so we don't have to implement it ourselves.
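For reference, this is all it takes to get a SHA256 digest out of `hashlib`. The input string is an arbitrary example.

```python
import hashlib

# The digest is always 32 bytes (256 bits), no matter how long the input is.
digest = hashlib.sha256(b"hello tls").digest()
print(len(digest), digest.hex())
```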
#### Naive Elliptic Curve Implementation
#### secp256r1
The curve is defined using the equation `y^2 = x^3 + ax + b mod p`.
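To get a feel for the group operations before plugging in 256-bit numbers, here's a naive sketch on a tiny textbook curve, `y^2 = x^3 + 2x + 2 mod 17`, whose base point `(5, 1)` has order 19. These are toy parameters chosen so the arithmetic can be checked by hand; they are not the secp256r1 parameters. Points here are plain tuples, with `None` standing in for the point at infinity, which is a simpler representation than the `secp256r1` object with `.x` attributes that the code below assumes.

```python
# Toy curve y^2 = x^3 + 2x + 2 (mod 17); base point G = (5, 1) has order 19.
P, A, B = 17, 2, 2
G = (5, 1)

def add(p1, p2):
    # None represents the point at infinity (the group identity).
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # p2 is the negation of p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P, -1, P) % P  # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord slope
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)

def mul(point, k):
    # Double-and-add: walk the bits of k from least to most significant.
    result = None
    while k > 0:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

print(mul(G, 2), mul(G, 19))
```

Multiplying the base point by its order lands on the point at infinity, which is why private keys and nonces have to stay between 1 and n-1.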
import secrets
def ecdsa_keypair():
    # the private key d must be drawn uniformly from [1, n-1]
    d = secrets.randbelow(secp256r1.n - 1) + 1
    Q = secp256r1.mul(secp256r1.G, d)
    return (d, Q)
(d1, Q1) = ecdsa_keypair()
print("gen", d1, Q1)
def ecdsa_sign(d, z):
    while True:
        # generate a number k between 1 and n-1
        k = secrets.randbelow(secp256r1.n - 1) + 1
        p = secp256r1.mul(secp256r1.G, k)
        r = p.x % secp256r1.n
        if r == 0: continue
        s = (pow(k, -1, secp256r1.n) * (z + r * d)) % secp256r1.n
        if s == 0: continue
        return (r, s)
(r1, s1) = ecdsa_sign(d1, 12345)
print("sign", r1, s1)
def ecdsa_verify(r, s, Q, z):
    # both signature halves must lie in [1, n-1]
    if not (1 <= r < secp256r1.n and 1 <= s < secp256r1.n):
        return False
    sinv = pow(s, -1, secp256r1.n)
    u1 = (z * sinv) % secp256r1.n
    u2 = (r * sinv) % secp256r1.n
    p = secp256r1.add(secp256r1.mul(secp256r1.G, u1), secp256r1.mul(Q, u2))
    return r == p.x % secp256r1.n
res = ecdsa_verify(r1, s1, Q1, 12345)
print("res", res)
### Encrypted tunnel
Now we should be ready to communicate with the server through our encrypted tunnel. But we forgot to keep around our key negotiation parameters! How will we encrypt our communication? Let's go back and update these functions to let us keep the parameters, using the crypto functions we just defined.
The key sharing function:
def ext_key_share(Q):
    # uncompressed point: 0x04, then x and y as 32-byte big-endian integers
    kex = b"\x04" + Q.x.to_bytes(32, "big") + Q.y.to_bytes(32, "big")
    key_share = struct.pack(">H", 23) + struct.pack(">H", len(kex)) + kex
    ext = struct.pack(">H", len(key_share)) + key_share
    return (struct.pack(">H", 51) # code number for key_share
        + struct.pack(">H", len(ext))
        + ext)
Finally, the new `client_hello_extensions`:
def client_hello_extensions(hostname: str):
d, Q = ecdsa_keypair()
data = b"".join([
return (data, d, Q)
## HTTP 2
## `request`-like API
## Conclusion
What did we learn? Don't do this shit yourself, it's not worth it. We'll probably be on HTTP3 within the next year. Just import `requests` and be done with it.

@ -1,212 +0,0 @@
title = "End-to-end encryption is useless without client freedom"
date = 2021-10-31
tags = ["computers", "privacy"]
Today, many companies claim to provide "end-to-end encryption" of user data,
whether it be text messages, saved pictures, or important documents. But what
does this actually mean for your data? I'll explain what "non-end-to-end"
encryption is, why end-to-end encryption is important, and also when it might
be absolutely meaningless.<!--more-->
> If you just want to read about end-to-end encryption, click [here][1].
> Otherwise, I'll start the story all the way back at how computers talk to
> each other.
A game of telephone in a noisy room
Computer networks essentially operate like a bunch of people yelling at each
other at a public gathering, where everyone is kind of hearing everyone else's
messages, but only really paying attention to ones addressed to them. Let's say
I wanted to grab some Chipotle with my roommate. I'd yell "HEY NATHAN, WANNA GET
CHIPOTLE?" over this public network, where Nathan would see his name, identify
it as a message that is intended for him, and then reply accordingly. Notably,
everyone _else_ listening to the network also hears this, and knows that I'm
itching to get some Mexican food.
That's not even the worst part, because well... how does your computer connect
to the internet? Your router hears your computer's message, and passes it
through a series of middlemen, who all perform this broadcasting ritual through
some local network, until it gets to wherever your computer wanted to talk to in
the first place. But in order for the middlemen to pass on the message, they'd
have to hear the message, so now my lunch has become a public gathering known to
everyone who's heard or passed on the message.
Encryption saves the day
That's where **encryption** comes in. Encryption lets me change the message to
something that the middlemen and everyone else listening on the network can't
understand, but the person that I actually wanted to send the message can turn
it back into the original. This way, I can be sure no one except Nathan got the
memo of where we were grabbing lunch.[^3]
So the way encryption's being used here is known as _transport_ encryption,
since I'm _sending_ a message somewhere. Transport encryption is standard
practice now through a technology called **transport-layer security**, or TLS,
which is used by almost everything that talks to the internet, your browser,
your email client, your phone. If it's not using TLS, it should be considered
If you're thinking ahead, you might be thinking that the other place encryption
can be used is **encryption at rest**. This is for documents and pictures that
need to sit somewhere in storage for a while but shouldn't be visible to
everyone. Many businesses require that their employees' laptops use _full-disk_
encryption, so their data doesn't get compromised.
When you put these together, your data is actually pretty safe from prying
hands. If I put some tax documents on Google Drive, it'll use _transport_
encryption to make sure no one steals my identity while I'm sending it, and
encryption _at rest_ to make sure someone breaking into Google won't be able to
just pull the hard drive out and read the files off it.
Two halves don't equal a whole
It turns out just putting together these two types of encryption isn't enough.
There's someone else we haven't protected ourselves against in this case, which
is the party responsible for decrypting the transported data and then
re-encrypting it at rest. Google can read all the documents I upload to Drive
after decrypting it from transit and before encrypting it to disk. Facebook can
read all the messages I send to my friends after decrypting it from transit and
before re-encrypting it to send to my friends.
And this is a much smaller problem than it was before! Companies usually
have privacy policies to keep user data from being used in ways users don't
expect, and many industries have laws like [HIPAA][hipaa] and [FERPA][ferpa] to
make sure the people handling your data don't leak it.
But we don't have to just _trust_ them on that, because we already _know_ how
to protect data from middlemen who are simply taking a message and sending it
somewhere else unchanged, like the ISPs from our networks. We just need _more_
**End-to-end encryption** is just encrypting the data so that the only parties
able to read it are the people it was intended for. Password manager services
like 1Password and Bitwarden use end-to-end encryption so that they're not
decrypting your passwords when you store them online; they're just storing the
encrypted data as-is, and then handing it back to your device, which then
decrypts it offline. [Signal][signal] famously provides end-to-end
encrypted chat, so that no one, not even the government[^1], will be able to
read the messages you send if they're not the intended recipient.
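
As a rough sketch of that store-it-as-is flow, the Python below derives a key from a master password on the device, encrypts there, and hands the "service" only an opaque blob. This is not how 1Password or Bitwarden are actually implemented. The cipher is a toy keystream built from HMAC-SHA256 with no integrity check, used only because it needs nothing beyond the standard library; real clients use audited authenticated ciphers. Every name here (`derive_key`, `server_storage`, the password) is made up for illustration.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16


def derive_key(master_password: str, salt: bytes) -> bytes:
    # Stretch the master password into a 32-byte key. This runs only on the device.
    return hashlib.pbkdf2_hmac("sha256", master_password.encode("utf-8"), salt, 200_000)


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy stream cipher: HMAC-SHA256 over nonce + counter, repeated until long enough.
    blocks = []
    counter = 0
    while 32 * len(blocks) < length:
        blocks.append(hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest())
        counter += 1
    return b"".join(blocks)[:length]


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(NONCE_LEN)  # fresh per message so keystreams never repeat
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, ciphertext = blob[:NONCE_LEN], blob[NONCE_LEN:]
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


# On my device: the key and the plaintext never leave it.
salt = secrets.token_bytes(16)
key = derive_key("correct horse battery staple", salt)
secret = b"bank password: hunter2"

# On the service: it only ever receives, stores, and returns this blob.
server_storage = {"vault": encrypt(key, secret)}

# Back on my device: decrypt offline with the locally derived key.
assert decrypt(key, server_storage["vault"]) == secret
```

The point of the design is where the key lives: the service never sees `key` or `secret`, so nothing it stores is readable without the device that holds the master password.
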
## It's still not enough {#not-enough}
End-to-end encryption seems like it should be the end of the story, but if
there's one thing that can undermine the encryption, it's the program that's
actually performing the encryption. Cryptographic operations are usually handled
by clients, but unless you want to sit there adding points on an elliptic curve
in a finite field, that client is your device or your browser, not you.
The big problem here is this: how do you know your device is actually performing
the encryption? How do you know the apps on your phone are only sending the data
they need to send, and not a lot more? Traditionally, independent researchers or
bounty hunters may reverse-engineer client software and discover that it
doesn't quite operate as advertised, but we can't rely solely on people
from Reddit with too much time on their hands to uphold security.
Imagine if Google Drive was actually a physical vault service and the website
was just a person you would hand your valuables to for safekeeping. They could say
"we're keeping this in military-grade security," but unless you watched what
they did, how do you know they didn't cheap out on you and just shove it under
the mattress where hackers breaking in could just steal everything?
The same applies to Apple's recent child protection system. Their [white
paper][csam] goes into painstaking detail about how photos are protected
by "multi-layer" encryption before they can be decrypted by Apple. But
typical users are not allowed to pick apart their iPhone to make sure it's
encrypting everything correctly, or that the perceptual hashing algorithm it
uses to filter pictures isn't just trivially flagging everything for manual
WhatsApp messages are available unencrypted to the running application so that
it can keep a local database of them. Additionally, this database can be backed
up to iCloud, and according to [WhatsApp themselves][whatsapp], that data is
stored unencrypted, which means that it may benefit from _transport_ security
and _encryption at rest_ independently, but ultimately the people moving data
around are still able to read it.
I've also seen discussion of undermining end-to-end encryption in a [ghost
proposal][ghost], a method that abuses multi-party encryption to add in a
"ghost" listener, which can be the company or the government or anyone else
that the vendor chooses. In theory, this backdoor could be prevented by an
open-sourced client that properly checks each recipient to make sure it's the
expected person before encrypting the message and sending it.
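
A minimal sketch of what such a recipient check might look like, in Python. It assumes the client keeps fingerprints of keys the user verified out of band and compares them against whatever recipient list the server supplies. The function names and the placeholder key bytes are hypothetical and do not come from any real messaging client.

```python
import hashlib


def fingerprint(public_key: bytes) -> str:
    """Short, comparable identifier for a public key."""
    return hashlib.sha256(public_key).hexdigest()


def unexpected_recipients(server_recipients: dict[str, bytes],
                          verified: dict[str, str]) -> list[str]:
    """Names whose key is unknown to us or differs from the one we verified."""
    return [
        name
        for name, key in server_recipients.items()
        if verified.get(name) != fingerprint(key)
    ]


# Placeholder bytes standing in for real public keys.
nathan_key = b"nathan-public-key-bytes"
verified = {"nathan": fingerprint(nathan_key)}  # checked in person, say

honest_list = {"nathan": nathan_key}
tampered_list = {"nathan": nathan_key, "ghost": b"ghost-public-key-bytes"}

assert unexpected_recipients(honest_list, verified) == []
assert unexpected_recipients(tampered_list, verified) == ["ghost"]
```

A client built this way would refuse to encrypt, or at least warn loudly, whenever the returned list is non-empty. That only helps if users can confirm the client really runs this check, which is the argument for it being open source.
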
Given that end-to-end encryption solely exists because trusting companies that
run services is insufficient, it's safe to say that trusting companies to make
client software that acts in the interest of their users is just as useless as
trusting companies to make services that act in the interest of their users.
## What can I do?
Although inconvenient, trusting different vendors for different pieces of this
technological assembly line is the best way to prevent it from being abused.
Many programs use **open protocols**, communication schemes that are agreed
upon and freely available to everyone[^2]. Then, independent parties develop
and maintain lots of different programs that all speak the same protocol, so if
you don't trust a particular service's app to encrypt its data properly, you
can just choose to use a different one by someone who you
trust more.
**Email** is a famous case of this: if I sign up for an email account with
Outlook, I don't have to use a proprietary Outlook client. I _could_ if I
wanted, and I imagine that there may be some features that Microsoft has added
specifically to the Outlook website and apps, but since they claim to conform to
the _open_ email specifications, I can just choose to use a different one.
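
To illustrate what "open" buys you, here is a small sketch that builds a standards-format email with nothing but Python's standard library, no Outlook software involved. The addresses and text are placeholders, and actually delivering the message would still need a mail server, which is left out here.

```python
from email.message import EmailMessage

# Build a message in the openly specified internet message format.
msg = EmailMessage()
msg["From"] = "me@example.com"
msg["To"] = "nathan@example.org"
msg["Subject"] = "Lunch"
msg.set_content("Noon at the usual place?")

# The serialized form that any conforming client or server can parse.
wire_format = msg.as_string()
print(wire_format)
```

Because the format is public, the same text could just as well have been produced by a different mail client, and the recipient's software would not need to know or care which one wrote it.
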
On top of that, email is _federated_, which means that if I didn't like
Outlook's services, I could switch to a different provider and _still_ be able
to chat with people on Outlook, unlike many of today's siloed services where I
can't just message people on Facebook if I only have an account on Twitter,
since they don't talk to each other using the same protocol.
[**Matrix**][matrix] is a new chat network that follows in the same spirit
as email, but also has the benefits of multi-party encryption. There are
multiple apps and servers, and servers can federate with each other using an
open protocol. I would strongly recommend people who are interested in privacy
to consider it.
Why care? This might just seem to be some superficial political concern by
privacy advocates who warn of dangerous edge cases that only matter to people
whose rights are being violated by some dystopian government. Well, to put it
bluntly, that dystopia is now, and it's not just the government we should be
afraid of, but tech megacorps who possibly have even more power.
We live in a digital world, so it's important to know how it works and who's in
[^1]: Governments and other parties with enough computational resources may
still be able to undermine specific levels of security, or just [threaten you
personally][wrench] until they get what they want.
[^2]: Large corporations typically still have majority representation in the
committees that decide on the most impactful specifications, but these are
