Extracting GUIDs from a bunch of text

As part of messing with Windows Performance Recorder profiles, I had a need to extract everything that looks like a GUID from a blob of text.

So I wrote this script. It digs through its STDIN and outputs all the GUID-looking things, uppercase-ized and sorted.

https://github.com/mvaneerde/blog/blob/master/scripts/extract-guids.pl

Perl scripts to encrypting and decrypt text using Rijndael

I talked about Rijndael in a few previous posts: Expressing a function f: GF(2⁸) → GF(2⁸) as a polynomial using a Lagrange polynomial, Generating the Rijndael S-box, Efficient multiplication and division in GF(28), Sieving irreducible monic polynomials over a finite field, Addition and multiplication table for GF(22).

Today I wrote a couple of Perl scripts which use off-the-shelf CPAN modules to encrypt and decrypt text using a passphrase. For ease of transport, at the cost of some storage space, the encrypted payload is Base64 encoded.

The particular CPAN modules in question are Crypt::CBC and Crypt::Rijndael.

encrypt.pl and decrypt.pl source in my GitHub repository. They operate on STDIN and produce output on STDOUT.

Note that if you pass an incorrect passphrase to decrypt.pl you get garbage output, rather than an error.

Here is a secret message which is encoded with the passphrase “hunter2”: encrypted.txt Right-click the link and save, rather than opening it in the browser.

Sieving irreducible monic polynomials over a finite field

Last time we talked about the addition and multiplication tables for GF(2²); GF(28) is relevant for the Rijndael encryption scheme.

Joan Daemen and Vincent Rijmen use a representation of GF(28) where each element is a polynomial of the form b7x7 + b6x6 + b5x5 + b4x4 + b3x3 + b2x2 + b1x + b0, with the bi being bits. A polynomial is represented by a single byte. Addition is component-wise (modulo 2); two polynomials can be added with a single XOR operation. Multiplication is a little more complicated.

Rijndael defines multiplication as normal polynomial multiplication, followed by taking a modulus using the reduction polynomial mx8 + x4 + x3 + x + 1. Last time we showed that, in GF(2²), x2 + x + 1 was used to reduce polynomial multiplication, and also that a necessary quality for a reduction polynomial is that it be irreducible/prime.

Last time I hinted at the question: how do we show that m is irreducible? Well, one way is to do trial division of all polynomials up to degree 4.

But that’s no fun. Instead, let’s write a script to generate all irreducible polynomials, and see if m is on it! The approach is very similar to the Sieve of Eratosthenes: generate a list of all the polynomials, then traverse it from the low end to the high end; circle each element that hasn’t been crossed out, then cross out all multiples of that element.

The first argument q is the modulus of the coefficients (in this case, 2) and the second argument d (in this case, 8) is how high the degree can go. Here is the output of the script:

>perl polynomial-sieve.pl 2 8
Finding monic irreducible polynomials of degree up to 8 and coefficients mod 2
Generating all monic polynomials…
Sieving out all reducible polynomials…
1
x
x + 1
x^2 + x + 1
x^3 + x + 1
x^3 + x^2 + 1
x^4 + x + 1
x^4 + x^3 + 1
x^4 + x^3 + x^2 + x + 1
x^5 + x^2 + 1
x^5 + x^3 + 1
x^5 + x^3 + x^2 + x + 1
x^5 + x^4 + x^2 + x + 1
x^5 + x^4 + x^3 + x + 1
x^5 + x^4 + x^3 + x^2 + 1
x^6 + x + 1
x^6 + x^3 + 1
x^6 + x^4 + x^2 + x + 1
x^6 + x^4 + x^3 + x + 1
x^6 + x^5 + 1
x^6 + x^5 + x^2 + x + 1
x^6 + x^5 + x^3 + x^2 + 1
x^6 + x^5 + x^4 + x + 1
x^6 + x^5 + x^4 + x^2 + 1
x^7 + x + 1
x^7 + x^3 + 1
x^7 + x^3 + x^2 + x + 1
x^7 + x^4 + 1
x^7 + x^4 + x^3 + x^2 + 1
x^7 + x^5 + x^2 + x + 1
x^7 + x^5 + x^3 + x + 1
x^7 + x^5 + x^4 + x^3 + 1
x^7 + x^5 + x^4 + x^3 + x^2 + x + 1
x^7 + x^6 + 1
x^7 + x^6 + x^3 + x + 1
x^7 + x^6 + x^4 + x + 1
x^7 + x^6 + x^4 + x^2 + 1
x^7 + x^6 + x^5 + x^2 + 1
x^7 + x^6 + x^5 + x^3 + x^2 + x + 1
x^7 + x^6 + x^5 + x^4 + 1
x^7 + x^6 + x^5 + x^4 + x^2 + x + 1
x^7 + x^6 + x^5 + x^4 + x^3 + x^2 + 1
x^8 + x^4 + x^3 + x + 1
x^8 + x^4 + x^3 + x^2 + 1
x^8 + x^5 + x^3 + x + 1
x^8 + x^5 + x^3 + x^2 + 1
x^8 + x^5 + x^4 + x^3 + 1
x^8 + x^5 + x^4 + x^3 + x^2 + x + 1
x^8 + x^6 + x^3 + x^2 + 1
x^8 + x^6 + x^4 + x^3 + x^2 + x + 1
x^8 + x^6 + x^5 + x + 1
x^8 + x^6 + x^5 + x^2 + 1
x^8 + x^6 + x^5 + x^3 + 1
x^8 + x^6 + x^5 + x^4 + 1
x^8 + x^6 + x^5 + x^4 + x^2 + x + 1
x^8 + x^6 + x^5 + x^4 + x^3 + x + 1
x^8 + x^7 + x^2 + x + 1
x^8 + x^7 + x^3 + x + 1
x^8 + x^7 + x^3 + x^2 + 1
x^8 + x^7 + x^4 + x^3 + x^2 + x + 1
x^8 + x^7 + x^5 + x + 1
x^8 + x^7 + x^5 + x^3 + 1
x^8 + x^7 + x^5 + x^4 + 1
x^8 + x^7 + x^5 + x^4 + x^3 + x^2 + 1
x^8 + x^7 + x^6 + x + 1
x^8 + x^7 + x^6 + x^3 + x^2 + x + 1
x^8 + x^7 + x^6 + x^4 + x^2 + x + 1
x^8 + x^7 + x^6 + x^4 + x^3 + x^2 + 1
x^8 + x^7 + x^6 + x^5 + x^2 + x + 1
x^8 + x^7 + x^6 + x^5 + x^4 + x + 1
x^8 + x^7 + x^6 + x^5 + x^4 + x^2 + 1
x^8 + x^7 + x^6 + x^5 + x^4 + x^3 + 1
Done!

Note that m is not just prime, it is the lowest of the monic 8-degree polynomials with coefficients mod 2.

The script itself (in Perl) is attached. It makes no claims to being pretty or efficient.

As a check, there is an elegant formula for the number of irreducible monic polynomials with coefficients in a finite field:

N(q, n) = (Σd|n μ(d) qn/d)/n

where μ(x) is the Möbius function.

In particular:

N(2, 1) = ((1)21/1)/1 = 2
N(2, 2) = ((1)22/1 – (1)22/2)/2 = 1
N(2, 3) = ((1)23/1 – (1)23/3)/3 = 2
N(2, 4) = ((1)24/1 – (1)24/2 + (0)24/4)/4 = 3
N(2, 5) = ((1)25/1 – (1)25/5)/5 = 6
N(2, 6) = ((1)26/1 – (1)26/2 – (1)26/3 + (1)26/6)/6 = 9
N(2, 7) = ((1)27/1 – (1)27/7)/7 = 18
N(2, 8) = ((1)28/1 – (1)28/2 + (0)28/4 + (0)28/8)/8 = 30

This checks out with the script.

EDIT 2015-10-31: move script to https://github.com/mvaneerde/blog/blob/master/rijndael/polynomial-sieve.pl

Perl script to parse MPEG audio header

I’ve written a perl script (attached) which will parse MPEG audio headers and display them in a human-readable format.

For example, if you run it on ding.mpeg (also attached) you get this output:

>perl mpegaudioheader.pl ding.mpeg
Frame header: 11111111 111 (should be all ones)
MPEG Audio version ID: 11 (MPEG version 1 (ISO/IEC 11172-3))
Layer description: 10 (layer II)
Protection bit: 0 (protected by CRC (16-bit CRC follows header))
Bitrate index: 1011 (224 kbps)
Sample rate index: 00 (44100 Hz)
Padding bit: 0 (frame is not padded)
Private bit: 0 (application specific)
Channel mode: 00 (stereo)
Mode extension (if channel mode is joint stereo:) 00 (bands 4 to 31)
Copyright: 0 (audio is not copyrighted)
Original: 0 (copy of original media)
Emphasis: 00 (none)

Here’s the source for the perl script:

use strict;

# based on http://www.mpgedit.org/mpgedit/mpeg_format/mpeghdr.htm
# assumes that the file you point it at starts with an MPEG audio header

unless (1 == @ARGV and $ARGV[0] ne "/?" and $ARGV[0] ne "-?") {
    print "USAGE: perl $0 mpeg-audio-file.mpeg";
    exit(0);
};

my %version = (
    "00" => "MPEG Version 2.5 (unofficial)",
    "01" => "reserved",
    "10" => "MPEG version 2 (ISO/IEC 13818-3)",
    "11" => "MPEG version 1 (ISO/IEC 11172-3)",
);

my %layer = (
    "00" => "reserved",
    "01" => "layer III",
    "10" => "layer II",
    "11" => "layer I",
);

my %protection = (
    "0" => "protected by CRC (16-bit CRC follows header)",
    "1" => "not protected",
);

my %bitrate = (
    # version 1
    "11" => {
        # layer 1
        "11" => {
            "0000" => "free",
            "0001" => "32 kbps",
            "0010" => "64 kbps",
            "0011" => "96 kbps",
            "0100" => "128 kbps",
            "0101" => "160 kbps",
            "0110" => "192 kbps",
            "0111" => "224 kbps",
            "1000" => "256 kbps",
            "1001" => "288 kbps",
            "1010" => "320 kbps",
            "1011" => "352 kbps",
            "1100" => "384 kbps",
            "1101" => "416 kbps",
            "1110" => "448 kbps",
            "1111" => "bad",
        },

        # layer 2
        "10" => {
            "0000" => "free",
            "0001" => "32 kbps",
            "0010" => "48 kbps",
            "0011" => "56 kbps",
            "0100" => "64 kbps",
            "0101" => "80 kbps",
            "0110" => "96 kbps",
            "0111" => "112 kbps",
            "1000" => "128 kbps",
            "1001" => "160 kbps",
            "1010" => "192 kbps",
            "1011" => "224 kbps",
            "1100" => "256 kbps",
            "1101" => "320 kbps",
            "1110" => "384 kbps",
            "1111" => "bad",
        },

        # layer 3
        "01" => {
            "0000" => "free",
            "0001" => "32 kbps",
            "0010" => "40 kbps",
            "0011" => "48 kbps",
            "0100" => "56 kbps",
            "0101" => "64 kbps",
            "0110" => "80 kbps",
            "0111" => "96 kbps",
            "1000" => "112 kbps",
            "1001" => "128 kbps",
            "1010" => "160 kbps",
            "1011" => "192 kbps",
            "1100" => "224 kbps",
            "1101" => "256 kbps",
            "1110" => "320 kbps",
            "1111" => "bad",
        },
    },

    # version 2
    "10" => {
        # layer 1
        "11" => {
            "0000" => "free",
            "0001" => "32 kbps",
            "0010" => "48 kbps",
            "0011" => "56 kbps",
            "0100" => "64 kbps",
            "0101" => "80 kbps",
            "0110" => "96 kbps",
            "0111" => "112 kbps",
            "1000" => "128 kbps",
            "1001" => "144 kbps",
            "1010" => "160 kbps",
            "1011" => "176 kbps",
            "1100" => "192 kbps",
            "1101" => "224 kbps",
            "1110" => "256 kbps",
            "1111" => "bad",
        },

        # layer 2
        "10" => {
            "0000" => "free",
            "0001" => "8 kbps",
            "0010" => "16 kbps",
            "0011" => "24 kbps",
            "0100" => "32 kbps",
            "0101" => "40 kbps",
            "0110" => "48 kbps",
            "0111" => "56 kbps",
            "1000" => "64 kbps",
            "1001" => "80 kbps",
            "1010" => "96 kbps",
            "1011" => "112 kbps",
            "1100" => "128 kbps",
            "1101" => "144 kbps",
            "1110" => "160 kbps",
            "1111" => "bad",
        },

        # layer 3
        "01" => {
            "0000" => "free",
            "0001" => "8 kbps",
            "0010" => "16 kbps",
            "0011" => "24 kbps",
            "0100" => "32 kbps",
            "0101" => "40 kbps",
            "0110" => "48 kbps",
            "0111" => "56 kbps",
            "1000" => "64 kbps",
            "1001" => "80 kbps",
            "1010" => "96 kbps",
            "1011" => "112 kbps",
            "1100" => "128 kbps",
            "1101" => "144 kbps",
            "1110" => "160 kbps",
            "1111" => "bad",
        },
    },
);

my %samplerate = (
    # version 1
    "11" => {
        "00" => "44100 Hz",
        "01" => "48000 Hz",
        "10" => "32000 Hz",
        "11" => "reserved",
    },

    # version 2
    "10" => {
        "00" => "22050 Hz",
        "01" => "24000 Hz",
        "10" => "16000 Hz",
        "11" => "reserved",
    },

    # version 2.5 (unofficial)
    "00" => {
        "00" => "11025 Hz",
        "01" => "12000 Hz",
        "10" => "8000 Hz",
        "11" => "reserved",
    },
);

my %padding = (
    "0" => "frame is not padded",
    "1" => "frame is padded with one extra slot",
);

my %channelmode = (
    "00" => "stereo",
    "01" => "joint stereo (stereo)",
    "10" => "dual channel (stereo)",
    "11" => "single channel (mono)",
);

my %modeextension = (
    # layer I
    "11" => {
        "00" => "bands 4 to 31",
        "01" => "bands 8 to 31",
        "10" => "bands 12 to 31",
        "11" => "bands 16 to 31",
    },

    # layer II
    "10" => {
        "00" => "bands 4 to 31",
        "01" => "bands 8 to 31",
        "10" => "bands 12 to 31",
        "11" => "bands 16 to 31",
    },

    # layer III
    "01" => {
        "00" => "intensity stereo off; m/s stereo off",
        "01" => "intensity stereo on; m/s stereo off",
        "10" => "intensity stereo off; m/s stereo on",
        "11" => "intensity stereo on; m/s stereo on",
    },
);

my %copyright = (
    "0" => "audio is not copyrighted",
    "1" => "audio is copyrighted",
);

my %original = (
    "0" => "copy of original media",
    "1" => "original media",
);

my %emphasis = (
    "00" => "none",
    "01" => "50/15 microseconds", # the source incorrectly says "ms" which is milliseconds
    "10" => "reserved",
    "11" => "CCIT J.17",
);

open(MPEG, "<", $ARGV[0]) or die("Could not open $ARGV[0]: $!");
binmode(MPEG) or die("Could not set file handle to binary mode: $!"); # binary file

my $header = "";

my $header_size = 16;
my $read = read(MPEG, $header, $header_size, 0);

close(MPEG);

$header_size == $read or die("Expected $header_size bytes to be read, not $read");

my @bits = ();

for my $byte (map { ord( $_ ) } split (//, $header)) {
    for my $bit (1 .. 8) {
        push @bits, (($byte & (1 << (8 - $bit))) ? 1 : 0);
    }
}

unless ("1" x 11 eq join("", @bits[0 .. 10])) {
    printf("WARNING: the frame header is not all ones. This is not a valid MPEG audio header.n");
    # carry on regardless
}

printf(
    "Frame header: %s %s (%s)\n" .
    "MPEG Audio version ID: %s (%s)\n" .
    "Layer description: %s (%s)\n" .
    "Protection bit: %s (%s)\n" .
    "Bitrate index: %s (%s)\n" .
    "Sample rate index: %s (%s)\n" .
    "Padding bit: %s (%s)\n" .
    "Private bit: %s (%s)\n" .
    "Channel mode: %s (%s)\n" .
    "Mode extension (if channel mode is joint stereo:) %s (%s)\n" .
    "Copyright: %s (%s)\n" .
    "Original: %s (%s)\n" .
    "Emphasis: %s (%s)\n" .
    ""
    ,
    join("", @bits[0 .. 7]), join("", @bits[8 .. 10]), "should be all ones",
    join("", @bits[11 .. 12]), $version{ join("", @bits[11 .. 12]) },
    join("", @bits[13 .. 14]), $layer{ join("", @bits[13 .. 14]) },
    $bits[15], $protection{ $bits[15] },
    join("", @bits[16 .. 19]),
        # bit rate depends on version, layer, and bitrate index
        $bitrate{ join("", @bits[11 .. 12]) }{ join("", @bits[13 .. 14]) }{ join("", @bits[16 .. 19]) },
    join("", @bits[20 .. 21]),
        # sample rate depends on version
        $samplerate{ join("", @bits[11 .. 12]) }{ join("", @bits[20 .. 21]) },
    $bits[22], $padding{ $bits[22] },
    $bits[23], "application specific",
    join("", @bits[24 .. 25]), $channelmode{ join("", @bits[24 .. 25]) },
    join("", @bits[26 .. 27]),
        # mode extension depends on layer
        $modeextension{ join("", @bits[13 .. 14]) }{ join("", @bits[26 .. 27]) },
    $bits[28], $copyright{ $bits[28] },
    $bits[29], $original{ $bits[29] },
    join("", @bits[30 .. 31]), $emphasis{ join("", @bits[30 .. 31]) },
);

Note this script assumes that the very first bytes of the file are the MPEG audio header, and makes no effort to dig into the file to find the audio header.

EDIT 2015-10-31: moved script to https://github.com/mvaneerde/blog/blob/master/scripts/mpegaudioheader.pl

Bad Perl: locker problem

Bad Perl solution to the “print the open lockers” problem:

perl -e"print join', ',map{$_*$_}1..sqrt pop" 100

54 characters.  I prefer this to the 53-character solution obtained by omitting the space after the first comma.

EDIT: 49 characters:

perl -e"print map{$_*$_,' '}1..sqrt pop" 100

EDIT: 48:

perl -e"print map{$_*$_.$/}1..sqrt pop" 100

EDIT: 47:

perl -e"map{print$/.$_*$_}1..sqrt pop" 100

I still think “say” is cheating but it does afford this very short solution:

perl -E"map{say$_*$_}1..sqrt pop" 100

EDIT: Apparently I need to learn how to count. Counts above are off. Anyway, 41:

perl -e"print$_*$_,$/for 1..sqrt pop" 100

Bad Perl: Josephus problem

Another programming contest asks to solve the Josephus problem.

Bad Perl solution (83 characters… so close…)

>perl -e"@_=(1..$ARGV[0]);++$c%$ARGV[1]?$i++:splice@_,$i%=@_,1while$#_;print@_" 40 3
28

EDIT: got it down to 80.

>perl -e"@_=(1..shift);++$c%$ARGV[0]?$i++:splice@_,$i%=@_,1while$#_;print@_" 40 3
28

EDIT2: 78 dropping the parentheses.

>perl -e"@_=1..shift;++$c%$ARGV[0]?$i++:splice@_,$i%=@_,1while$#_;print@_" 40 3
28

EDIT3: 66, shamelessly cannibalizing others’ ideas from the contest (though I refuse to use “say”)

>perl -e"$k=pop;@_=1..pop;@_=grep{++$i%$k}@_ while$#_;print@_" 40 3
28