DGA

In this last prog challenge, connecting to the remote socket gave us something like :

There's a new threat actor in town, and he claims to be a "Binary Ninja who Loves DGAs". We found some domains generated by one of his D
GA's, can you figure out the domain for when we plan to take him down?

Here's the data that we got:

(372720574, '1981/10/23 21:29:34', 'http://F1Mg61r0CSi80.dgeh')
(313126529, '1979/12/04 03:35:29', 'http://T1B50r1GFe70.dgeh')
(178331573, '1975/08/27 00:32:53', 'http://W0Sc40t0AZn70.dgeh')
(698912711, '1992/02/24 06:25:11', 'http://M2K50y0GGa90.dgeh')
(619720075, '1989/08/21 16:27:55', 'http://M2FE0t0GTi80.dgeh')
(1030239575, '2002/08/25 01:39:35', 'http://S3FeD0t0TSc00.dgeh')
(859343878, '1997/03/26 02:37:58', 'http://W3C70h0GS90.dgeh')
(1354104669, '2012/11/28 12:11:09', 'http://W5H60r1ACu10.dgeh')
(1464927891, '2016/06/03 04:24:51', 'http://F5SiA0e0CFe10.dgeh')
(482830956, '1985/04/20 07:42:36', 'http://S1Mn81l0GF80.dgeh')
(993389152, '2001/06/24 13:25:52', 'http://S3Ti60e1GF00.dgeh')
(225334872, '1977/02/21 01:01:12', 'http://M0FeD0y0TMn70.dgeh')
(739185329, '1993/06/04 09:15:29', 'http://F2Cr11e0TNe90.dgeh')
(838733472, '1996/07/30 13:31:12', 'http://T3LiF0y0GFe90.dgeh')
(1253292205, '2009/09/18 16:43:25', 'http://F4Sc60r0AHe00.dgeh')
(145846128, '1974/08/16 00:48:48', 'http://F0Cl60t0GCu70.dgeh')
(1404216603, '2014/07/01 12:10:03', 'http://T5N60y0CCa10.dgeh')
(758254375, '1994/01/11 02:12:55', 'http://T2Fe60y0GMg90.dgeh')
(389576490, '1982/05/06 23:41:30', 'http://T1Si70y0TNi80.dgeh')
(475797590, '1985/01/28 21:59:50', 'http://M1CrB0y0AH80.dgeh')
(1015521178, '2002/03/07 17:12:58', 'http://T3Mn01h0AZn00.dgeh')
(253462045, '1978/01/12 14:07:25', 'http://T0Zn30y0CCr70.dgeh')
(712482026, '1992/07/30 07:40:26', 'http://T2CaE1y0AK90.dgeh')
(906854029, '1998/09/26 23:53:49', 'http://S3Mg10r0AFe90.dgeh')
(1038195175, '2002/11/25 03:32:55', 'http://M3CoC0r0AN00.dgeh')
(1204721504, '2008/03/05 12:51:44', 'http://W4P90h0TCu00.dgeh')
(207682179, '1976/07/31 17:29:39', 'http://S0CrC0y1ANe70.dgeh')
(1415218931, '2014/11/05 20:22:11', 'http://W5OB0r0CCo10.dgeh')
(537042487, '1987/01/07 18:28:07', 'http://W2Nu00y0ACr80.dgeh')
(247928040, '1977/11/09 12:54:00', 'http://W0Cu81r0TK70.dgeh')
(225702078, '1977/02/25 07:01:18', 'http://F0FeE0y1THe70.dgeh')
(194136270, '1976/02/25 22:44:30', 'http://W0V20y0GLi70.dgeh')
(922592181, '1999/03/28 03:36:21', 'http://S3AlF0h0CSi90.dgeh')
(141792109, '1974/06/30 02:41:49', 'http://S0SE0e0TAl70.dgeh')
(422634404, '1983/05/24 14:26:44', 'http://T1Ar60y1CSi80.dgeh')
(698102055, '1992/02/14 21:14:15', 'http://F2K30y0TBe90.dgeh')
(1494986000, '2017/05/17 01:53:20', 'http://W5Ar30y0GCa10.dgeh')
(1450347733, '2015/12/17 10:22:13', 'http://T5MgE0r0GK10.dgeh')
(979351457, '2001/01/13 02:04:17', 'http://S3CaB1y0TZn00.dgeh')
(1059599780, '2003/07/30 21:16:20', 'http://W3Zn50y0TTi00.dgeh')
(1406131521, '2014/07/23 16:05:21', 'http://W5N91y0ASc10.dgeh')
(560817170, '1987/10/09 22:32:50', 'http://F2HeD0r0CS80.dgeh')
(699951958, '1992/03/07 07:05:58', 'http://S2K70h0GAl90.dgeh')
(198635517, '1976/04/18 00:31:57', 'http://S0VA0l1GGa70.dgeh')
(206322537, '1976/07/15 23:48:57', 'http://T0Cr90y0AAl70.dgeh')
(458953987, '1984/07/17 23:13:07', 'http://T1TiB0y0TCa80.dgeh')
(526776429, '1986/09/10 22:47:09', 'http://W1ZnC0r1AH80.dgeh')
(1182461562, '2007/06/21 21:32:42', 'http://T4MgF0e1GMn00.dgeh')
(386596130, '1982/04/02 11:48:50', 'http://F1Si10l1ACa80.dgeh')
(937075529, '1999/09/11 18:45:29', 'http://S3PB0r0CAl90.dgeh')

But the last one we got was empty, can you solve it?
(1529078400, '2018/06/15 16:00:00', '?')

It didn’t took too long to understand that DGA stands for Domain Generation Algorithm. Basically, this is the name given to algorithm used to generate C&C domain name in a deterministic way.

So the goal is easy … to understand. We need to reverse the algorithm with provided samples to guess the last one.

After some time spent on the challenge, we see that :

Each of the “selector” I wrote does the same thing. It tries to check that it is the next pattern, then returns the next value for the url to find.
Each one takes 2 parameters :

Let’s dig through the various patterns.

First letter of the day

In the sample we see that first letter is one of S,F,T,W,M. Easy to understand that it stands for S(aturday|unday), Friday, T(uesday|hursday), Wednesday and Monday.

Simple code for this :

def s_first_letter_of_day(lines, guess):
    """ Check if next char is the first letter of the day """
    if False in [ u[0] == datetime.utcfromtimestamp(t).strftime("%A")[0] for t,d,u in lines ]:
        return False
    next_char=datetime.utcfromtimestamp(guess[0]).strftime("%A")[0]
    new_guess=(guess[0],guess[1],guess[2]+next_char)
    return (remove_char(lines,1), new_guess)

3 bits to integer

Next one was simple too. You need to convert timestamp to binary, then just uses 3 first bits and convert them to integer. Just, the binary number must be left-padded to 31 bits.

Ex:

#(372720574, '1981/10/23 21:29:34', 'http://F1Mg61r0CSi80.dgeh')
>> int(format(372720574,"031b")[0:3],2)
1
#(937075529, '1999/09/11 18:45:29', 'http://S3PB0r0CAl90.dgeh')
>> int(format(937075529,"031b")[0:3],2)
3

Code for this selector is :

def s_3bits_to_int(lines, guess):
    """ Check if next char is a number which is the same as next 3 bits """
    global current_bit
    for t,d,u in lines:
        bin_t=format(t,"031b")[current_bit:current_bit+3]
        if u[0] not in string.digits: # test it is a number
            return False
        if int(u[0]) != int(bin_t,2):
            return False
    next_char=str(int(format(guess[0],"031b")[current_bit:current_bit+3],2))
    new_guess=(guess[0],guess[1],guess[2]+next_char)
    current_bit += 3
    return (remove_char(lines,1), new_guess)

Note : It is important to understand that the position of the bits will depend of previously used selectors. In this example, this is the first selector using timestamp’s binary version, so we use the 3 first bits. But, if any other selector would have been used before, we’d take the 3 “next” bits.

Periodic table

This one was really funny. You can notice that the next char is an uppercase letter, followed by one of uppercase, lowercase or number :/

Some sample should ring some bell in your mind, like Mg, Si or Mn … Yes, you got it, these are part of the periodic table (crazy author …)

After I spent way too much time on it, I understood that it is based on either :

converted to integer. This gives us the atomic number of the element. Depending of the element, it can be a one or 2 chars object.

The code for these 2 cases (4 and 5 bits):

def s_periodic_table_4(lines, guess):
    return s_periodic_table(lines, guess, 4)

def s_periodic_table_5(lines, guess):
    return s_periodic_table(lines, guess, 5)

def s_periodic_table(lines, guess, n):
    """ Check if next (two) char(s) exists in periodic table and the next 'n'
    bits correspond to their number in this table """
    global current_bit
    table={
        "Nu": 0, # not a real one
        "H": 1,    "He": 2,   "Li":  3, "Be": 4,   "B": 5,    "C": 6,    "N": 7,    "O": 8,
        "F": 9,    "Ne": 10,  "Na": 11, "Mg": 12,  "Al": 13,  "Si": 14,  "P": 15,   "S": 16,
        "Cl": 17,  "Ar": 18,  "K": 19,  "Ca": 20,  "Sc": 21,  "Ti": 22,  "V": 23,   "Cr": 24,
        "Mn": 25,  "Fe": 26,  "Co": 27, "Ni": 28,  "Cu": 29,  "Zn": 30,  "Ga": 31,  "Ge": 32,
    }
    for t,d,u in lines:
        bin_t=format(t,"031b")[current_bit:current_bit+n]
        elmt=u[0]
        if u[1] in string.ascii_lowercase and (u[0]+u[1]) in table.keys() and table[(u[0] + u[1])] <= 32:
            elmt += u[1]
        if elmt not in table.keys():
            #print("elmt : %s , int : %s" % (elmt,int(bin_t,2)))
            return False
        if table[elmt] != int(bin_t,2) and (elmt[0] in table.keys() and table[elmt[0]] != int(bin_t,2)):
            return False
    guess_bin_t=int(format(guess[0],"031b")[current_bit:current_bit+n],2)
    next_char=''
    for k,v in table.items():
        if v == guess_bin_t:
            next_char=k
    new_guess=(guess[0],guess[1],guess[2]+next_char)

    # I can't use remove_char() here as the number of char may change between lines
    res = []
    for t, d, u in lines:
        bin_t=format(t,"031b")[current_bit:current_bit+n]
        if u[1] in string.ascii_lowercase and (u[0] + u[1]) in table.keys() and table[(u[0] + u[1])] == int(bin_t,2):
            res.append((t,d,u[2:]))
        else:
            res.append((t,d,u[1:]))
    current_bit += n
    return (res, new_guess)

You’ll notice that:

Hex char

Next, a simple one, just take next 4 bits and write corresponding hex :

def s_hex_value(lines, guess):
    """ Check if next  char is hex value corresponding to next 4 bits"""
    global current_bit
    for t,d,u in lines:
        bin_t=format(t,"031b")[current_bit:current_bit+4]
        if u[0] not in string.hexdigits:
            return False
        if format(int(bin_t,2),"X") != u[0]:
            return False
    guess_bin_t=format(guess[0],"031b")[current_bit:current_bit+4]
    next_char=format(int(guess_bin_t,2),"X")
    new_guess=(guess[0],guess[1],guess[2]+next_char)
    current_bit += 4
    return (remove_char(lines,1), new_guess)

2/3 bits AND

Next, you’ll notice that you have only ‘0’ or ‘1’. These are in fact a logical AND between the next 2 or 3 bits (depends of the sample):

def s_2bits_and(lines, guess):
    """ Check if next char is the AND of the 2 next bytes """
    global current_bit
    for t,d,u in lines:
        bin_t=format(t,"031b")[current_bit:current_bit+2]
        if u[0] not in ["0","1"]:
            return False
        if int(u[0]) != int(bin_t[0]) & int(bin_t[1]):
            return False
    guess_bin_t=format(guess[0],"031b")[current_bit:current_bit+2]
    next_char=str(int(guess_bin_t[0]) & int(guess_bin_t[1]))
    new_guess=(guess[0],guess[1],guess[2]+next_char)
    current_bit += 2
    return (remove_char(lines,1), new_guess)

def s_3bits_and(lines, guess):
    """ Check if next char is the AND of the 3 next bytes """
    global current_bit
    for t,d,u in lines:
        bin_t=format(t,"031b")[current_bit:current_bit+3]
        if u[0] not in ["0","1"]:
            return False
        if int(u[0]) != int(bin_t[0]) & int(bin_t[1]) & int(bin_t[2]):
            return False
    guess_bin_t=format(guess[0],"031b")[current_bit:current_bit+3]
    next_char=str(int(guess_bin_t[0]) & int(guess_bin_t[1]) & int(guess_bin_t[2]))
    new_guess=(guess[0],guess[1],guess[2]+next_char)
    current_bit += 3
    return (remove_char(lines,1), new_guess)

Last letter of month

After the first-letter-of-day, we now have a similar one => last letter of month :

def s_last_letter_of_month(lines, guess):
    """ Check if next char is the last letter of the month """
    if False in [ u[0] == datetime.utcfromtimestamp(t).strftime("%B")[-1] for t,d,u in lines ]:
        return False
    next_char=datetime.utcfromtimestamp(guess[0]).strftime("%B")[-1]
    new_guess=(guess[0],guess[1],guess[2]+next_char)
    return (remove_char(lines,1), new_guess)

A,C,G,T

Next one is again a 0/1 from a AND between bits, already covered. Then, you have an uppercase letter, one of A,C,G,T.

Not to difficult, it just match the next 2 bits of timestamp. Be careful, however, that one value does not always map to the same letter. So I had to dynamically build the mapping, then check :

def s_acgt(lines, guess):
    """ Check if next char is one of ACGT and correspond to the next 2 bits of timestamp """
    global current_bit
    acgt_map={}
    for t,d,u in lines:
        if u[0] not in ['A','C','G','T']:
            return False
        bin_t=format(t,"031b")[current_bit:current_bit+2]
        if not bin_t in acgt_map.keys():
            acgt_map[bin_t] = u[0]
        if acgt_map[bin_t] != u[0]:
            return False
    if len(acgt_map) != 4:
        return False
    guess_bin_t=format(guess[0],"031b")[current_bit:current_bit+2]
    next_char=acgt_map[guess_bin_t]
    new_guess=(guess[0],guess[1],guess[2]+next_char)
    current_bit += 2
    return (remove_char(lines,1), new_guess)

Third number of the year

Then we again have the periodic table, and to finish we have 2 numbers. These are the decenie of the date (ex : 1982 => 80).

def s_year_third_number(lines, guess):
    """ Check if next 2 chars are the 3rd number of year and a "0" """
    global current_bit
    for t,d,u in lines:
        if len(u) == 1:
            return False
        if u[1] != "0":
            return False
        if u[0] != datetime.utcfromtimestamp(t).strftime("%y")[0]:
            return False
    next_char=datetime.utcfromtimestamp(guess[0]).strftime("%y")[0] + "0"
    new_guess=(guess[0],guess[1],guess[2]+next_char)
    return (remove_char(lines,2), new_guess)

Step 2

So with all these selectors, I was able to find the correct domain :

Your solution: http://F5Ti40e1CNe10.dgeh
[...]
Hmm we found another DGA, can you figure out the next one?

Here's the data that we got:
[...]

Yes, again … more samples ! At the end, there were 5 rounds. First had 50 samples, 2nd 40, then 30, 20 and 10.

The goal is always the same, and I just found an extra selectors to write :

True / False to letter

Sometimes you had only two (uppercase) letters. Corresponding to the next bit (0 => first letter, 1 => second letter). So I had to first build the mapping, then check everything match:

def s_true_false(lines, guess):
    """ Check if next letter corresponds to next bit (1/0) (one for 1, one for 0)"""
    global current_bit
    tf_map={}
    for t,d,u in lines:
        bin_t=format(t,"031b")[current_bit:current_bit+1]
        if not bin_t in tf_map.keys():
            tf_map[bin_t] = u[0]
        if tf_map[bin_t] != u[0]:
            return False
    guess_bin_t=format(guess[0],"031b")[current_bit:current_bit+1]
    next_char=tf_map[guess_bin_t]
    new_guess=(guess[0],guess[1],guess[2]+next_char)
    current_bit += 1
    return (remove_char(lines,1), new_guess)

Get the flag

At the end, I was too lazy to submit the result automatically as there was no time constraint, so I just finished my script with some local variable to set at the beginning of the file :

selectors = [
    s_acgt,
    s_last_letter_of_month,
    s_first_letter_of_day,
    s_3bits_to_int,
    s_2bits_and,
    s_3bits_and,
    s_periodic_table_5,
    s_periodic_table_4,
    s_true_false,
    s_hex_value,
    s_year_third_number
]

prepared_b = remove_http(b)

selector_order = []
while prepared_b[0][2]:
    for s in selectors:
        r = s(prepared_b, guess)
        if r:
            selector_order.append(s)
            prepared_b = r[0]
            guess=r[1]
            print(" `-> Selector %s worked. Current answer :%s " % (s.__name__, guess))
            break

print("Found selector order : %s" % ' '.join([a.__name__ for a in selector_order]))
print("Expected solution : %s.dgeh" % guess[-1])

And finally :

[40 samples]
And here's the one we need:
(1529078400, '2018/06/15 16:00:00', '?')

Your solution: http://10NaTCOGFNiNee.dgeh

Hmm we found another DGA, can you figure out the next one?
Here's the data that we got:

[30 samples]

And here's the one we need:
(1529078400, '2018/06/15 16:00:00', '?')


Your solution: http://FNaCG03101HeGe.dgeh
Hmm we found another DGA, can you figure out the next one?

Here's the data that we got:

[20 samples]

And here's the one we need:
(1529078400, '2018/06/15 16:00:00', '?')


Your solution: http://e5BBe0F1TNe10.dgeh
Hmm we found another DGA, can you figure out the next one?

Here's the data that we got:

[10 samples]

And here's the one we need:
(1529078400, '2018/06/15 16:00:00', '?')


Your solution: http://FNaMgG3eACl010.dgeh
You are the master of DGeh's! PAN{wh4t_w4s_th4t_dg3h_ab00t}