Sunday, December 22, 2013

Self-Modifying Code in Python

Self-modifying code sounds like a scary sci-fi idea in which computers write their own programs and break loose from human control. In reality, there is nothing special about self-modifying code, apart from the fact that it is hard to write in most programming languages. Think about it: you can modify files on your computer, and isn't the executable program just another file?
I am not an expert in computer architecture, but what I know about computation tells me that all a computer does is read and write data in some memory storage unit. When it boils down to zeros and ones, overwriting machine instructions is just like writing anywhere else; only the consequences are different. In low-level languages like Assembly, modifying instructions is in principle straightforward, since instructions are stored in memory like any other data and the program can overwrite them with ordinary write operations. (Read more: Assembly)
Anyway, the AI dream (or nightmare) of truly intelligent machines that upgrade their own code and become more intelligent does not seem to be in sight. In one of the Ulam Lectures at the Santa Fe Institute (Lecture Video), the researcher talked about their incredible work on a bug-fixing program that modified the source code of another program using a genetic algorithm, trying to pass all the test cases that measured the 'correctness' of the program. To get good results, the program modified the code's syntax tree rather than editing it symbol by symbol, with operations like copying, moving, and deleting branches. So it didn't actually create "new" code; it rearranged existing code as best it could. It turns out that about 70% of the bugs in most bug databases could be solved. The fine print is that most of them required changing only about 5 lines of code, but that is still good news for us.
Intrigued by this paradigm and by my new love for Python, I wrote a Python class named "Code" to test Python's dynamic abilities. The key ingredient is the function exec(), which takes a string as its input argument and executes it in the Python interpreter. The basic trick is shown below:
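The idea of editing a program as a tree instead of symbol by symbol can be illustrated with Python's standard ast module. This is only a toy sketch, assuming Python 3, and it is not the program from the lecture: the source string buggy_source, the function absolute, and the choice of which branch to delete are all made up for illustration.

```python
import ast

# Hypothetical program with a faulty branch that overwrites the result.
buggy_source = """
def absolute(n):
    if n < 0:
        n = -n
    if n > 10:
        n = 0
    return n
"""

tree = ast.parse(buggy_source)
func = tree.body[0]            # the FunctionDef node for absolute()

# "Delete" mutation: drop the second statement (the faulty if-branch).
del func.body[1]

# Compile the modified tree and run it to define the repaired function.
namespace = {}
exec(compile(tree, filename="<mutated>", mode="exec"), namespace)
absolute = namespace["absolute"]

# Test cases play the role of the 'correctness' measure.
print(absolute(-3))   # 3
print(absolute(50))   # 50
```

A genetic algorithm would try many such mutations and keep the variants that pass more test cases; here a single hand-picked deletion stands in for that search.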
var = 'x'
equals = '='
val = '2'
exec(var + equals + val)  # executes the string "x=2"
This creates a variable named x with a value of 2. 
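When exec() runs inside a function, it is not always obvious where the new variable ends up, so a common pattern is to pass an explicit dictionary as the namespace. A minimal sketch of the same trick, assuming Python 3; the names statement and namespace are only illustrative:

```python
# Build the statement "x=2" from pieces, as in the trick above.
var = 'x'
equals = '='
val = '2'
statement = var + equals + val

# Run the statement inside an explicit namespace dict instead of the
# current scope, so it is clear where the new variable is stored.
namespace = {}
exec(statement, namespace)

print(statement)       # x=2
print(namespace['x'])  # 2
```

The dictionary also makes it easy to inspect or discard whatever the executed string created.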

Here is an example of what the class can do; the code for the class itself is at the end. In this example, a conditional checks whether x == 1. If the condition is met, it sets x to 2 and then erases the whole conditional block.

This first part creates the Code object and adds statements to it:
#Init Variables
x = 1
#Create Code-object
code = Code()
code + 'global x, code'  # Adds a child Code instance code[0] holding this line => internally code.subcode[0]
code + "if x == 1:"      # Adds a child Code instance code[1] holding this line => internally code.subcode[1]
code[1] + "x = 2"        # Adds child 0 under code[1] => internally code.subcode[1].subcode[0]
code[1] + "del code[1]"  # Adds child 1 under code[1] => internally code.subcode[1].subcode[1]
This next part prints the Python code and shows the value of x. It only displays the structure; it doesn't execute it.
#Prints
print("Initial Code:")
print(code)
print("x = " + str(x))
Output 1:
Initial Code:

global x, code
if x == 1:
    x = 2
    del code[1]

x = 1
As you can see, it contains the conditional if block that assigns a value to x. The last line, 'x = 1', is not a statement of the Code object; as you can see in the script above, it is just a print that shows the value of x.
The next segment executes the code and then prints it:
print("Code after execution:")
code()  # Executes the code
print(code)
print("x = " + str(x))
Output 2:
Code after execution:
executing code

global x, code

x = 2
As you can see, the code changed the variable x to the value 2 but, most importantly, it deleted the whole conditional block! One use of this could be to avoid checking for conditions once they have been met. A better use would be to create a meta-program that writes the actual program, or to have autonomous coroutines that gradually modify themselves to perform better depending on their history of use.
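The first use mentioned above, skipping a check once its condition has been met, does not strictly need exec(). A minimal sketch, assuming Python 3 and a hypothetical Greeter class that is not part of the Code class in this post: the first call performs the check and then rebinds the method on the instance, so later calls never see the conditional again.

```python
class Greeter:
    """Hypothetical example: a one-time check that removes itself."""

    def __init__(self):
        self.ready = False
        self.checks = 0          # counts how often the check actually ran

    def greet(self):
        # Slow path: the condition is checked on this call only.
        self.checks += 1
        if not self.ready:
            self.ready = True
        # Rebind greet on this instance so later calls skip the check.
        self.greet = self._greet_fast
        return self._greet_fast()

    def _greet_fast(self):
        return "hello"


g = Greeter()
print(g.greet())   # hello
print(g.greet())   # hello
print(g.checks)    # 1
```

This swaps one callable for another instead of deleting lines of source text, so it is a milder relative of the approach in this post, but the effect on the conditional is similar.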
Here is the code for the class itself:
class Code:

    def __init__(self,line = '',indent = -1):

        if indent < -1:
            raise ValueError('Invalid {} indent'.format(indent))

        # Four spaces per indentation level; the root (indent -1) gets none.
        self.strindent = '    ' * max(indent, 0)

        self.strsubindent = '    ' + self.strindent

        self.line = line
        self.subcode = []
        self.indent = indent


    def __add__(self, other):
        # Append a child; a plain string is wrapped in a Code one level deeper.
        if isinstance(other, str):
            other = Code(other, self.indent + 1)
        elif not isinstance(other, Code):
            return NotImplemented
        self.subcode.append(other)
        return self

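    def __sub__(self, other):
        # Remove the first child with a matching line, or a given Code child.
        if isinstance(other, str):
            for code in self.subcode:
                if code.line == other:
                    self.subcode.remove(code)
                    break
        elif isinstance(other, Code):
            self.subcode.remove(other)
        else:
            return NotImplemented
        return self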


    def __repr__(self):
        rep = self.strindent + self.line + '\n'
        for code in self.subcode: rep += code.__repr__()
        return rep

    def __call__(self):
        # Render the tree as source text and execute it.
        # The executed code may modify this very object.
        print('executing code')
        exec(self.__repr__())
        return self.__repr__()


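    def __getitem__(self, key):
        # Look up a child by its line of text or by its position.
        if isinstance(key, str):
            for code in self.subcode:
                if code.line == key:  # compare by value, not identity
                    return code
        elif isinstance(key, int):
            return self.subcode[key]

    def __delitem__(self, key):
        # Delete the first child whose line matches, or the child at an index.
        if isinstance(key, str):
            for i, code in enumerate(self.subcode):
                if code.line == key:
                    del self.subcode[i]
                    break  # stop: the list changed size
        elif isinstance(key, int):
            del self.subcode[key]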
