Saturday, March 7, 2015

The Isomorphic Response Reference Pattern based on MongoDB using Dart

I am relatively new to server-side web programming, and I am enjoying it: I am coding a server in Dart using MongoDart and the fantastic Redstone Dart framework. I recently remade all my server code because I discovered this great video by Les Hazlewood about REST APIs.


The thing that really struck me about this video (well, one of them; I was actually searching for security and authentication) was a pattern it shows for creating references between "server" entities.

Normally, when you need to create a reference between entities, you just store the other object's database id as a string or int, and you end up writing classes like

 class A
 {
   String idB;
   B b;
 }

A very intuitive approach, but it duplicates fields on A: since B also has its own id field, the information is redundant. To avoid this you can use what I'll call the Reference Pattern, which as a class can be represented (in Dart) as

 abstract class Ref extends DbObj
 {  
   String get href;  
 }
where DbObj is
 class DbObj extends Resp  
 {  
   String id;  
 }  
The idea is that, in JSON, a Ref looks something like
 {
   "id" : "someId",
   "href" : "http://someHost/someResource/{id}"
 }
Now, if you make both A and B inherit from Ref, they could be defined as

  class A extends Ref
  {
    String someField;
    B b;
    String get href => "http://someHost/A/$id";
  }

  class B extends Ref
  {
    String someOtherField;
    String get href => "http://someHost/B/$id";
  }

Where is the magic? The important thing is that when you store an A object in MongoDB, you only store a blank B object inside it, one that contains nothing but its own id field. Now you can retrieve it, encode it, and the client can receive it as a JSON like this

 {  
   "someField" : "whatever",  
   "b" : {  
     "id" : "id_of_B",  
     "href" : "http://someHost/B/id_of_B"  
   },  
   "id" : "id_of_A",  
   "href" : "http://someHost/A/id_of_A"  
 }  
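
To make this concrete, here is a minimal sketch of the storage side. `toMongo` is just a hypothetical helper (it's not part of mongo_dart or any other library), and the `_id` mapping is an assumption; the point is simply that `b` gets flattened to a blank reference before it touches the database:

 // Hypothetical helper (not from any library): flatten an A for storage.
 // The only thing kept from b is its id; its href can always be rebuilt
 // from that id when encoding the response.
 Map toMongo (A a)
 {
   return {
     "_id"       : a.id,               // assuming Mongo's _id doubles as our id
     "someField" : a.someField,
     "b"         : { "id" : a.b.id }   // blank B: just the reference
   };
 }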

Notice 2 things:

  1. You can plug the hrefs directly into a request, a src attribute, or whatever, making it super easy to GET, PUT, or DELETE the entity.
  2. You can expand "b" into the full object by making a GET request to its "href" and storing the result back on "b". Also, you can send a partial or complete B object from the beginning without having to create an extra structure to compensate (a decoding sketch follows the JSON below).
Now you can receive a JSON like this with no problems

 {  
   "someField" : "whatever",  
   "b" : {  
     "id" : "id_of_B",  
     "href" : "http://someHost/B/id_of_B",  
     "someOtherField" : "Nice!"  
   },  
   "id" : "id_of_A",  
   "href" : "http://someHost/A/id_of_A"
 }  
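
And here is a decoding sketch for the receiving side (`aFromJson` is again an illustrative helper, not library code): a blank reference and a fully expanded B go through exactly the same path.

 // Illustrative decoder: builds an A from a JSON map. Whether "b" arrives
 // as a blank reference or fully expanded, the same code handles it;
 // missing fields simply stay null.
 A aFromJson (Map json)
 {
   var a = new A();
   a.id = json["id"];
   a.someField = json["someField"];

   var bJson = json["b"];
   if (bJson != null)
   {
     a.b = new B();
     a.b.id = bJson["id"];
     a.b.someOtherField = bJson["someOtherField"]; // stays null if b wasn't expanded
   }
   return a;
 }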

The key is: all your structures inherit from `Ref`, so they all contain the minimal data needed to retrieve/modify them, but they can be expanded. In other words, your objects are isomorphic!

If you've read carefully you'll have noticed that the title talks about a "Response" and that there is an undefined "Resp" class. This Resp class encapsulates a possibility shared by all requests to the server: failure. Sure, you can let the server fail and send back a 4xx or 5xx status, and then on the client you have to catch that error on the request, but things might get messy. Another option is to send back a JSON with the error, something like

 {"error" : "not found"}  

then decode the JSON to an object and just check for the error. Here is the implementation of the Resp class

 class Resp  
 {  
   bool get success => nullOrEmpty(error);  
   bool get failed => ! success;  
   String error;  
 }  
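
`nullOrEmpty` is not defined anywhere in this post; it's just a tiny helper along these lines:

 // Assumed helper: true when the string is null or empty.
 bool nullOrEmpty (String s) => s == null || s.isEmpty;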

Since Ref extends DbObj, which extends Resp, all your objects can also warn you about errors, and you can write beautiful code like this (which also uses Dart's async/await syntax)


 A a = await getA ();  
 a.b = await getRequest (a.b.href); 
 if (a.b.failed)  
 {  
   print (a.b.error);  
   return;  
 }
 //Do something with a.b
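
In case you wonder about `getRequest`, here is one possible shape for it, assuming the `http` package and `dart:convert`; the names and the field-by-field decoding are illustrative, the point being that HTTP failures get folded into the error field inherited from Resp instead of being thrown:

 import 'dart:async';
 import 'dart:convert';
 import 'package:http/http.dart' as http;

 // Illustrative sketch: fetch a B by its href and fold any failure into
 // the error field inherited from Resp, so callers only check b.failed.
 Future<B> getRequest (String href) async
 {
   var b = new B();
   var resp = await http.get(href);
   if (resp.statusCode >= 400)
   {
     b.error = "GET $href failed with status ${resp.statusCode}";
     return b;
   }
   var json = JSON.decode(resp.body);
   b.id = json["id"];
   b.someOtherField = json["someOtherField"];
   b.error = json["error"];
   return b;
 }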


Tuesday, February 4, 2014

Python, PyBrain, Cython and CyBrain

The best thing about Python is that it's a diverse language: some use it to create commercial web apps in Django, some to teach programming, and those in the scientific community use it as an open source replacement for Matlab.
When I came in touch with Python I belonged to this last group; my main motivation to use the language was its growing popularity in Artificial Intelligence and Machine Learning. In this terrain, packages like NumPy, SciPy and the SciKits collection have been doing a great job giving consistent tools to the scientific community.

For Data Mining and Machine Learning, among the things Python offers is scikit-learn, an extremely well documented module with some YouTube tutorials and a very smart team behind it. When I first started to wander into the machine learning world, my main drive soon became neural networks, and not wanting to depend on Matlab I put my hopes in scikit-learn. But things didn't go so fluidly. It seems they initially had plans to support neural nets and at some point began to implement them, but soon decided not to and threw the ball to PyBrain, a package specialised in NNs and AI.

After searching many forums and trying many modules, I finally settled on PyBrain. Its "modular" philosophy is great: you build neural networks as if they were Legos, creating layers, making connections among them in any (consistent) order you want, adding them to a network, and finally training it. In the long run, a package like this is needed to do modern machine learning because of its capability to create deep networks with a custom architecture.
As pretty as PyBrain may be, it has a huge Achilles heel: PYTHON IS SLOW! PyBrain is written in pure Python, and you will hit a dead end if you need wings. The truth is that large-scale neural nets are one of those real-world cases where you have a function with millions of parameters that requires optimisation by running through thousands of training cases. But you don't need to go to that extreme to hit the wall: just create a network with 2 or 3 hidden layers, each with about 5 neurons, and you will feel the pain of waiting many seconds for the console to print the answer. Because of this limitation, PyBrain is at best an educational (maybe not even scientific) package, since using it in a real-world scenario would be unreasonable.

PyBrain's philosophy is great, but PyBrain itself may not suit my purposes. That is why I decided to start CyBrain, a neural networks module inspired by PyBrain and written in Cython. For those who don't know what Cython is, I will just say it's an inch away from being the "perfect language". Formally, Cython is a superset of the Python language that compiles Pythonic code to optimised C. By superset I mean that (except for generators) every Python statement is a valid Cython statement; however, not all Cython statements are valid Python statements. The real deal is that Cython gives you the opportunity to mix Pythonic code and pseudo-C code in any way you want; specifically, Cython lets you write C TYPES!

When you write Cython code you feel you are connecting two foreign realms, and at first it is thrilling and confusing. The first thing you automatically do is test the speed; it's like driving a Ferrari: even if you don't like cars, you are bound to hit the accelerator. Cython is fast, C fast. The first bit is a little rough; since Cython is a compiled language, you have to arrange all the parts in a setup script. The documentation helps in this first stage, but I really took off after this 4-part YouTube tutorial from a guy at Enthought. Past that initial trial, relax and watch your Python code run 1.4x to 7x faster; then add some types and feel the wind of 100x+ speed!!!

Back to CyBrain: I just finished the basic parts of what you could call a minimum viable product. I don't know a lot about PyBrain's internals; while I did download the project since it's open source, looking at unknown code becomes boring after a few minutes unless you want to fix it. My main design inspiration came from a section in the docs where they teach you to create your own custom neurons by subclassing the Neuron class and overriding some functions; those functions were the ones that gave me the hints I needed.
In Cython you can do some nasty tricks like using pointers... pointers!!! This is heaven and hell at the same time. You have to malloc them (ahh!!!), but then you can insert them into C++ vectors (which Cython supports) and, for free, have plain variables like floats act like modern objects. This might not seem directly useful, but state sharing is very efficient for some applications: the weight-sharing technique becomes really easy with this.

Anyway, this was a long digression. I haven't compared speeds yet, but results seem promising. If you want to fork the code, go ahead; here is the GitHub link to CyBrain. Feedback is welcome.

Friday, December 27, 2013

Borg, Singleton and Unique Patterns in Python (BETA)

**Introduction under construction**
Here I implement and compare the 3 patterns:
Unique
#Unique Pattern
class Unique:
    #Define some static variables here
    x = 1
    @classmethod
    def init(cls):
        #Define any computation performed when assigning to a "new" object
        return cls
Singleton
#Singleton Pattern
class Singleton:

    __single = None 

    def __init__(self):
        if not Singleton.__single:
            #Your definitions here
            self.x = 1 
        else:
            raise RuntimeError('A Singleton already exists') 

    @classmethod
    def getInstance(cls):
        if not cls.__single:
            cls.__single = Singleton()
        return cls.__single
Borg
#Borg Pattern
class Borg:

    __monostate = None

    def __init__(self):
        if not Borg.__monostate:
            Borg.__monostate = self.__dict__
            #Your definitions here
            self.x = 1

        else:
            self.__dict__ = Borg.__monostate
Test
#SINGLETON
print "\nSINGLETON\n"
A = Singleton.getInstance()
B = Singleton.getInstance()

print "At first B.x = {} and A.x = {}".format(B.x,A.x)
A.x = 2
print "After A.x = 2"
print "Now both B.x = {} and A.x = {}\n".format(B.x,A.x)
print  "Are A and B the same object? Answer: {}".format(id(A)==id(B))


#BORG
print "\nBORG\n"
A = Borg()
B = Borg()

print "At first B.x = {} and A.x = {}".format(B.x,A.x)
A.x = 2
print "After A.x = 2"
print "Now both B.x = {} and A.x = {}\n".format(B.x,A.x)
print  "Are A and B the same object? Answer: {}".format(id(A)==id(B))


#UNIQUE
print "\nUNIQUE\n"
A = Unique.init()
B = Unique.init()

print "At first B.x = {} and A.x = {}".format(B.x,A.x)
A.x = 2
print "After A.x = 2"
print "Now both B.x = {} and A.x = {}\n".format(B.x,A.x)
print  "Are A and B the same object? Answer: {}".format(id(A)==id(B))
Output:
SINGLETON
At first B.x = 1 and A.x = 1
After A.x = 2
Now both B.x = 2 and A.x = 2

Are A and B the same object? Answer: True

BORG

At first B.x = 1 and A.x = 1
After A.x = 2
Now both B.x = 2 and A.x = 2

Are A and B the same object? Answer: False

UNIQUE

At first B.x = 1 and A.x = 1
After A.x = 2
Now both B.x = 2 and A.x = 2

Are A and B the same object? Answer: True
In my opinion, the Unique implementation is the easiest, then Borg, and finally Singleton, which needs an ugly total of two functions for its definition.