Meet self
In Python we have functions and methods.
Function definitions in Python look like this:
def sloganify(x):
return "{} or bust!".format(x)
And method definitions look like this:
class Person:
def sloganize(self, x):
return "{} or bust!".format(x)
Classes are buckets of functions
When we write Python code, we don’t really write methods; we write class statements which contain function definitions.
Class statements look like this:
class Person:
greeting = 'hello'
x = 4 + 3
which is syntactic sugar for something like this:
Person = type("Person", (), {'greeting': 'hello', 'x': 4 + 3})
So writing a class with a method
class Foo:
def bar(x, y, z):
return 7
is about the same as writing
def bar_function(x, y, z):
return 7
Foo = type("Foo", (), {'bar': bar_function})
del bar_function
We can also write this as
def bar_function(x, y, z):
return 7
Foo = type("Foo", (), {})
Foo.bar = bar_function
del bar_function
Method objects in Python
Despite all those examples trying to show that we write functions, not methods in Python, both function and method objects do exist:
>>> f = Foo()
>>> f.bar
<bound method bar_function of <__console__.Foo object at 0x113644128>>
>>> bar_function # if we forgo the `del bar_function` above
<function bar_function at 0x1135e66a8>
Besides having different names, these two different versions of the function we wrote take different numbers of arguments!
>>> import inspect
>>> inspect.signature(bar_function)
<Signature (x, y, z)>
>>> inspect.signature(f.bar)
<Signature (y, z)>
Is this some kind of class definition-time transformation? No, the function is indeed what gets stored in the class!
>>> Foo.bar
<function bar_function at 0x1135e66a8>
Besides, the method objects we get from different instances are completely different:
>>> f.bar
<bound method bar_function of <__console__.Foo object at 0x113644128>>
>>> g.bar
<bound method Foo.bar of <__console__.Foo object at 0x10a41a710>>
Maybe the method object is created when we create an instance of the class, and stored on the instance object!
>>> vars(f)
{}
Hm, I don’t see it anywhere…
Partial Application
The “one less argument” thing is familiar to people who use Python classes and have seen this perplexing error message:
>>> f.bar(1, 2, 3)
Traceback (most recent call last):
File "<input>", line 1, in <module>
f.bar(1, 2, 3)
TypeError: bar() takes 3 positional arguments but 4 were given
But I did pass three arguments! Perhaps you see where this is going. A method is a version of a function that takes one less argument, an example of the technique of “partial application.” Say we have the general rectangle drawing function below:
def draw_rect(color, width, height, x, y):
"""Draws a rectangle"""
...
If we wanted to make a specialized version of this function for drawing small blue square, we might write a new function that calls this old version:
def draw_small_blue_square(x, y):
"""Draws a small blue square at the passed coordinates"""
draw_rect('blue', 2, 2, x, y)
To be more concise and to avoid the extra layer of call stack in our error message stack traces we could use partial application instead:
>>> import functools
>>> draw_large_red_square = functools.partial(draw_rect, 'red', 100, 100)
In this version we didn’t write a docstring or a new signature ourselves.
inspect.signature
is smart enough to tell how to use this function:
>>> inspect.signature(draw_large_red_square)
<Signature (x, y)>
but unfortunately our error message still refers to the original version of the function:
>>> draw_large_red_square(1, 2, 3)
Traceback (most recent call last):
File "<input>", line 1, in <module>
draw_large_red_square(1, 2, 3)
TypeError: drawRect() takes 5 positional arguments but 6 were given
Partial application can be really convenient!
>>> force_print = functools.partial(print, file=sys.stderr, flush=True)
>>> intdict = functools.partial(collections.defaultdict, int)
Based on the similarity of these error messages, you might thing that this
method version of the function is implemented with by using
functools.partial
to partially apply the
instance to the first parameter of the original function object. Good guess!
I looked at the CPython code and it’s not. But the behavior is similar, it’s sort
of a special, written-in-C, optimized version of this.
Method binding rocks
So that’s what’s going on: Python is creating a method object that takes one less argument (and now holds a reference to the instance from which we got it) from our original Python function.
So method binding lets us write what could have been the clunky
f = Foo()
Foo.bar(f, 2, 3)
as
f = Foo()
f.bar(2, 3)
with both namespacing (the f
instance looks up to its class to find
this function) and method binding (we don’t have to re-specify the first
argument of f
.
This is creation of a bound method at attribute lookup time isn’t the only way
to make methods work: in JavaScript, to transformation of the function occurs
at function lookup time. To which object this
(JavaScript’s self
) refers
depends on the syntax used to call it instead. This is often undesirable,
so the function is often transformed with bind
.
Why is Python’s behavior preferable?
>>> distances = {'5k': 5000, 'mile': 1609.34, 'marathon': 42195}
>>> distances.keys().sort(key=distances.get)
>>> import threading
>>> t = threading.thread(target=crawler.crawl)
>>> t.start()
Because callbacks - one of the words used to describe passing a function as a value.
Sometimes Python apis expect a something callable as a passed argument.
We could wrap it in another layer of function, .sort(lambda x: distance.get(x))
in the first example above, but it’s nice not to have to.
That pattern is more common in JavaScript:
> setTimeout(foo.bar) # bar won't be called correctly
> setTimeout(function(){ foo.bar() });
How it happens: the descriptor protocol
So method binding is cool. And a bit mysterious: somehow attribute lookup is
more complicated that we suspected, because it’s somehow causing a method
to be created!
It seems like looking for the attribute
bar
on an instance f
would require traversing a chain like
instance -> class -> parent class -> ... -> object
and returning the first matching attribute on one of these objects. I remember being stunned to discover this simple model did not reflect reality.
In fact the value returned from this process may instead be the result of arbitrary code by implementing something known as the descriptor protocol. Although the code for creating methods is written in C, the power of descriptors is available to use in Python, so let’s explore it from that direction.
If an attribute implements the descriptor protocol, it will have
a __get__()
method defined which will be called with information
about the lookup and the result returned as the result of the
attribute lookup.
>>> class ImADescriptor(): # because I have a __get__ method
... def __get__(self, instance, owner):
... return 7
...
>>> class Foo():
... bar = ImADescriptor()
...
>>> f = Foo()
>>> f.bar
7
The __get__
method is called by Python infernal (whoops, internal)
attribute lookup logic in ceval.c
with a reference to the object that the attribute lookup started on so
it can be used to create our bound method.
It turns out that functions implement this descriptor protocol!
Take a look at the __get__
method on a function:
>>> drawRect.__get__
<method-wrapper '__get__' of function object at 0x114414268>
If we were writing out that method in Python, it might look something like
class Function():
def __get__(self, instance, owner):
return functools.partial(self, instance)
So that’s how methodization works in Python!