Preface
Recently I felt that Python was too “simple”, so I said to my teacher, Chuan Ye: “I think Python is the simplest language in the world!” Chuan Ye smiled with contempt (inner thoughts: Naive! As a Python developer, I must give you some life experience, otherwise you will never know how big the world is!). So Chuan Ye gave me a set of questions with a full score of 100, and this article records the pitfalls I ran into while working through them.
1. Generator expressions
Description
The following code will report an error, why?
class A(object):
    x = 1
    gen = (x for _ in range(10))  # Python 2: xrange(10)

if __name__ == "__main__":
    print(list(A.gen))
Answer
This is a variable scope problem. In gen = (x for _ in range(10)), gen is a generator, and the body of a generator expression has its own scope, which is isolated from the class scope: names defined in a class body are not visible to scopes nested inside it. Therefore, consuming the generator fails with NameError: name 'x' is not defined. So what is the solution? The answer is: use a lambda.
class A(object):
    x = 1
    gen = (lambda x: (x for _ in range(10)))(x)  # Python 2: xrange(10)

if __name__ == "__main__":
    print(list(A.gen))
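As a minimal sketch of the scoping rule (Python 3 syntax; the class names Broken, OuterOnly and Fixed are made up for illustration): only the outermost iterable of a generator expression is evaluated in the class body, while everything else runs in a nested scope that skips the class namespace.

```python
class Broken(object):
    x = 1
    # The element expression `x` is evaluated lazily in the generator's own
    # scope, which cannot see names defined in the class body.
    gen = (x for _ in range(3))


class OuterOnly(object):
    n = 3
    # The outermost iterable `range(n)` is evaluated immediately in the
    # class body, so `n` resolves without trouble.
    gen = (i for i in range(n))


class Fixed(object):
    x = 1
    # Passing x as an argument binds it inside the lambda's scope.
    gen = (lambda x: (x for _ in range(3)))(x)


try:
    list(Broken.gen)
    broken_error = None
except NameError as exc:
    broken_error = exc

outer_values = list(OuterOnly.gen)
fixed_values = list(Fixed.gen)

print(repr(broken_error))
print(outer_values)
print(fixed_values)
```

The lambda works because function parameters are ordinary local variables, and nested scopes can close over those.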
2. Decorator
Description
I want to write a class decorator to measure the running time of a function or method:
import time

class Timeit(object):
    def __init__(self, func):
        self._wrapped = func

    def __call__(self, *args, **kws):
        start_time = time.time()
        result = self._wrapped(*args, **kws)
        print("elapsed time is %s " % (time.time() - start_time))
        return result
This decorator can be run on normal functions:
@Timeit
def func():
    time.sleep(1)
    return "invoking function func"

if __name__ == '__main__':
    func()  # output: elapsed time is 1.00044410133
But an error will be reported when running on the method, why?
class A(object):
    @Timeit
    def func(self):
        time.sleep(1)
        return 'invoking method func'

if __name__ == '__main__':
    a = A()
    a.func()  # Boom!
If I insist on using class decorators, how should I modify it?
Answer
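After the class decorator is applied, A.func is no longer a function but a Timeit instance. A Timeit instance does not implement the descriptor protocol, so accessing a.func does not bind a to it, and the instance is never passed to __call__ when the method is called. The wrapped function is therefore invoked without self and raises a TypeError. So what is the solution? Descriptors to the rescue: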
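class Timeit(object):
    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        print('invoking Timer')

    def __get__(self, instance, owner):
        # Called on attribute access through an instance; pass that instance
        # on as the first argument so that self reaches the wrapped method.
        return lambda *args, **kwargs: self.func(instance, *args, **kwargs)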
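The answer above only shows the __get__ hook. As a hedged sketch of how the timing logic and the descriptor hook might be combined, the version below routes method calls back through __call__ using functools.partial; that choice is an assumption of this sketch, not the original author's code.

```python
import functools
import time


class Timeit(object):
    """Class decorator that also works on methods (illustrative sketch)."""

    def __init__(self, func):
        self._wrapped = func

    def __call__(self, *args, **kwargs):
        start_time = time.time()
        result = self._wrapped(*args, **kwargs)
        print("elapsed time is %s" % (time.time() - start_time))
        return result

    def __get__(self, instance, owner):
        if instance is None:
            # Accessed on the class itself: there is nothing to bind.
            return self
        # Pre-fill the instance as the first positional argument, which is
        # what a plain function's __get__ does when it builds a bound method.
        return functools.partial(self, instance)


@Timeit
def func():
    return "invoking function func"


class A(object):
    @Timeit
    def func(self):
        return "invoking method func"


function_result = func()
method_result = A().func()
```

Because the partial object wraps the Timeit instance itself rather than the raw function, the elapsed time is printed for both the plain function and the method.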
3. Python calling mechanism
Description
We know that the __call__ method can be used to overload the call operator (the parentheses). Think the problem is that simple? Naive!
class A(object):
    def __call__(self):
        print("invoking __call__ from A!")

if __name__ == "__main__":
    a = A()
    a()  # output: invoking __call__ from A!
Now we can see that a() seems to be equivalent to a.__call__(). Looks very easy, right? Okay, now I am going to tempt fate and write the following code:
a.__call__ = lambda: "invoking __call__ from lambda"
a.__call__()
# output:invoking __call__ from lambda
a()
# output:invoking __call__ from A!
Could you please explain why a() does not call a.__call__()? (This question was raised by USTC senior Wang Zibo.)
Answer
The reason is that in Python, implicit calls to the built-in special methods of new-style classes are looked up on the type and bypass the instance attribute dictionary. For details, see what the official Python documentation says about this situation:
For new-style classes, implicit invocations of special methods are only guaranteed to work correctly if defined on an object’s type, not in the object’s instance dictionary. That behaviour is the reason why the following code raises an exception (unlike the equivalent example with old-style classes):
The official documentation also gives an example:
class C(object):
    pass

c = C()
c.__len__ = lambda: 5
len(c)
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# TypeError: object of type 'C' has no len()
Back to our example. When we execute a.__call__ = lambda: "invoking __call__ from lambda", we do add a new item with the key __call__ to a.__dict__. But when we execute a(), the call involves the implicit invocation of a special method, so the lookup does not search a.__dict__ but type(a).__dict__. Therefore, the situation described above occurs.
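The lookup rule can be checked with a small sketch. It mirrors the example above but returns strings instead of printing, so the results are easy to compare; the variable names are illustrative only.

```python
class A(object):
    def __call__(self):
        return "from class A"


a = A()
a.__call__ = lambda: "from instance dict"

# Explicit attribute access follows the normal lookup and finds the
# entry stored in a.__dict__.
explicit = a.__call__()

# The implicit call a() looks __call__ up on type(a) and ignores a.__dict__.
implicit = a()

# Roughly what a() does behind the scenes.
via_type = type(a).__call__(a)

# Patching the class, not the instance, does change the implicit call.
A.__call__ = lambda self: "patched on the class"
after_patch = a()
```

So if an implicit special-method call really has to be changed at runtime, the patch has to go on the class rather than on a single instance.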
4. Descriptors
Description
I want to write an Exam class whose attribute math is an integer in [0,100]. If the value assigned is not in this range, an exception is thrown. I decided to use a descriptor to implement this requirement.
class Grade(object):
    def __init__(self):
        self._score = 0

    def __get__(self, instance, owner):
        return self._score

    def __set__(self, instance, value):
        if 0 <= value <= 100:
            self._score = value
        else:
            raise ValueError('grade must be between 0 and 100')

class Exam(object):
    math = Grade()

    def __init__(self, math):
        self.math = math

if __name__ == '__main__':
    niche = Exam(math=90)
    print(niche.math)
    # output: 90
    snake = Exam(math=75)
    print(snake.math)
    # output: 75
    snake.math = 120
    # output: ValueError: grade must be between 0 and 100
Everything looks fine. But there is a huge problem here. Try to figure out what it is.
To solve this problem, I rewrote the Grade descriptor as follows:
class Grade(object):
    def __init__(self):
        self._grade_pool = {}

    def __get__(self, instance, owner):
        return self._grade_pool.get(instance, None)

    def __set__(self, instance, value):
        if 0 <= value <= 100:
            self._grade_pool[instance] = value
        else:
            raise ValueError('grade must be between 0 and 100')
But this will lead to bigger problems. How can I solve this problem?
Answer
1. The first problem is actually very simple. If you run print(niche.math) once more, you will find that the output is 75 instead of 90 (the rejected assignment of 120 never changed the stored value). So why is this? It starts with Python's attribute lookup mechanism. When we access an attribute, Python first searches the instance's __dict__; if the attribute is not found there, it searches the class dictionary, then the dictionaries of the parent classes, until nothing is left to search. Now back to our problem. In our class Exam, an access to self.math first searches the instance's __dict__, finds nothing, goes up one level to the class Exam, finds math there, and returns it. This means that every operation on self.math is an operation on the class variable math, and because the Grade object stores the score on itself, all Exam instances share a single value. This is what causes the variable pollution. So how to solve this? Many comrades may say: isn't it enough to have __set__ write the value into the dictionary of the specific instance?
So is this possible? The answer is, obviously, that this alone cannot work. The reason involves the mechanism of Python descriptors. A descriptor is a class that implements the descriptor protocol, which consists of the methods __get__, __set__ and __delete__, plus the __set_name__ method that is new in Python 3.6. Descriptors that define __set__ or __delete__ are Data descriptors, while those that only define __get__ are Non-Data descriptors. So what is the difference? As mentioned earlier, when we access an attribute, Python searches the instance's __dict__ first and then the class dictionary, the parent class dictionaries, and so on. However, this does not take descriptors into account. With descriptors taken into account, the correct statement is as follows. If the attribute found in the class dictionary is a Data descriptor, the descriptor protocol is invoked unconditionally, whether or not the attribute exists in the instance dictionary. If the attribute found in the class dictionary is a Non-Data descriptor, the value in the instance dictionary is used first without triggering the descriptor protocol; only if the attribute does not exist in the instance dictionary is the descriptor protocol triggered. Back to the previous question: even if __set__ writes the value into the instance dictionary, math in the class dictionary is still a Data descriptor. Therefore, when we access the attribute, the descriptor protocol is still triggered, and __get__ still returns the shared self._score.
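The precedence rules described above can be seen in a short sketch. The classes NonData, Data and Holder are hypothetical names introduced only for this illustration.

```python
class NonData(object):
    """Defines only __get__, so an instance dictionary entry wins."""

    def __get__(self, instance, owner):
        return "from NonData.__get__"


class Data(object):
    """Defines __get__ and __set__, so it overrides the instance dictionary."""

    def __get__(self, instance, owner):
        return "from Data.__get__"

    def __set__(self, instance, value):
        # Deliberately store under the same name in the instance dictionary.
        instance.__dict__["d"] = value


class Holder(object):
    nd = NonData()
    d = Data()


h = Holder()

before = h.nd                            # no instance entry yet: __get__ runs
h.__dict__["nd"] = "from instance dict"
after = h.nd                             # the instance entry shadows NonData

h.d = "stored value"                     # goes through Data.__set__
data_read = h.d                          # still goes through Data.__get__
```

Even though "stored value" sits in h.__dict__ under the key d, reading h.d never sees it directly, which is exactly why writing to the instance dictionary inside __set__ is not enough unless __get__ reads from the same place.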
2. The improved approach relies on the uniqueness of dict keys to bind each value to its instance, but it also introduces a memory leak. So why does a memory leak occur? First, let's review the characteristics of dict. The most important one is that any hashable object can be a key. dict uses the key's hash value, together with an equality check when two hashes collide, to ensure that keys are not repeated. At the same time (pay attention, here comes the point), a dict holds a strong reference to each of its keys, which increases the reference count of the corresponding object and may prevent it from ever being garbage collected, resulting in a memory leak. So how to solve this problem? There are two methods. The first one:
import weakref

class Grade(object):
    def __init__(self):
        self._grade_pool = weakref.WeakKeyDictionary()

    def __get__(self, instance, owner):
        return self._grade_pool.get(instance, None)

    def __set__(self, instance, value):
        if 0 <= value <= 100:
            self._grade_pool[instance] = value
        else:
            raise ValueError('grade must be between 0 and 100')
The keys of a WeakKeyDictionary, provided by the weakref library, are weak references to the objects. They do not increase the reference count, so they do not cause a memory leak. Similarly, if we want to avoid strong references from the values to objects, we can use WeakValueDictionary.
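To make the difference between the two kinds of dictionary visible, here is a small sketch. Student is a hypothetical stand-in for any object used as a key, and the behaviour shown assumes CPython's reference counting.

```python
import gc
import weakref


class Student(object):
    """Hypothetical stand-in for an object that owns a graded attribute."""


strong_pool = {}
weak_pool = weakref.WeakKeyDictionary()

s1 = Student()
s2 = Student()
strong_pool[s1] = 90
weak_pool[s2] = 90

# Weak references that let us observe whether each object is still alive.
s1_alive = weakref.ref(s1)
s2_alive = weakref.ref(s2)

del s1, s2
gc.collect()  # not required under CPython's refcounting, but harmless

strong_kept = s1_alive() is not None  # the dict key keeps s1 alive
weak_released = s2_alive() is None    # nothing strong refers to s2 any more
weak_pool_size = len(weak_pool)       # the entry vanished with its key
```

The plain dict keeps its key alive for as long as the dict itself lives, and a descriptor stored on a class typically lives for the whole program.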
The second method: Python 3.6 implemented the PEP 487 proposal, which adds a new protocol method for descriptors, __set_name__, that we can use to bind the value to the corresponding object:
class Grade(object):
    def __get__(self, instance, owner):
        return instance.__dict__[self.key]

    def __set__(self, instance, value):
        if 0 <= value <= 100:
            instance.__dict__[self.key] = value
        else:
            raise ValueError('grade must be between 0 and 100')

    def __set_name__(self, owner, name):
        self.key = name
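To check that the __set_name__ version really keeps instances apart, here is a short usage sketch for Python 3.6 or later. It repeats the descriptor so that the block runs on its own, and it adds an instance is None guard, which is an assumption of this sketch and not part of the original answer.

```python
class Grade(object):
    def __set_name__(self, owner, name):
        # Called automatically when the owning class is created (3.6+).
        self.key = name

    def __get__(self, instance, owner):
        if instance is None:
            # Accessed on the class itself (Exam.math): return the descriptor.
            return self
        return instance.__dict__[self.key]

    def __set__(self, instance, value):
        if not 0 <= value <= 100:
            raise ValueError("grade must be between 0 and 100")
        instance.__dict__[self.key] = value


class Exam(object):
    math = Grade()

    def __init__(self, math):
        self.math = math


niche = Exam(math=90)
snake = Exam(math=75)

try:
    snake.math = 120
    rejected = False
except ValueError:
    rejected = True
```

Each value now lives in the instance's own __dict__, so it is released together with the instance and no separate pool is needed.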
This question involves a lot of things. Here are some reference links: invoking-descriptors, Descriptor HowTo Guide, PEP 487, What's New in Python 3.6.
5. Python inheritance mechanism
Description
Try to find the output of the following code.
class Init(object):
    def __init__(self, value):
        self.val = value

class Add2(Init):
    def __init__(self, val):
        super(Add2, self).__init__(val)
        self.val += 2

class Mul5(Init):
    def __init__(self, val):
        super(Mul5, self).__init__(val)
        self.val *= 5

class Pro(Mul5, Add2):
    pass

class Incr(Pro):
    csup = super(Pro)

    def __init__(self, val):
        self.csup.__init__(val)
        self.val += 1

p = Incr(5)
print(p.val)
Answer
The output is 36. For details, please refer to New-style Classes and multiple-inheritance.
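The figure 36 can be traced through the method resolution order (MRO). The sketch below repeats the classes from the question so that it runs on its own; mro_names is a name introduced here, and the comments describe one reading of how the call chain unfolds.

```python
class Init(object):
    def __init__(self, value):
        self.val = value


class Add2(Init):
    def __init__(self, val):
        super(Add2, self).__init__(val)
        self.val += 2


class Mul5(Init):
    def __init__(self, val):
        super(Mul5, self).__init__(val)
        self.val *= 5


class Pro(Mul5, Add2):
    pass


class Incr(Pro):
    # One-argument super() gives an unbound super object. It acts as a
    # descriptor, so self.csup behaves like super(Pro, self).
    csup = super(Pro)

    def __init__(self, val):
        self.csup.__init__(val)
        self.val += 1


mro_names = [cls.__name__ for cls in Incr.__mro__]
p = Incr(5)

# Calls go down the MRO after Pro: Mul5 -> Add2 -> Init.
# On the way back: Init sets 5, Add2 adds 2 (7), Mul5 multiplies by 5 (35),
# and Incr finally adds 1 (36).
print(mro_names)
print(p.val)
```

The key point is that super(Mul5, self) does not mean "the parent of Mul5" but "the next class after Mul5 in the MRO of type(self)", which here is Add2.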
6. Python Special Methods
Description
I wrote a class that implements the singleton pattern by overriding the __new__ method.
class Singleton(object):
    _instance = None

    def __new__(cls, *args, **kwargs):
        if cls._instance:
            return cls._instance
        cls._instance = cv = object.__new__(cls, *args, **kwargs)
        return cv

sin1 = Singleton()
sin2 = Singleton()
print(sin1 is sin2)
# output: True
Now I have a bunch of classes that I want to implement as singletons, so I plan to write a metaclass that will allow code reuse:
class SingleMeta(type):
    def __init__(cls, name, bases, dict):
        cls._instance = None
        __new__o = cls.__new__

        def __new__(cls, *args, **kwargs):
            if cls._instance:
                return cls._instance
            cls._instance = cv = __new__o(cls, *args, **kwargs)
            return cv

        cls.__new__ = __new__

class A(object):
    __metaclass__ = SingleMeta

a1 = A()  # Boom!
Oops, this is infuriating. Why does this raise an error? I clearly used the same approach to patch __getattribute__ before. The following code captures all attribute accesses and prints the arguments:
class TraceAttribute(type):
    def __init__(cls, name, bases, dict):
        __getattribute__o = cls.__getattribute__

        def __getattribute__(self, *args, **kwargs):
            print('__getattribute__:', args, kwargs)
            return __getattribute__o(self, *args, **kwargs)

        cls.__getattribute__ = __getattribute__

class A(object):  # In Python 3: class A(object, metaclass=TraceAttribute):
    __metaclass__ = TraceAttribute
    a = 1
    b = 2

a = A()
a.a
# output: __getattribute__: ('a',) {}
a.b
Explain why patching __getattribute__ succeeds, but patching __new__ fails. If I insist on using a metaclass to patch __new__ in order to implement the singleton pattern, how should I modify the code?
Answer
In fact, this is the most annoying point. The __new__ in a class is a staticmethod, so when replacing it, the replacement must also be a staticmethod. The answer is as follows:
class SingleMeta(type):
    def __init__(cls, name, bases, dict):
        cls._instance = None
        __new__o = cls.__new__

        @staticmethod
        def __new__(cls, *args, **kwargs):
            if cls._instance:
                return cls._instance
            cls._instance = cv = __new__o(cls, *args, **kwargs)
            return cv

        # Install the patched version, not the saved original.
        cls.__new__ = __new__

class A(object):  # In Python 3: class A(object, metaclass=SingleMeta):
    __metaclass__ = SingleMeta

print(A() is A())  # output: True
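The code above uses the Python 2 __metaclass__ attribute, which Python 3 ignores. A minimal Python 3 sketch of the same idea, assuming the metaclass keyword syntax, might look like this; original_new and namespace are names chosen for this sketch.

```python
class SingleMeta(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._instance = None
        original_new = cls.__new__

        # __new__ is treated as a static method when written in a class
        # body, so the replacement is wrapped the same way.
        @staticmethod
        def __new__(cls, *args, **kwargs):
            if cls._instance is None:
                cls._instance = original_new(cls, *args, **kwargs)
            return cls._instance

        cls.__new__ = __new__


class A(object, metaclass=SingleMeta):
    pass


same = A() is A()
print(same)
```

Python 3 no longer has unbound methods, so the failure from the question appears to be specific to Python 2; the staticmethod wrapper is kept here because it matches how __new__ behaves when defined normally.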
Conclusion
Thanks to my master for a set of questions that opened the door to a new world for me. Well, I can't mention him on the blog, so I can only express my gratitude here. To be honest, Python's dynamic nature lets us use a lot of black magic to implement some very comfortable features. Of course, this also raises the bar for how well we must know the language's features and pitfalls. I hope all Pythoners will read the official documentation when they have nothing else to do, and soon reach the realm where showing off comes as naturally as the wind and never leaves their side.