Hello, Python!
I didn’t believe it when a professor told us that we would pick up Python at some time in college. Well, here we are.
1. Whetting Your Appetite
The name Python is from a BBC show and has nothing to do with reptiles!
2. Using the Python Interpreter
The command-line arguments are exposed in the list sys.argv, which is available after executing import sys. The list always contains at least one element. sys.argv[0] depends on how Python is invoked:
- When Python is invoked by the bare command python, sys.argv[0] is an empty string.
- When Python is invoked with python -c <command>, sys.argv[0] is '-c'.
- When Python is invoked with python - (which means to read the script from standard input), sys.argv[0] is '-'.
- When Python is invoked with python -m <module_name>, sys.argv[0] is the full name of the located module.
- When Python is invoked with python <file_name>.py, sys.argv[0] is <file_name>.py.
Arguments that follow the script name, the -c command, or the -m module are not consumed by the interpreter; they are left in sys.argv, starting from index 1.
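A minimal sketch of reading sys.argv. The helper name describe_argv is made up for illustration and is not part of the standard library.

```python
import sys

def describe_argv(argv):
    """Split an argv-style list into the invocation name and the remaining arguments."""
    program, *rest = argv
    return program, rest

# sys.argv has at least one element, even if that element is an empty string.
program, rest = describe_argv(sys.argv)
print(f"argv[0] = {program!r}, remaining arguments = {rest}")
```

Running this as python script.py a b would be expected to report 'script.py' as argv[0] and ['a', 'b'] as the rest.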
3. An Informal Introduction to Python
Using Python as a Calculator
Division (/
) always returns a float. To get an integer, use floor division (//
).
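A quick illustration of the two division operators. One caveat worth noting: floor division only yields an int when both operands are ints.

```python
true_div = 7 / 2       # 3.5
exact_div = 8 / 2      # 4.0, still a float
floor_div = 7 // 2     # 3
negative = -7 // 2     # -4, floor division rounds toward negative infinity
float_floor = 7.0 // 2 # 3.0, a float operand gives a float result
```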
4. More Control Flow Tools
for
Statements
Python’s for
statement is used to iterate over the items of a given sequence.
To modify a sequence while looping over it, it is useful to loop over a copy of the sequence or create a new collection.
users = {'Amy': 'active', 'Bob': 'inactive', 'Cindy': 'active'}
for user, status in users.copy().items():
    if status == 'inactive':
        del users[user]
# OR
active_users = {}
for user, status in users.items():
    if status == 'active':
        active_users[user] = status
The range()
Function
range()
behaves like a list, but it isn’t. It is an object which returns the successive numbers when you iterate over it, but it doesn’t make an actual list. Such an object is called an iterable.
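A short sketch of what "behaves like a list, but isn't" means in practice: a range supports len(), indexing and membership tests without storing its elements, and list() materializes it when a real list is needed.

```python
r = range(1_000_000)

length = len(r)            # 1000000
tenth = r[10]              # 10, computed on demand
contains = 999_999 in r    # True
small = list(range(0, 10, 3))  # [0, 3, 6, 9]
```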
else
Clauses in Loops
Loop statements (for
and while
) may have an else
clause. It is executed when the loop terminates normally (not by a break
statement).
# Find prime numbers below 100
from math import floor, sqrt
for n in range(2, 100):
    for x in range(2, floor(sqrt(n)) + 1):
        if n % x == 0:
            break
    else:
        # The inner loop finished without a break, so n has no divisor
        print(n)
match
Statements
A simple usage of match
statements:
def http_error(status):
    match status:
        case 400:
            return 'Bad request'
        case 401 | 403:
            return 'Not allowed'
        case 418:
            return "I'm a teapot"
        case _: # wildcard
            return "Something's wrong"
Patterns (the expressions after case
) may contain new variables, which are bound to the subject (the variable after match
) if the pattern is matched. For example:
# point is an (x, y) tuple
match point:
    case (0, 0):
        print("Origin")
    case (0, y):
        print(f"Y={y}")
    case (x, 0):
        print(f"X={x}")
    case (x, y):
        print(f"X={x}, Y={y}")
    case _:
        raise ValueError("Not a point")
The pattern can be a constructor-like expression:
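from dataclasses import dataclass

# A dataclass generates __init__ and __match_args__ for the annotated fields
@dataclass
class Point:
    x: int
    y: int

def where_is(point):
    match point:
        case Point(x=0, y=0):
            return 'origin'
        case Point(x=0, y=y):
            return f'Y={y}'
        case Point(x=x, y=0):
            return f'X={x}'
        case Point():
            return 'Somewhere else'
        case _:
            return 'Not a point'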
The pattern may contain an if
clause, known as a “guard”, to further restrict the matching:
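# Positional patterns such as Point(x, y) assume that Point defines
# __match_args__ = ('x', 'y'), for example by being a dataclass
match point:
    case Point(x, y) if x == y:
        print(f'Y=X at {x}')
    case Point(x, y):
        print('Not on the diagonal')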
Patterns can be sequences and support extended unpacking. For example, the pattern (x, y, *rest) matches a sequence of at least two items and binds the remaining items to rest. *_ may be used to indicate that extra items don’t have to be bound.
Patterns can also be dictionaries and support extended unpacking (e.g., {'age': 15, **rest}
). Extra keys are automatically ignored, so **_
is not needed.
Subpatterns can be captured using the as
keyword. For example, (Point(x1, y1), Point(x2, y2) as p2)
will capture the second element of the input as p2
.
Patterns can be named constants, but these must be dotted names since they will not be interpreted as capture variables:
from enum import Enum
class Color(Enum):
    RED = 'red'
    BLUE = 'blue'
color = Color.RED # example subject
match color:
    case Color.RED:
        print('Roses')
    case Color.BLUE:
        print('Violets')
Defining Functions
When a function is executed, a new symbol table for the local variables in the function is created.
In a function, all variable assignments store the value in the local symbol table; variable references first look in the local symbol table, then in the symbol table of enclosing functions, then in the global symbol table, and finally in the table of built-in names.
If we want to assign global variables or variables in enclosing functions, we use global
or nonlocal
to name them first.
Arguments are passed by value, but the values are object references.
Functions that do not return values explicitly return None
.
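A small sketch of what "passed by value, where the value is an object reference" implies: mutating the argument is visible to the caller, while rebinding the parameter name is not. The function names are illustrative only.

```python
def append_item(items):
    items.append(4)   # mutates the object that the caller also references

def rebind(items):
    items = [0]       # rebinds the local name only; the caller is unaffected

numbers = [1, 2, 3]
append_item(numbers)
result = rebind(numbers)  # no explicit return, so result is None

print(numbers)  # [1, 2, 3, 4]
print(result)   # None
```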
More on Defining Functions
Default Argument Values
Default values provided in function declarations are only evaluated once:
def f(a, L=[]):
    L.append(a)
    return L
print(f(1))
print(f(2))
This will print
[1]
[1, 2]
If you don’t want to share the default across function calls, you can use None as a sentinel and create the list inside the function:
def f(a, L=None):
    if L is None:
        L = []
    L.append(a)
    return L
Keyword Arguments
Keyword arguments are passed with their parameter names. They should follow positional arguments, which are passed without parameter names and are bound to parameters according to their positions.
If a final formal parameter of the form *name
is present, it will receive a tuple containing the positional arguments beyond the parameter list. If a formal parameter of the form **name
is present, it will receive a dictionary containing the keyword arguments except for those corresponding to a formal parameter.
For example:
def cheeseshop(kind, *args, **kwargs):
    print('Kind:', kind)
    for arg in args:
        print(arg)
    print('=' * 40)
    for kwarg, value in kwargs.items():
        print(kwarg, ': ', value, sep='')
cheeseshop('Limburger', 'Onion', 'Cheese', 'Ham', client='Alice', shopkeeper='Bob')
The output:
Kind: Limburger
Onion
Cheese
Ham
========================================
client: Alice
shopkeeper: Bob
By default, arguments can be passed by position or explicitly by keyword. You can specify how each argument can be passed by using /
and *
:
def f(pos1, pos2, /, pos_or_kwd, *, kwd1, kwd2):
      --+--------    -+--------     -----+----
        |             |                  |
        |        Positional or keyword   |
        |                                +-- Keyword only
        +-- Positional only
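To see the markers in action, here is a sketch that calls such a function correctly and incorrectly. The function body is invented for the example; only the signature comes from the notes above.

```python
def f(pos1, pos2, /, pos_or_kwd, *, kwd1, kwd2):
    return (pos1, pos2, pos_or_kwd, kwd1, kwd2)

ok = f(1, 2, 3, kwd1=4, kwd2=5)
also_ok = f(1, 2, pos_or_kwd=3, kwd1=4, kwd2=5)

try:
    # pos1 and pos2 are positional-only, so naming them is an error
    f(pos1=1, pos2=2, pos_or_kwd=3, kwd1=4, kwd2=5)
except TypeError as exc:
    positional_error = str(exc)

try:
    # kwd1 and kwd2 are keyword-only, so passing them by position is an error
    f(1, 2, 3, 4, 5)
except TypeError as exc:
    keyword_error = str(exc)
```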
Unpacking Argument Lists
We can use the *-operator to unpack the arguments out of a list or a tuple:
args = [3, 6]
list(range(*args)) # => [3, 4, 5]
Or the **-operator to unpack keyword arguments out of a dictionary:
def total(*, price, quantity):
    return price * quantity
d = {"price": 2, "quantity": 5}
total(**d) # => 10
Lambda Expressions
def make_incrementor(n):
    return lambda x: x + n
f = make_incrementor(42)
f(12) # => 54
Lambda expressions can reference variables from the containing scope.
Docstring
The docstring is the string literal right after the function signature. It can be accessed at function_name.__doc__
.
5. Data Structures
More on Lists
>>> dir(list)
[..., 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
A design principle of mutable data structures in Python is that methods which only modify the object in place (such as insert, remove or sort) return None. Therefore, calls to these methods cannot be chained.
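A short sketch of the consequence: an in-place method hands back None, while the built-in sorted() returns a new list that can be used in a larger expression.

```python
numbers = [3, 1, 2]

in_place_result = numbers.sort()        # sorts numbers itself, returns None
new_list = sorted([3, 1, 2])[::-1]      # sorted() returns a list, so slicing works

print(in_place_result)  # None
print(numbers)          # [1, 2, 3]
print(new_list)         # [3, 2, 1]
```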
Using Lists as Queues
It is possible to use a list as a queue with l.pop(0) and l.insert(0, x), but this is inefficient because all the other elements have to be shifted by one. Instead, use collections.deque:
from collections import deque
queue = deque(['Amy', 'Bob', 'Cindy'])
queue.popleft()
queue.appendleft('Alice')
List Comprehensions
There’s a map()
function in Python:
squares = list(map(lambda x: x**2, range(10)))
We can use the equivalent list comprehension:
squares = [x**2 for x in range(10)]
A list comprehension contains a for
clause, then zero or more for
or if
clauses. The order of the clauses is the same as if in a normal for
loop:
[(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
# OR
combs = []
for x in [1,2,3]:
    for y in [3,1,4]:
        if x != y:
            combs.append((x, y))
Note that the tuple (x, y)
must be parenthesized.
Nested List Comprehensions
E.g., to transpose a matrix:
matrix = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
transposed = [[row[i] for row in matrix] for i in range(len(matrix[0]))]
The del
Statement
The del
statement can be used to remove slices from a list or clear the entire list:
a = [0, 1, 2, 3, 4, 5]
del a[2:4]
a # => [0, 1, 4, 5]
del a[:]
a # => []
del
can also be used to delete entire variables:
del a
After that, referencing a
will result in an error.
Tuples and Sequences
Tuples are immutable and, therefore, do not support item assignment. They usually contain a heterogeneous sequence of elements often accessed via unpacking or indexing. Lists are mutable, and their elements are usually homogeneous and accessed by iterating over the list.
Sets
To create an empty set, use set(), not {}, because {} creates an empty dictionary.
Dictionaries
A dictionary’s keys must be immutable; if a tuple contains any mutable object either directly or indirectly, it cannot be used as a key.
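A sketch of the tuple rule: a tuple of immutable values works as a key, while a tuple that contains a list is rejected with a TypeError. The sample data is made up.

```python
locations = {(40.7, -74.0): 'New York'}   # tuple of floats: usable as a key

try:
    bad = {([1, 2], 3): 'value'}          # tuple containing a list
except TypeError as exc:
    error = str(exc)                      # the message mentions an unhashable type

print(locations[(40.7, -74.0)])
print(error)
```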
The dict()
constructor receives a sequence of key-value tuples, or keyword arguments:
dict([('sape', 4139), ('guido', 4127)])
# OR
dict(sape=4139, guido=4127)
We can use dict comprehensions to create a new dictionary:
{x: x**2 for x in (2, 4, 6)} # => {2: 4, 4: 16, 6: 36}
Looping Techniques
Use items()
to retrieve key-value pairs from a dictionary:
d = dict()
for k, v in d.items():
    print(k, v)
Use enumerate()
to obtain the position index:
l = list()
for i, v in enumerate(l):
    print(i, v)
Use zip()
to loop over two or more sequences at the same time:
names = ['Amy', 'Bob', 'Cindy']
heights = ["6'1", "5'11", "6'9"]
for name, height in zip(names, heights):
    print(name, height)
Use reversed()
or sorted()
to loop over a sequence in reverse or in sorted order.
Use set()
to eliminate duplicates.
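These helpers compose naturally. A brief sketch with made-up data:

```python
basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']

in_reverse = list(reversed(basket))   # same items, last to first
unique_sorted = sorted(set(basket))   # duplicates removed, then sorted

for fruit in unique_sorted:
    print(fruit)
```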
More on Conditions
Comparisons can be chained. E.g., a < b == c tests whether a is less than b and, moreover, b equals c.
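A chained comparison behaves like the individual comparisons joined with and. A small sketch:

```python
a, b, c = 1, 2, 2

chained = a < b == c              # True
expanded = (a < b) and (b == c)   # the equivalent spelled out

x = 7
in_range = 0 <= x < 10            # a common use: range checks
```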
6. Modules
A module is a file containing Python definitions and statements. The module’s name is available as the value of the global variable __name__
.
There are several ways to import a module:
import spam
from spam import func
from spam import * # This imports all names that do not begin with _
import spam as s # The module is available as s
More on Modules
Statements in modules are intended to initialize the module. They are executed only the first time the module name is encountered in an import statement (or when the file is executed as a script). Each module is imported only once per interpreter session. To reload a module, use importlib.reload().
When you run a Python module with python <file_name>.py <arguments>
, the code in the module will be executed, but the __name__
is set to '__main__'
.
When a module name is imported, the interpreter first searches for a built-in module with that name. If not found, it searches in a list of directories given by the variable sys.path
, which is initialized from:
- The directory containing the input script (or the current directory when no file is specified)
- The shell variable
PYTHONPATH
- The installation-dependent default (by convention including a
site-packages
directory)
The dir()
Function
The built-in function dir()
finds out which names a module defines.
Without arguments, dir()
lists the names defined currently.
To list built-in functions and variables, import the module builtins
, and use dir(builtins)
.
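A minimal sketch of dir() on a module and on builtins. The math module is used only as a convenient example.

```python
import builtins
import math

module_names = dir(math)        # names defined by the math module
builtin_names = dir(builtins)   # built-in functions, exceptions and constants

print('sqrt' in module_names)   # True
print('len' in builtin_names)   # True
```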
Packages
Packages are a way of structuring Python’s module namespace by using dotted module names. For example, the module name A.B
designates a submodule named B
in a package named A
.
Packages are organised as directories in a filesystem. The __init__.py
file in a directory is required to make Python treat directories containing the file as packages. This file can just be empty, but it can also execute initialization code for the package. Suppose this is a package structure located in a folder specified in sys.path
:
sound/                          Top-level package
      __init__.py               Initialize the sound package
      formats/                  Subpackage for file format conversions
              __init__.py
              wavread.py
              wavwrite.py
              aiffread.py
              aiffwrite.py
              auread.py
              auwrite.py
              ...
      effects/                  Subpackage for sound effects
              __init__.py
              echo.py
              surround.py
              reverse.py
              ...
      filters/                  Subpackage for filters
              __init__.py
              equalizer.py
              vocoder.py
              karaoke.py
              ...
There are several ways to import a submodule in a package:
import sound.effects.echo
# It must be referenced with its full name
sound.effects.echo.echofilter()
# OR
from sound.effects import echo
# This is available without its package prefix
echo.echofilter()
# OR
# import the desired function or variable directly
from sound.effects.echo import echofilter
echofilter()
- When using from package import item, item can be a name defined in the package, like a function, class or variable. If it isn’t, the import statement checks if it is a submodule or subpackage of package. If it isn’t, an ImportError is raised. Note that if item is a subpackage, only the names defined or imported in its __init__.py will be imported.
- When using import item.subitem.subsubitem, each item except for the last must be a package; the last item can be a module or a package but can’t be a name defined in the previous item.
Importing *
From a Package
If a package’s __init__.py
defines a list named __all__
, it is taken to be a list of module names that should be imported when from package import *
is encountered.
Intra-package References
For the surround
module of the above example, we can use:
from . import echo
from .. import formats
from ..filters import equalizer
Note that relative imports rely on the name of the current module, which is read from __name__
. Since the name of the main module is always '__main__'
, modules intended for use as the main module must not use relative imports.
7. Input and Output
Fancier Output Formatting
The str()
function is meant to return representations of values which are fairly human-readable, while repr()
is meant to generate representations which can be read by the interpreter.
There are several ways to format output in Python.
Formatted String Literals
Formatted string literals are string literals beginning with f
or F
before the opening quotation mark. Anything between {
and }
inside the string will be parsed as variables or expressions.
year = 2022
f'This year is {year}. The next year is {year+1}.'
# => 'This year is 2022. The next year is 2023.'
The grammar for the replacement field:
replacement_field ::= "{" f_expression ["="] ["!" conversion] [":" format_spec] "}"
conversion
is used to convert the value before it is formatted:
- !a applies ascii()
- !s applies str()
- !r applies repr()
The grammar for the format specification is as follows. For more detail, see the Format Specification Mini-Language section of the Python documentation.
format_spec ::= [[fill]align][sign][#][0][width][grouping_option][.precision][type]
fill ::= <any character>
align ::= "<" | ">" | "=" | "^"
sign ::= "+" | "-" | " "
width ::= digit+
grouping_option ::= "_" | ","
precision ::= digit+
type ::= "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"
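A few replacement fields that exercise the grammar above. The variable names and values are arbitrary; the = form requires Python 3.8 or later.

```python
import math

name = 'Amy'
value = 1234567.891
word = 'caf\u00e9'

with_repr = f'{name!r}'         # "'Amy'"
with_ascii = f'{word!a}'        # same as ascii(word)
right = f'{name:>10}'           # '       Amy'
centered = f'{name:*^9}'        # '***Amy***'
grouped = f'{value:,.2f}'       # '1,234,567.89'
binary = f'{42:08b}'            # '00101010'
percent = f'{0.256:.1%}'        # '25.6%'
signed = f'{math.pi:+.3f}'      # '+3.142'
debug = f'{value=:.0f}'         # 'value=1234568'
```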
The str.format()
Method
We can still use {
and }
to mark where a variable is inserted, but we’ll need to provide the information manually. Basic usage:
print('We are the {} who say "{}!"'.format('knights', 'Ni'))
# => We are the knights who say "Ni!"
Numbers inside brackets refer to the position of the object passed into str.format()
:
print('{1} and {0}'.format('spam', 'eggs'))
# => eggs and spam
Keyword arguments can be used:
print('This {food} is {adjective}.'.format(
food='spam', adjective='absolutely horrible'))
# => This spam is absolutely horrible.
We can pass a dictionary and access the keys using []
inside the replacement field.
table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
print('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
'Dcab: {0[Dcab]:d}'.format(table))
# => Jack: 4098; Sjoerd: 4127; Dcab: 8637678
The format spec can still be used after :
.
Manual String Formatting
We can manually format the output string using string concatenation and formatting methods:
- str.ljust(), str.center() and str.rjust() are used for justifying strings.
- str.zfill() pads a numeric string on the left with zeros.
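A short sketch of these methods producing an aligned table of squares and cubes, plus a few one-off calls:

```python
rows = []
for x in range(1, 4):
    # Right-justify each column to a fixed width
    rows.append(str(x).rjust(2) + ' ' + str(x * x).rjust(3) + ' ' + str(x ** 3).rjust(4))

padded = '12'.zfill(5)          # '00012'
negative = '-3.14'.zfill(7)     # '-003.14', the sign stays in front
centered = 'ab'.center(6, '*')  # '**ab**'
left = 'ab'.ljust(4) + '|'      # 'ab  |'

print('\n'.join(rows))
```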
Old String Formatting
There is a printf-style string formatting operator %
.
import math
print('The value of pi is approximately %5.3f.' % math.pi)
# => The value of pi is approximately 3.142.
Reading and Writing Files
Text mode and binary mode (append b
to the open mode) matter, because in text mode, platform-specific line endings will be converted to \n
when reading, and \n
is converted back to platform-specific line endings when writing. This will corrupt binary data.
Use with
statement or f.close()
, or system resources are leaked and the file might not be completely written to the disk before the program exits.
with open('workfile') as f:
    read_data = f.read()
f.closed # => True
Methods for File Objects
f.read(size)
is for reading a file’s contents. At most size
characters (in text mode) or size
bytes (in binary mode) are read. If the end of file is reached, the function returns ''
.
f.readline()
reads a single line from the file. The newline character is kept in the string returned. This helps to distinguish an empty line (\n
) and the end of file (''
).
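A sketch of how the kept newline distinguishes an empty line from the end of the file. io.StringIO stands in for a real text file here so the example is self-contained.

```python
import io

f = io.StringIO('first\n\nthird')

lines = []
while True:
    line = f.readline()
    if line == '':       # empty string: end of file
        break
    lines.append(line)   # an empty line in the file arrives as '\n'

print(lines)  # ['first\n', '\n', 'third']
```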
To read lines from a file, we can also loop over the file object:
for line in f:
    print(line, end='')
f.write(string)
writes the contents of string
to the file, returning the number of characters written. Other objects need to be converted to a string (in text mode) or a bytes object (in binary mode).
f.tell()
returns an integer indicating the file object’s current position. In binary mode, this represents the number of bytes from the beginning. In text mode, this is an opaque number.
f.seek(offset, whence) is used to change the file object’s position.
- whence indicates the reference point:
  - 0 means the beginning of the file (default)
  - 1 means the current position
  - 2 means the end of the file
- offset is added to the reference point to calculate the position
In text mode, only seeks relative to the beginning of the file are allowed (except for seeking to the very end of the file with seek(0, 2)), and the only valid offset values are those returned from f.tell(), or zero.
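A sketch of seek() and tell() in binary mode, where offsets are plain byte counts. The file name workfile and the temporary directory are only there to keep the example self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')  # hypothetical file name
    with open(path, 'wb+') as f:
        f.write(b'0123456789abcdef')
        f.seek(5)                # 6th byte from the beginning
        sixth = f.read(1)        # b'5'
        f.seek(-3, 2)            # 3rd byte before the end
        third_last = f.read(1)   # b'd'
        position = f.tell()      # 14
```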
Saving Structured Data with json
Four basic functions to read and write JSON easily:
import json
x = [1, 'simple', 'list']
string = json.dumps(x) # => '[1, "simple", "list"]'
obj = json.loads(string)
with open('file.json', 'w') as f:
    json.dump(x, f)
with open('file.json', 'r') as f:
    obj_from_file = json.load(f)
pickle
is another module for serializing Python objects. It is specific to Python and cannot be used to communicate with other languages. It is also insecure if the data to be deserialized is untrusted.
8. Errors and Exceptions
The try
Statement
The except
clause may contain multiple exceptions as a tuple:
while True:
    try:
        x = int(input("Please input a number: "))
        break
    except (RuntimeError, TypeError, ValueError):
        print("Oops! Try again...")
A class in an except clause matches an exception if it is the same class as the exception’s class or a base class of it.
There is an optional else
clause which is executed if the try
clause does not raise an exception. It must follow all except
clauses.
There is another optional finally
clause which will execute before the whole try
statement completes, whether or not the try
statement produces an exception.
- If an exception occurs during execution of the try clause, the exception may be handled by an except clause. If the exception is not handled by an except clause, the exception is re-raised after the finally clause has been executed.
- An exception could occur during execution of an except or else clause. Again, the exception is re-raised after the finally clause has been executed.
- If the finally clause executes a break, continue or return statement, exceptions are not re-raised.
- If the try statement reaches a break, continue or return statement, the finally clause will execute just prior to the break, continue or return statement’s execution.
- If a finally clause includes a return statement, the returned value will be the one from the finally clause’s return statement, not the value from the try clause’s return statement.
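The rules above can be observed with a small sketch. The function names are invented for the example, and newer Python versions may emit a SyntaxWarning for a return inside finally.

```python
def divide(x, y):
    log = []
    try:
        result = x / y
    except ZeroDivisionError:
        log.append('except')
        result = None
    else:
        log.append('else')      # runs only if the try clause raised nothing
    finally:
        log.append('finally')   # runs in both cases
    return result, log

def finally_wins():
    try:
        return 'from try'
    finally:
        return 'from finally'   # overrides the value returned by the try clause

print(divide(4, 2))     # (2.0, ['else', 'finally'])
print(divide(1, 0))     # (None, ['except', 'finally'])
print(finally_wins())   # from finally
```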
Handling Exceptions
The except
clause may specify a variable after the exception name. The variable is bound to an exception instance with the arguments stored in instance.args
.
try:
    raise Exception('spam', 'eggs')
except Exception as inst:
    print(inst.args)
    print(inst)
Result:
('spam', 'eggs')
('spam', 'eggs') => __str__ allows args to be printed directly,
but this can be overridden in exception subclasses
Exception Chaining
The raise
statement allows an optional from
which enables chaining exceptions.
def func():
    raise ConnectionError
try:
    func()
except ConnectionError as exc:
    raise RuntimeError('Failed to open database') from exc
Output:
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "<stdin>", line 2, in func
ConnectionError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
RuntimeError: Failed to open database
Exception chaining happens automatically when an exception is raised inside an except
or finally
section. This can be disabled by using from None
.
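A sketch of from None, inspecting the attributes that control how the traceback is displayed. The function load and its arguments are hypothetical.

```python
def load(mapping, key):
    try:
        return mapping[key]
    except KeyError:
        # Hide the KeyError from the traceback of the new exception
        raise ValueError(f'unknown key: {key}') from None

try:
    load({}, 'spam')
except ValueError as exc:
    cause = exc.__cause__                  # None: no explicit cause
    suppressed = exc.__suppress_context__  # True: the context is not displayed
    context = exc.__context__              # the original KeyError is still recorded
```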
9. Classes
In C++ terminology, class members in Python are always public, and all member functions are virtual. Classes are themselves objects.
A Word About Names and Objects
Objects have individuality, and multiple names (in multiple scopes) can be bound to the same object. Think of an object as a physical object, and names are labels attached to an object.
Python Scopes and Namespaces
A namespace is a mapping from names to objects. Internally they are implemented as Python dictionaries. Examples of namespaces are the set of built-in names, the global names in a module, the local names in a function invocation and the set of attributes of an object. An attribute is any name following a dot, for example, obj.attr
or module_name.func
.
Writable attributes can be deleted with the del
statement.
The statements executed by the top-level invocation of the interpreter are considered part of a module called __main__
. Therefore, we have the idiom:
if __name__ == '__main__':
    print('This will be printed only if this file is directly '
          'called by the interpreter')
A scope is a textual region of a Python program where a namespace is directly accessible. There are usually 3 or 4 nested scopes:
- The innermost scope containing the local names. This is searched first.
- The scopes of any enclosing functions, containing non-local, but also non-global names. These are searched starting with the nearest enclosing scope.
- The next-to-last scope containing the current module’s global names.
- The outermost scope containing built-in names.
Assignments to names always go into the innermost scope, except when global or nonlocal is used. These statements indicate that particular variables live in the global scope or an enclosing scope, and should be rebound there.
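A sketch, closely following the well-known scope test from the Python tutorial, that shows where each kind of assignment lands:

```python
def scope_test():
    results = []

    def do_local():
        spam = 'local spam'      # new local name, discarded on return

    def do_nonlocal():
        nonlocal spam
        spam = 'nonlocal spam'   # rebinds spam in scope_test

    def do_global():
        global spam
        spam = 'global spam'     # binds spam at module level

    spam = 'test spam'
    do_local()
    results.append(spam)
    do_nonlocal()
    results.append(spam)
    do_global()
    results.append(spam)
    return results

results = scope_test()
print(results)  # ['test spam', 'nonlocal spam', 'nonlocal spam']
```

The last entry is still 'nonlocal spam' because do_global() bound a different, module-level name.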
A First Look at Classes
Class Definition Syntax
When a class definition is entered, a new namespace is created and used as the local scope. When the class definition is left, a class object is created, the original scope is reinstated, and the class object is bound to the class name.
Class Objects and Instance Objects
Class objects support attribute references and instantiation. For example:
class MyClass:
    """A simple class"""
    i = 12345
    def __init__(self):
        self.data = []
    def f(self):
        return 'hello world'
# Attribute references -- data
MyClass.i # => 12345
MyClass.__doc__ # => 'A simple class'
# Instantiation creates a new instance object
x = MyClass()
# Attribute references -- functions
MyClass.f(x) # => 'hello world'
When a class defines an __init__()
function, class instantiation automatically invokes __init__()
for the newly-created class instance.
In the __init__()
function above, a data attribute data
is added to the new class instance. Data attributes need not be declared. They are created when they are first assigned to.
Method Objects
In Python, functions are objects. E.g., MyClass.f
is a function object. However, x.f
is a method object, not a function object. This method object is created by packing the instance object (x
) and the function object (MyClass.f
) together. When the method object is called with an argument list, a new argument list is constructed by prepending the instance object to the argument list, and the function object is called with this new argument list. Therefore:
x.f()
# is equivalent to
MyClass.f(x)
This is why there is a convention that the first parameter of a method is called self
.
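The packing of instance and function can be inspected directly through the __self__ and __func__ attributes of a method object. A small sketch:

```python
class MyClass:
    def f(self):
        return 'hello world'

x = MyClass()
bound = x.f   # a method object; it can be stored and called later

same_instance = bound.__self__ is x           # True
same_function = bound.__func__ is MyClass.f   # True
same_result = bound() == MyClass.f(x)         # True
```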
Class and Instance Variables
Instance variables are for data unique to each instance and class variables are for attributes and methods shared by all instances of the class.
If the same attribute name occurs in both an instance and in a class, then attribute lookup prioritizes the instance:
class MyClass:
    i = 1
x = MyClass()
y = MyClass()
x.i # => 1 (class variable)
x.i = 2
x.i # => 2 (instance variable)
y.i # => 1 (class variable)
There is nothing in Python that enforces data hiding — it is all based on convention.
Inheritance
The syntax for a derived class:
class DerivedClassName(BaseClassName):
    pass
All methods in Python are effectively virtual.
To check inheritance, isinstance(obj, class_name)
is True
only if obj.__class__
is class_name
or some class derived from it; issubclass(class1, class2)
is True
only if class1
is class2
or a subclass of class2
.
Python supports multiple inheritance.
class DerivedClassName(Base1, Base2, Base3):
    pass
For most purposes, the search for attributes inherited from a parent is depth-first, left-to-right, not searching twice in the same class where there is an overlap in the hierarchy.
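A sketch of the lookup order with a diamond-shaped hierarchy, along with isinstance() and issubclass(). The class names are invented; the actual order is exposed as __mro__.

```python
class Base:
    def who(self):
        return 'Base'

class Left(Base):
    def who(self):
        return 'Left'

class Right(Base):
    def who(self):
        return 'Right'

class Child(Left, Right):
    pass

# The shared Base is visited once, after both Left and Right
mro_names = [cls.__name__ for cls in Child.__mro__]
print(mro_names)            # ['Child', 'Left', 'Right', 'Base', 'object']
print(Child().who())        # Left

is_base = isinstance(Child(), Base)    # True
is_sub = issubclass(Child, Right)      # True
not_sub = issubclass(Left, Right)      # False
```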
Private Variables
There is a convention that a name prefixed with an underscore (e.g. _spam
) should be treated as private.
There is a mechanism called name mangling that supports class-private members (to avoid clashes with names defined by subclasses). Any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) that occurs within a class definition is replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped. For example:
class Mapping:
    def __init__(self, iterable):
        self.items_list = []
        self.__update(iterable)
    def update(self, iterable):
        for item in iterable:
            self.items_list.append(item)
    # private copy of original update() method
    __update = update # replaced by _Mapping__update
class MappingSubclass(Mapping):
    def update(self, keys, values): # provides new signature for update()
        for item in zip(keys, values):
            self.items_list.append(item)
    __update = update # replaced by _MappingSubclass__update
If update() were used directly in Mapping.__init__(), the update() method of the subclass would be called when creating an instance of the subclass, which would fail because its signature differs.
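The effect of mangling on instance attributes can be checked with vars(). This is a separate, minimal sketch; the class Account and its attributes are made up.

```python
class Account:
    def __init__(self):
        self.__balance = 0               # stored as _Account__balance
        self._note = 'convention only'   # single underscore: no mangling

acct = Account()

attrs = vars(acct)
has_plain = hasattr(acct, '__balance')   # False: no attribute with that exact name
mangled = acct._Account__balance         # still reachable under the mangled name

print(attrs)  # {'_Account__balance': 0, '_note': 'convention only'}
```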
Iterators
A for
statement works with an iterable by calling iter()
on the iterable. The function returns an iterator object that defines the method __next__()
which accesses elements in the container one at a time. When there are no more elements, __next__()
raises a StopIteration
exception which tells the for
loop to terminate. We can call __next__()
with the builtin function next()
:
s = 'abc'
it = iter(s)
while True:
    try:
        print(next(it))
    except StopIteration:
        break
To support iterator behaviour in a user-defined class, define a __iter__()
method which returns an object with a __next__()
method. If the class defines __next__()
itself, then __iter__()
can just return self
.
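A sketch of a class that is its own iterator, modeled on the Reverse example from the Python tutorial:

```python
class Reverse:
    """Iterator for looping over a sequence backwards."""

    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        # The object defines __next__() itself, so it is its own iterator
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index -= 1
        return self.data[self.index]

chars = list(Reverse('spam'))
print(chars)  # ['m', 'a', 'p', 's']
```

Because the iteration state lives on the object, an instance of this class can only be looped over once.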
Generators
Generators are a simple way of creating iterators. They are written like regular functions but use yield
when they want to return data. Each time next()
is called on it, the generator resumes where it left off.
def reverse(data):
    for index in range(len(data)-1, -1, -1):
        yield data[index]
for char in reverse('golf'):
    print(char) # 'f', 'l', 'o', 'g'
- __iter__() and __next__() are created automatically, and StopIteration is raised when the generator terminates.
- Local variables and execution state are automatically saved between generator calls.
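A sketch of the saved state: each next() call resumes right after the previous yield, with local variables intact. The generator name countdown is made up.

```python
def countdown(n):
    while n > 0:
        yield n    # execution pauses here; n is preserved
        n -= 1

gen = countdown(3)

first = next(gen)             # 3
second = next(gen)            # 2, resumed after the first yield
remaining = list(gen)         # [1]
finished = next(gen, 'done')  # 'done': the default is returned instead of raising StopIteration
own_iterator = iter(gen) is gen   # True
```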