Python code snippets

From 百合仙子's Wiki

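This page collects useful code snippets that have little to do with any particular module. Snippets related to a specific module are on that module's own page.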

Text encoding

Changing the encoding of standard I/O

import sys
import io

def setup_io():
  sys.stdout = sys.__stdout__ = io.TextIOWrapper(
    sys.stdout.detach(), encoding='utf-8', line_buffering=True)
  sys.stderr = sys.__stderr__ = io.TextIOWrapper(
    sys.stderr.detach(), encoding='utf-8', line_buffering=True)
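
On Python 3.7 and later, a text stream can also be switched in place with TextIOWrapper.reconfigure(), without detaching the underlying buffer. The following is a minimal sketch under that assumption; the helper name reconfigure_utf8 and the in-memory demo stream are illustrative only and are not part of the original snippet.

```python
import io
import sys

def reconfigure_utf8(stream):
    """Switch a text stream to UTF-8 in place if it supports reconfigure()."""
    if hasattr(stream, "reconfigure"):
        stream.reconfigure(encoding="utf-8", line_buffering=True)
        return True
    # Replaced or wrapped streams may lack reconfigure(); leave them alone
    return False

# Demo on an in-memory stream that starts out as ASCII-only
demo = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
reconfigure_utf8(demo)
demo.write("百合")
demo.flush()

# For the real streams one would call:
#   reconfigure_utf8(sys.stdout); reconfigure_utf8(sys.stderr)
```

Unlike the detach() approach, this keeps the existing stream objects, so references held elsewhere stay valid.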

subprocess

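The following implementation makes it possible to use UTF-8 (for both I/O and arguments) when calling a subprocess from an ASCII environment. It is not complete.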

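import subprocess
import io

def run(cmd, stdout=subprocess.PIPE, universal_newlines=True, input=None):
  # Encode text input ourselves instead of relying on the locale encoding
  if universal_newlines and input is not None:
    input = input.encode('utf-8')
  # The child writes raw bytes, so hand it the binary buffer of a text stream
  if isinstance(stdout, io.TextIOBase):
    stdout = stdout.buffer

  # bytes arguments are passed through without consulting the locale
  cmd = [x.encode('utf-8') for x in cmd]

  p = subprocess.run(
    cmd,
    input=input,
    stdout=stdout,
    stderr=subprocess.PIPE,
  )
  if universal_newlines:
    # p.stdout is None unless stdout was subprocess.PIPE
    if p.stdout is not None:
      p.stdout = p.stdout.decode('utf-8')
    p.stderr = p.stderr.decode('utf-8')

  if p.returncode != 0:
    raise subprocess.CalledProcessError(
      p.returncode, cmd, p.stdout, p.stderr)
  return p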
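
The core idea above, passing bytes for the arguments and decoding the output by hand, can be tried in isolation. The sketch below assumes a POSIX-like system and uses the running interpreter as the child process; the helper name echo_via_child is hypothetical.

```python
import os
import subprocess
import sys

def echo_via_child(text):
    """Round-trip text through a child process as UTF-8 bytes."""
    # The child writes its first argument back out as raw bytes;
    # os.fsencode() undoes whatever decoding the child applied to argv
    child_code = b"import os, sys; sys.stdout.buffer.write(os.fsencode(sys.argv[1]))"
    proc = subprocess.run(
        [os.fsencode(sys.executable), b"-c", child_code, text.encode("utf-8")],
        stdout=subprocess.PIPE,
        check=True,
    )
    # Decode explicitly instead of relying on the locale encoding
    return proc.stdout.decode("utf-8")
```

Because neither the arguments nor the output ever pass through the locale encoding in the parent, the result should not depend on LANG or LC_ALL being set to a UTF-8 locale.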

Algorithms

Computing the check digit of a second-generation ID card

#!/usr/bin/env python3
import sys; len(sys.argv) != 2 and (print('Please give one argument of at least 17 digits', file=sys.stderr) or sys.exit(1)) or len(sys.argv[1]) not in (17,18) and (print('Bad argument!', file=sys.stderr) or sys.exit(2)) or print('10X98765432'[sum(map(lambda x:int(x[0])*x[1], zip(sys.argv[1][:17], [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2]))) % 11])
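
The one-liner above is compact but hard to read. The same computation can be sketched as an importable function: each of the first 17 digits is multiplied by its weight, the products are summed, and the sum modulo 11 indexes into the string of check characters. The names WEIGHTS, CHECK_CHARS and check_digit are illustrative, and the number in the example is a made-up 17-digit string.

```python
# Weights and check characters taken from the one-liner above
WEIGHTS = [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2]
CHECK_CHARS = '10X98765432'

def check_digit(number):
    """Return the check character for the first 17 digits of `number`."""
    body = number[:17]
    if len(body) != 17 or not body.isdecimal():
        raise ValueError('expected at least 17 leading digits')
    # Weighted sum of the digits, reduced modulo 11
    total = sum(int(d) * w for d, w in zip(body, WEIGHTS))
    return CHECK_CHARS[total % 11]

print(check_digit('11010519491231002'))  # weighted sum is 167, 167 % 11 == 2 → X
```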

Vertical layout of Chinese text

import sys
from itertools import zip_longest
print('\n'.join(map(lambda x: ''.join(reversed(x)), zip_longest(*sys.stdin.read().split('\n'), fillvalue=' '))))
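
The snippet above transposes the input lines and reverses each resulting row, so that the first input line becomes the rightmost column. A function form makes this easier to try on a string; this is a sketch, and the name vertical and its fill parameter are illustrative.

```python
from itertools import zip_longest

def vertical(text, fill=' '):
    """Lay out the lines of `text` as columns, read from right to left."""
    lines = text.split('\n')
    # zip_longest transposes the lines, padding short ones with `fill`
    rows = zip_longest(*lines, fillvalue=fill)
    # Reversing each row puts the first input line in the rightmost column
    return '\n'.join(''.join(reversed(row)) for row in rows)

print(vertical('床前\n明月'))
# 明床
# 月前
```

For CJK text, passing fill='\u3000' (the full-width space) may keep columns aligned better than an ASCII space, depending on the font.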

Functions

intersperse

This function is defined in Haskell as follows:[1]

intersperse             :: a -> [a] -> [a]
intersperse _   []      = []
intersperse sep (x:xs)  = x : prependToAll sep xs

prependToAll            :: a -> [a] -> [a]
prependToAll _   []     = []
prependToAll sep (x:xs) = sep : x : prependToAll sep xs

Tested on Python 3.3.2. Python 2.7.3 takes less time, but the relative speeds are the same.[2]

%timeit list(intersperse(' ', ['this','is','a','test']))

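def intersperse(delimiter, iterable):
    it = iter(iterable)
    # An empty iterable yields nothing; a bare next() here would leak
    # StopIteration, which Python 3.7+ turns into RuntimeError
    try:
        first = next(it)
    except StopIteration:
        return
    yield first
    for x in it:
        yield delimiter
        yield x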

100000 loops, best of 3: 5.46 us per loop

def intersperse(val, sequence):
    for i, item in enumerate(sequence):
        if i != 0:
            yield val
        yield item

100000 loops, best of 3: 6.95 us per loop

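from itertools import chain, zip_longest

def intersperse(y, x):
    # zip_longest pairs every item with y, so drop the delimiter after the last item
    return list(chain(*zip_longest(x, [], fillvalue=y)))[:-1]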

100000 loops, best of 3: 9.31 us per loop
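
%timeit is an IPython magic command. Outside IPython, a comparable measurement can be made with the standard timeit module. The sketch below is one way to do it; the helper name best_time_us is hypothetical, and the numbers it prints will differ from those quoted above depending on machine and Python version.

```python
import timeit

def intersperse(delimiter, iterable):
    """Candidate implementation to measure; any variant can be passed instead."""
    for i, item in enumerate(iterable):
        if i != 0:
            yield delimiter
        yield item

def best_time_us(func, number=100000, repeat=3):
    """Return the best per-call time in microseconds over `repeat` runs."""
    timer = timeit.Timer(lambda: list(func(' ', ['this', 'is', 'a', 'test'])))
    # Each run executes the statement `number` times; take the fastest run
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6

if __name__ == '__main__':
    print('%.2f us per loop' % best_time_us(intersperse))
```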

Data structures

An ordered dictionary with default values

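from collections import OrderedDict

# Built on __missing__: CPython 3.5+ cannot inherit from both defaultdict and OrderedDict
class OrderedDefaultDict(OrderedDict):
  def __init__(self, default, *args, **kwargs):
    super().__init__(*args, **kwargs)
    self.default_factory = default
  def __missing__(self, key):
    return self.setdefault(key, self.default_factory())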
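
Since Python 3.7, the built-in dict is guaranteed to preserve insertion order, and collections.defaultdict inherits that behaviour. When only insertion order and default values are needed (not OrderedDict-specific methods such as move_to_end), a plain defaultdict may be enough. A small sketch with made-up sample data:

```python
from collections import defaultdict

# Group words by first letter; keys appear in order of first insertion
groups = defaultdict(list)
for word in ['pear', 'apple', 'plum', 'avocado', 'banana']:
    groups[word[0]].append(word)

print(list(groups))   # ['p', 'a', 'b']
print(groups['a'])    # ['apple', 'avocado']
```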

External commands

Getting the MIME type with the file command

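import mimetypes
import subprocess

def guess_mime_using_file(path, strict=True):
  # strict is accepted only to match mimetypes.guess_type and is ignored.
  # -b omits the file name from the output, so paths containing spaces are
  # safe; the output then looks like "text/plain; charset=utf-8"
  result = subprocess.check_output(['file', '-b', '-i', path]).decode()
  mime, _, encoding = result.strip().partition(';')
  encoding = encoding.split('=')[-1].strip()
  return mime, encoding

# Monkey-patch the standard library so that its callers use file(1) instead
mimetypes.guess_type = guess_mime_using_file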
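
Separating the parsing from the subprocess call makes it possible to test it without the file command installed. The sketch below assumes output of the form "type/subtype; charset=name", as printed by file with the -b and -i options; the name parse_file_mime is hypothetical.

```python
def parse_file_mime(output):
    """Split 'type/subtype; charset=name' into (mime, charset)."""
    mime, _, params = output.strip().partition(';')
    # Everything after '=' is the charset; None if no parameter was given
    charset = params.partition('=')[2].strip() or None
    return mime.strip(), charset

print(parse_file_mime('text/plain; charset=utf-8\n'))  # ('text/plain', 'utf-8')
```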

References