Original poster: gafciausa

Python Cookbook

Post #11

Lisrelchen posted on 2015-9-28 01:21:21

Filtering a String for a Set of Characters
Credit: Jürgen Hermann, Nick Perkins, Peter Cogolo
Problem
Given a set of characters to keep, you need to build a filtering function that, applied to any string s, returns a copy of s that contains only characters in the set.
Solution
The translate method of string objects is fast and handy for all tasks of this ilk. However, to call translate effectively to solve this recipe’s task, we must do some advance preparation. The first argument to translate is a translation table: in this recipe, we do not want to do any translation, so we must prepare a first argument that specifies “no translation”. The second argument to translate specifies which characters we want to delete: since the task here says that we’re given, instead, a set of characters to keep (i.e., to not delete), we must prepare a second argument that gives the set complement—deleting all characters we must not keep. A closure is the best way to do this advance preparation just once, obtaining a fast filtering function tailored to our exact needs:
import string
# Make a reusable string of all characters, which does double duty
# as a translation table specifying "no translation whatsoever"
allchars = string.maketrans('', '')
def makefilter(keep):
    """ Return a function that takes a string and returns a partial copy
        of that string consisting of only the characters in 'keep'.
        Note that `keep' must be a plain string.
    """
    # Make a string of all characters that are not in 'keep': the "set
    # complement" of keep, meaning the string of characters we must delete
    delchars = allchars.translate(allchars, keep)
    # Make and return the desired filtering function (as a closure)
    def thefilter(s):
        return s.translate(allchars, delchars)
    return thefilter
if __name__ == '__main__':
    just_vowels = makefilter('aeiouy')
    print just_vowels('four score and seven years ago')
    # emits: ouoeaeeyeaao
    print just_vowels('tiger, tiger burning bright')
    # emits: ieieuii
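
The recipe's code relies on the two-argument form of translate found in Python 2. As a minimal sketch, assuming Python 3 (where str.translate takes a single table indexed by code point and deletes any character that maps to None), the same closure idea might look like the following. The helper class name _KeepTable is hypothetical and not part of the recipe.

```python
def makefilter(keep):
    """Return a function that keeps only the characters found in 'keep'."""
    keep_codes = frozenset(map(ord, keep))

    class _KeepTable:  # hypothetical helper, not from the original recipe
        def __getitem__(self, code):
            # str.translate looks up each character's code point here:
            # returning the code keeps the character, None deletes it
            return code if code in keep_codes else None

    table = _KeepTable()

    def thefilter(s):
        return s.translate(table)

    return thefilter


if __name__ == '__main__':
    just_vowels = makefilter('aeiouy')
    print(just_vowels('four score and seven years ago'))  # ouoeaeeyeaao
    print(just_vowels('tiger, tiger burning bright'))     # ieieuii
```

The preparation (building the set of code points and the table) still happens once per call to makefilter, so the returned function does only the translate call.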

Post #12

Lisrelchen posted on 2015-9-28 01:22:25

Checking Whether a String Is Text or Binary
Credit: Andrew Dalke
Problem
Python can use a plain string to hold either text or arbitrary bytes, and you need to determine (heuristically, of course: there can be no precise algorithm for this) which of the two cases holds for a certain string.
Solution
We can use the same heuristic criteria as Perl does, deeming a string binary if it contains any nulls or if more than 30% of its characters have the high bit set (i.e., codes of 128 and above) or are strange control codes. We have to code this ourselves, but this also means we easily get to tweak the heuristics for special application needs:
from __future__ import division # ensure / does NOT truncate
import string
text_characters = "".join(map(chr, range(32, 127))) + "\n\r\t\b"
_null_trans = string.maketrans("", "")
def istext(s, text_characters=text_characters, threshold=0.30):
    # if s contains any null, it's not text:
    if "\0" in s:
        return False
    # an "empty" string is "text" (arbitrary but reasonable choice):
    if not s:
        return True
    # Get the substring of s made up of non-text characters
    t = s.translate(_null_trans, text_characters)
    # s is 'text' if at most 30% of its characters are non-text ones:
    return len(t)/len(s) <= threshold
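
In Python 3, text and arbitrary bytes are separate types, so the question "text or binary?" applies mostly to bytes objects. As a rough sketch under that assumption, the same heuristic can be written with bytes.translate, which accepts None as the table and a bytes object of values to delete. The name TEXT_BYTES is introduced here for illustration only.

```python
# Byte values treated as "text": printable ASCII plus a few control codes
TEXT_BYTES = bytes(range(32, 127)) + b"\n\r\t\b"


def istext(data, text_bytes=TEXT_BYTES, threshold=0.30):
    # any null byte means "binary"
    if b"\0" in data:
        return False
    # empty input counts as text (arbitrary but reasonable choice)
    if not data:
        return True
    # delete every text byte; what remains are the non-text bytes
    nontext = data.translate(None, text_bytes)
    # "text" if at most 30% of the bytes are non-text ones
    return len(nontext) / len(data) <= threshold


if __name__ == "__main__":
    print(istext(b"plain ASCII text\n"))
    print(istext(bytes([200, 201, 202, 65, 66])))
```

Raising or lowering threshold, or extending text_bytes, is the easiest way to adapt the heuristic to a particular kind of data.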

Post #13

Lisrelchen posted on 2015-9-28 01:27:18

Controlling Case
Credit: Luther Blissett
Problem
You need to convert a string from uppercase to lowercase, or vice versa.
Solution
That’s what the upper and lower methods of string objects are for. Each takes no arguments and returns a copy of the string in which each letter has been changed to upper- or lowercase, respectively.
big = little.upper()
little = big.lower()
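
A short illustration of the point that both methods return copies and leave the original string untouched. The sample string is made up for the example.

```python
little = "hello, World"

big = little.upper()      # new string, every letter uppercased
lowered = big.lower()     # new string, every letter lowercased

print(big)       # HELLO, WORLD
print(lowered)   # hello, world
print(little)    # hello, World (the original is unchanged)
```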

Post #14

Lisrelchen posted on 2015-9-28 01:29:33

Accessing Substrings
Credit: Alex Martelli
Problem
You want to access portions of a string. For example, you’ve read a fixed-width record and want to extract the record’s fields.
Solution
Slicing is great, but it only does one field at a time:
afield = theline[3:8]
If you need to think in terms of field lengths, struct.unpack may be appropriate. For example:
import struct
# Get a 5-byte string, skip 3, get two 8-byte strings, then all the rest:
baseformat = "5s 3x 8s 8s"
# by how many bytes does theline exceed the length implied by this
# base-format (24 bytes in this case, but struct.calcsize is general)
numremain = len(theline) - struct.calcsize(baseformat)
# complete the format with the appropriate 's' field, then unpack
format = "%s %ds" % (baseformat, numremain)
l, s1, s2, t = struct.unpack(format, theline)
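
The struct approach can be tried end to end as follows. This is a sketch that assumes Python 3, where struct.unpack works on bytes rather than on text strings; the function name split_record and the sample record are hypothetical.

```python
import struct


def split_record(theline):
    # 5-byte field, 3 skipped bytes, two 8-byte fields
    baseformat = "5s 3x 8s 8s"
    # whatever is left over becomes one trailing field
    numremain = len(theline) - struct.calcsize(baseformat)
    fmt = "%s %ds" % (baseformat, numremain)
    return struct.unpack(fmt, theline)


if __name__ == "__main__":
    record = b"ABCDE___12345678abcdefghREST"
    print(split_record(record))
```

The input must be at least as long as the base format requires (24 bytes here); a shorter record makes numremain negative and the format invalid.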

Post #15

Lisrelchen posted on 2015-9-28 01:30:55

Changing the Indentation of a Multiline String
Credit: Tom Good
Problem
You have a string made up of multiple lines, and you need to build another string from it, adding or removing leading spaces on each line so that the indentation of each line is some absolute number of spaces.
Solution
The methods of string objects are quite handy, and let us write a simple function to perform this task:
def reindent(s, numSpaces):
    leading_space = numSpaces * ' '
    lines = [leading_space + line.strip()
             for line in s.splitlines()]
    return '\n'.join(lines)
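
To see the effect, here is the same function applied to a small sample, written for Python 3 (the function body itself needs no changes). The sample text is made up for the example.

```python
def reindent(s, numSpaces):
    leading_space = numSpaces * ' '
    # strip() removes existing leading (and trailing) whitespace,
    # so every line ends up with exactly numSpaces leading spaces
    lines = [leading_space + line.strip()
             for line in s.splitlines()]
    return '\n'.join(lines)


if __name__ == "__main__":
    sample = "  first\n        second\nthird"
    print(reindent(sample, 4))
```

Because each line is stripped before the new indentation is added, any relative indentation between lines is lost; the result is a flat block.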

Post #16

Lisrelchen posted on 2015-9-28 01:32:18

Expanding and Compressing Tabs
Credit: Alex Martelli, David Ascher
Problem
You want to convert tabs in a string to the appropriate number of spaces, or vice versa.
Solution
Changing tabs to the appropriate number of spaces is a reasonably frequent task, easily accomplished with Python strings’ expandtabs method. Because strings are immutable, the method returns a new string object, a modified copy of the original one. However, it’s easy to rebind a string variable name from the original to the modified-copy value:
mystring = mystring.expandtabs()
This doesn’t change the string object to which mystring originally referred, but it does rebind the name mystring to a newly created string object, a modified copy of mystring in which tabs are expanded into runs of spaces. expandtabs, by default, uses a tab length of 8; you can pass expandtabs an integer argument to use as the tab length.
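
A quick check of the default and of an explicit tab length, using a made-up sample string:

```python
mystring = "a\tbc\tdef"

# default tab length of 8: columns 0, 8 and 16
default_expanded = mystring.expandtabs()
# explicit tab length of 4: columns 0, 4 and 8
narrow_expanded = mystring.expandtabs(4)

print(repr(default_expanded))
print(repr(narrow_expanded))
print(repr(mystring))  # the original string still contains its tabs
```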
Changing spaces into tabs is a rare and peculiar need. Compression, if that’s what you’re after, is far better performed in other ways, so Python doesn’t offer a built-in way to “unexpand” spaces into tabs. We can, of course, write our own function for the purpose. String processing tends to be fastest in a split/process/rejoin approach, rather than with repeated overall string transformations:
def unexpand(astring, tablen=8):
    import re
    # split into alternating space and non-space sequences
    pieces = re.split(r'( +)', astring.expandtabs(tablen))
    # keep track of the total length of the string so far
    lensofar = 0
    for i, piece in enumerate(pieces):
        thislen = len(piece)
        lensofar += thislen
        if piece.isspace():
            # change each space sequence into tabs+spaces; numblanks is
            # clamped to thislen so that a run of spaces that does not
            # reach a tab stop is left unchanged
            numblanks = min(lensofar % tablen, thislen)
            numtabs = (thislen-numblanks+tablen-1)//tablen
            pieces[i] = '\t'*numtabs + ' '*numblanks
    return ''.join(pieces)
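
A Python 3 sketch of the same split/process/rejoin idea, with a round-trip check. Two details here are assumptions rather than part of the printed recipe: integer division is written as //, and the number of trailing blanks is clamped to the length of the run so that a short run of spaces that never reaches a tab stop is left as it is. The sketch assumes single-line input.

```python
import re


def unexpand(astring, tablen=8):
    # split into alternating space and non-space sequences
    pieces = re.split(r'( +)', astring.expandtabs(tablen))
    lensofar = 0  # column reached after the current piece
    for i, piece in enumerate(pieces):
        thislen = len(piece)
        lensofar += thislen
        if piece.isspace():
            # blanks needed after the last tab stop, never more than the run
            numblanks = min(lensofar % tablen, thislen)
            numtabs = (thislen - numblanks + tablen - 1) // tablen
            pieces[i] = '\t' * numtabs + ' ' * numblanks
    return ''.join(pieces)


if __name__ == "__main__":
    text = "a" + " " * 10 + "b"
    packed = unexpand(text)
    print(repr(packed))
    # expanding the result should give back the original layout
    print(packed.expandtabs() == text)
```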

Post #17

Nicolle posted on 2015-9-28 01:50:06

Notice: the author has been banned or the content deleted; this post is hidden automatically.

Post #18

Nicolle posted on 2015-9-28 01:52:20

Notice: the author has been banned or the content deleted; this post is hidden automatically.

Post #19

Posted on 2022-1-4 09:13:01 from a mobile device

A very good book, thanks for sharing!
