http://aegisknight.org/2009/05/visualizing-python-import-dependencies/
Taking urllib2 as an example:

python -c "print __import__('urllib2').urlopen('http://tinyurl.com/25485h').url"
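Source.
This method cannot follow JavaScript redirects, though. Haha. Then again, if the redirect really is done in JavaScript, then without building a complete DOM you cannot, in theory, parse arbitrary HTML with 100% reliability.
I once wrote something similar, only more complicated: it extracted the redirect target by hand from the 301/302 response headers.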
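The manual 301/302 approach mentioned above can be sketched roughly as follows. This is a Python 3 illustration, not the original script: `resolve_redirects`, `fetch` and `fake_fetch` are hypothetical names, and `fetch` is assumed to return a `(status, headers)` pair with a `Location` key, so the logic can be tried without touching the network.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302}

def resolve_redirects(url, fetch, max_hops=10):
    """Follow 301/302 Location headers by hand and return the final URL.

    fetch(url) is a hypothetical callable returning (status, headers_dict).
    """
    for _ in range(max_hops):
        status, headers = fetch(url)
        location = headers.get("Location")
        if status not in REDIRECT_CODES or not location:
            return url  # not a redirect: this is the final address
        # Location may be relative, so resolve it against the current URL
        url = urljoin(url, location)
    raise RuntimeError("too many redirects")

# A stand-in for a real HTTP client (made-up addresses)
_TABLE = {
    "http://short.example/a": (301, {"Location": "http://short.example/b"}),
    "http://short.example/b": (302, {"Location": "/final"}),
    "http://short.example/final": (200, {}),
}

def fake_fetch(url):
    return _TABLE[url]

final_url = resolve_redirects("http://short.example/a", fake_fetch)
```

In real use, `fetch` would issue a HEAD or GET request with automatic redirects disabled. Like the one-liner, this still cannot see JavaScript redirects.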
python -m smtpd -n -c DebuggingServer 0.0.0.0:25
Very handy for testing. I found it a few days ago while playing with Gmail.
Something big happened today on both the official and the unofficial Python Planet: Snakebite was released.
snakebite: the open network
welcome to the future of open source development
Snakebite is a network that strives to provide developers of open source projects complete and unrestricted access to as many different platforms, operating systems, architectures, compilers, devices, databases, tools and applications that they may need in order to optimally develop their software.
It was built by Trent Nelson at the University of Michigan. It feels like a virtualized, networked descendant of buildbot; you could simply call it a buildbot farm. Haha. In the author's words, it is
a comprehensive distributed test and development network, with hardware and software contributions from major companies, that's going to be the envy of the open source world.
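The network currently supports Windows, FreeBSD, NetBSD, OpenBSD, DragonFlyBSD, Solaris, HP-UX, AIX, Tru64 UNIX and Linux; the supported databases are SQL Server, Oracle and DB2; and the projects running on it so far are Python and Django. It will probably be opened to more projects later.
You could call it a landmark for the open source community, couldn't you? Software companies are going to be thoroughly jealous of the Open Source World's development model and platform now.
I hope it integrates IRC or XMPP notifications or something similar, so that hourly builds become even more immediate. Haha. If Google were the one packaging this, it would probably be called cloud compiling or cloud development... hehe.
Note: this snakebite is not the same thing as the earlier BitTorrent Seed Server.
Fine... I am behind the times again...
I was sorting out my files and getting ready for a backup: a 15MB rar archive to be uploaded to Gmail. Doing it by hand is too unreliable, so I wrote a short Python script for the job. The idea: log in to my 126 mailbox and send an email with the attachment to Gmail. I chose 126 because NetEase's CDN is fairly fast all over the country...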
import smtplib, email
import os, sys
import hashlib

def send_mail(send_from, send_to, subject, text, attachment_bytes, auth=(), send_server='localhost'):
    msg = email.MIMEMultipart.MIMEMultipart()
    msg['From'] = send_from
    msg['To'] = email.Utils.COMMASPACE.join(send_to)
    msg['Date'] = email.Utils.formatdate(localtime=True)
    msg['Subject'] = subject

    msg.attach( email.MIMEText.MIMEText(text) )

    part = email.MIMEBase.MIMEBase('application', 'octet-stream')
    part.set_payload( attachment_bytes )
    email.Encoders.encode_base64(part)
    part.add_header('Content-Disposition', 'attachment; filename=%s' % subject)
    msg.attach(part)

    smtp = smtplib.SMTP(send_server)
    smtp.login(*auth)
    smtp.sendmail(send_from, send_to, msg.as_string())
    smtp.close()

for f in sys.argv[1:]:
    f_name = os.path.basename(f)
    print '+ Uploading ' + f_name
    f = open(f, 'rb').read()
    md5 = hashlib.md5()
    md5.update(f)
    md5 = md5.hexdigest()
    sha1 = hashlib.sha1()
    sha1.update(f)
    sha1 = sha1.hexdigest()

    send_mail(
        'aaaaaa@126.com',
        ['bbbbbb@gmail.com'],
        f_name,
        f_name + '\r\n' + 'MD5: ' + md5 + '\r\nSHA1: ' + sha1,
        f,
        ('aaaaaa@126.com', 'aaaaaapwd'),
        'smtp.126.com'
    )

print 'Done.'
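The reason I am behind the times: as soon as I finished this script I found n ready-made ones. There is Backup to Email; a Blat command-line approach on Lifehacker; GSend.py, a Javaeye post with almost exactly the same functionality; Gmail Backup; Send To GMail (a.k.a. Gmailer); and of course Gmail Drive Shell Extension, which is a very old piece of software too... What comforts me is that these programs basically all weigh in at over 1MB... and my script can compute MD5... though it cannot split files automatically...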
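For readers on Python 3, where the old `email.MIMEMultipart` style names are gone, the message-building half of the script could look roughly like this. It is a sketch rather than a drop-in replacement: `build_backup_message` is a made-up name, the addresses are the placeholders from the script above, and nothing is actually sent.

```python
import hashlib
from email.message import EmailMessage

def build_backup_message(send_from, send_to, filename, data):
    """Build an email carrying `data` as an attachment, with its hashes in the body."""
    md5 = hashlib.md5(data).hexdigest()
    sha1 = hashlib.sha1(data).hexdigest()

    msg = EmailMessage()
    msg["From"] = send_from
    msg["To"] = ", ".join(send_to)
    msg["Subject"] = filename
    msg.set_content("%s\nMD5: %s\nSHA1: %s" % (filename, md5, sha1))
    # Binary payloads are base64-encoded by the email package
    msg.add_attachment(data, maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg

msg = build_backup_message("aaaaaa@126.com", ["bbbbbb@gmail.com"],
                           "backup.rar", b"hello")
```

Sending would then be a matter of logging in with `smtplib` and calling `send_message(msg)`. Splitting large files into several mails, which the original script also lacks, is not attempted here.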
>>> import win32com.client
>>> EncryptedData = win32com.client.Dispatch('CAPICOM.EncryptedData')
>>> EncryptedData.Algorithm.KeyLength = 5
>>> EncryptedData.Algorithm.Name = 2 #DES
>>> EncryptedData.SetSecret('mypass')
>>> EncryptedData.Content = 'Hello world'
>>> s = EncryptedData.Encrypt()
>>> s
u'MGEGCSsGAQQBgjdYA6BUMFIGCisGAQQBgjdYAwGgRDBCAgMCAAECAmYBAgFABAjj\r\nk6mhbmNo7AQQPzxLV17fVCCYUGLD+nGfigQYQfGjLZxf4C6n7diHGlmP5T1ucS8a\r\nX4Vw\r\n'
>>> EncryptedData.Decrypt(s)
>>> EncryptedData.Content
u'Hello world'
My ID is est, so the string 'est' written as ASCII in hexadecimal is 0x65 0x73 0x74
OK. Open http://pi.nersc.gov/, enter 657374, select HEX and click SEARCH
The result is:
search string = "657374"
24-bit binary equivalent = 01100101 01110011 01110100
search string found at binary index = 3267948124
binary pi : 1000100101100101011100110111010000000110000001010010110010000010
binary string: 011001010111001101110100
hex pi : 6b9c2950f9c2cb52f57ad8965737406052c82ec88ec38c58
hex string: 657374
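What does this result mean? If you write pi in binary, 01100101 01110011 01110100 (the ASCII encoding of 'est') appears at position 3,267,948,124. That is about 3.2 billion... hehe.
The NERSC site is quite interesting, and it is a .gov site at that. A wicked capitalist gov website: why doesn't it post more speeches by the leadership instead?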
In 1996, NERSC Chief Technologist David H. Bailey, together with Canadian mathematicians Peter Borwein and Simon Plouffe, found a new formula for pi. This formula permits one to calculate the n-th binary or hexadecimal digits of pi, without having to calculate any of the preceding n-1 digits. This formula was discovered by a computer, using Bailey's implementation of Ferguson's PSLQ algorithm. More recently (2001), Bailey and colleague Richard Crandall of Reed College have shown that the existence of this new formula has significant implications for the age-old question: Are the digits of pi random?
This algorithm is seriously impressive. It can compute the n-th binary digit of PI without knowing the preceding n-1 digits.
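The formula in question is the Bailey-Borwein-Plouffe (BBP) series, pi = sum over k >= 0 of 16^-k * (4/(8k+1) - 2/(8k+4) - 1/(8k+5) - 1/(8k+6)). A small Python 3 sketch of the digit-extraction trick follows; the function names are my own, and plain floating point is used, so it is only trustworthy for modest positions.

```python
def _series(j, n):
    """Fractional part of sum_k 16^(n-k) / (8k + j)."""
    s = 0.0
    # Terms with k <= n: modular exponentiation keeps only the fractional part
    for k in range(n + 1):
        d = 8 * k + j
        s = (s + pow(16, n - k, d) / d) % 1.0
    # Terms with k > n shrink quickly; stop once they no longer matter
    k = n + 1
    while True:
        term = 16.0 ** (n - k) / (8 * k + j)
        if term < 1e-17:
            break
        s += term
        k += 1
    return s % 1.0

def pi_hex_digit(n):
    """Hex digit of pi at position n after the point (n = 0 is the first)."""
    x = (4 * _series(1, n) - 2 * _series(4, n)
         - _series(5, n) - _series(6, n)) % 1.0
    return "%x" % int(x * 16)

first_digits = "".join(pi_hex_digit(i) for i in range(10))
```

Each hex digit is four binary digits, so the same trick gives binary positions directly. Note that `pi_hex_digit(1000)` does its work without ever producing digits 0 to 999.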
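Why binary? I wondered about that at first too, and then it dawned on me. Binary converts very conveniently to other bases (trivially so for power-of-two bases such as hexadecimal). It is also easy to search and store, and efficient to compute with.
Binary is also very easy to grasp intuitively. For example, here are the first few binary digits of PI: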
11.
00100100 00111111 01101010
10001000 10000101 10100011
00001000 11010011 00010011
00011001 10001010 00101110
00000011 01110000 01110011
01000100 10100100 00001001
00111000 00100010 00101001
........
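Take a line segment divided into 4 equal parts, and keep bisecting the last part forever, reading the bits of binary pi in order: a 0 means PI falls in the first half of the bisection, a 1 means PI falls in the second half. Repeating this endlessly pins down the exact point of PI.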
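The bisection picture can be tried out directly. The Python 3 sketch below uses the first 48 fractional bits listed above and narrows the interval [3, 4]; `bisect_pi` is just an illustrative name.

```python
import math

def bisect_pi(bits, lo=3.0, hi=4.0):
    """Halve [lo, hi] once per bit: '0' keeps the lower half, '1' the upper half."""
    for bit in bits:
        mid = (lo + hi) / 2
        if bit == "1":
            lo = mid
        else:
            hi = mid
    return lo, hi

# The first 48 bits after the binary point, copied from the listing above
BITS = "00100100 00111111 01101010 10001000 10000101 10100011".replace(" ", "")
lo, hi = bisect_pi(BITS)
print(lo, math.pi, hi)
```

After 48 bits the interval is only 2**-48 wide, and it still contains pi.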

Of course, the steps above can be implemented as a Python function. Chinese IDs are supported.
def string_in_pi(s):
    """ Get index of a string found in binary PI
    http://pi.nersc.gov/
    Return -1 if not found
    """
    if type(s) == unicode:
        s = s.encode('gbk') #for Chinese strings
    else:
        s = str(s)
    # %02X keeps two hex digits per byte, even for bytes below 0x10
    h = ''.join( '%02X' % ord(x) for x in s)
    import urllib2, re
    c = urllib2.urlopen('http://pi.nersc.gov/cgi-bin/pi.cgi?word=%s&format=hex' % h).read()
    g = re.search(r'binary index = (\d+)', c)
    if g:
        return int(g.group(1))
    else:
        return -1
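Not every string will show up in the first 4 billion bits of binary PI. For up to 3 ASCII characters the probability is almost 100%; for 4 ASCII characters it is about 60%; for 5 it is 0.36%; and if your ID is longer than 6 characters, the probability drops below 0.001%. Of course, PI's expansion never ends, so if the search range were extended beyond 4 billion bits, my guess is that the probability of finding any string would be 100%.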
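These percentages can be reproduced with a back-of-the-envelope estimate. The Python 3 sketch below assumes (this is an assumption, not a proven fact about pi) that the bits behave like independent fair coin flips and that a match may start at any bit position, and then applies a Poisson approximation; `hit_probability` is a made-up name.

```python
import math

def hit_probability(n_chars, searched_bits=4_000_000_000):
    """Approximate chance that an n_chars ASCII string occurs in the searched bits."""
    pattern_bits = 8 * n_chars
    # Expected number of occurrences if every position matched with chance 2^-bits
    expected_hits = searched_bits / 2 ** pattern_bits
    # Poisson approximation: P(at least one hit) = 1 - exp(-expected_hits)
    return -math.expm1(-expected_hits)

for n in range(3, 8):
    print(n, "chars: %.6f%%" % (100 * hit_probability(n)))
```

Under these assumptions, 4 characters come out at roughly 60% and 5 characters at roughly 0.36%, in line with the figures above.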
Here is a rather lowbrow result:

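Some people say that somewhere after the n-th digit of PI there is a complete text of Shakespeare's collected works. I firmly believe it. Haha
From now on I can tell people: the password for logging in to site XX is stored at position 1279804452 of binary PI...
Let me first explain a piece of text: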
Client\0Music\01\0{1} by {0} from {2}\0Artist\0Title\0Album\0\0\0
Split on \0, it contains the following fields:
Client: the client name. You can write anything you like
Music: can also be Office or Game. The icon changes accordingly.

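1 means enabled, 0 means disabled
{1} by {0} from {2}: the format. {0} stands for Artist, {1} for Title, {2} for Album
Artist: a custom artist name
Title: a custom track title
Album: a custom album name
A few more values, WMID, WMContentID and MSNFormat, may follow after further \0 separators, but that does not matter and we will ignore them for now
Once you understand this magic string, you can use all kinds of programming languages (C/C++, VB, AHK, AutoIt, Python) to DIY MSN's now-playing status message.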
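To make the field layout concrete, here is a small Python 3 helper that assembles such a string. It is only a sketch: `build_now_playing` and `preview` are hypothetical names, it assumes a single `\0` (a literal backslash followed by a zero) between consecutive fields and three trailing separators, and the preview assumes the client substitutes `{0}`, `{1}`, `{2}` positionally, the same way Python's `str.format` does.

```python
SEP = "\\0"  # the two characters backslash and zero, not a NUL byte

def build_now_playing(client, category, enabled, fmt,
                      artist="", title="", album=""):
    """Join the fields described above into one status string."""
    fields = [client, category, "1" if enabled else "0",
              fmt, artist, title, album]
    return SEP.join(fields) + SEP * 3

def preview(fmt, artist, title, album):
    """What the status line would read, assuming positional substitution."""
    return fmt.format(artist, title, album)

status = build_now_playing("Client", "Music", True,
                           "{1} by {0} from {2}", "Artist", "Title", "Album")
shown = preview("{1} by {0} from {2}", "Artist", "Title", "Album")
```

The optional WMID, WMContentID and MSNFormat values are left out, as in the text.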
The final Python code looks like this:
import ctypes
WM_COPYDATA = 0x4A
hWnd = ctypes.windll.User32.FindWindowA('MsnMsgrUIManager', None)
# UTF-16LE bytes; the final \0 is a real NUL wide character that terminates the string
s = u"est_client\\0Games\\01\\0测试一下\\0\\0\\0\\0\\0\0".encode('utf_16_le')
class MsnData(ctypes.Structure):
    _fields_ = [("dwData", ctypes.c_int),
                ("cbData", ctypes.c_int),
                ("lpData", ctypes.c_char_p),
               ]
# cbData is the actual payload size in bytes rather than a fixed 256
msndata = MsnData(0x547, len(s), ctypes.c_char_p(s))
ctypes.windll.User32.SendMessageA(hWnd, WM_COPYDATA, 0, ctypes.byref(msndata))
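A friendly tip from est: never use c_wchar_p from ctypes! When you mix a non-high-level language like C with a dynamically typed language like Python, using Unicode or WideChar on the C side is a complete mess. Represent everything as byte arrays instead, and encode all Unicode to UTF8/UCS2 before processing it. Add the fact that C strings are terminated by 0x00, and it really is chaos.
Windows' internal Unicode encoding is UCS2, that is, little-endian UTF-16 without a BOM. (Strictly speaking, Windows has used UTF-16LE since Windows 2000, and UCS-2 is its subset without surrogate pairs.)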
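The difference between the BOM-less little-endian encoding and the generic `utf_16` codec is easy to see from Python 3. This is only an illustration of the encodings themselves, not of the MSN message:

```python
import codecs

text = "est"

le = text.encode("utf_16_le")      # two bytes per character, no BOM
with_bom = text.encode("utf_16")   # same data, preceded by a byte-order mark

print(le)         # every ASCII character is followed by a zero byte
print(with_bom)   # starts with the BOM for this machine's byte order
```

The zero bytes inside `le` are exactly why such data must travel as a byte buffer with an explicit length: anything that treats it as a 0x00-terminated C string stops after the first character.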
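That wraps up MSN/WML. GTalk is next, and after that Skype, Fetion and so on.
I originally wanted to publish this properly on yeeyan, but I am not the serious and rigorous type, so I am rambling on my own blog instead.
Genetic Algorithms, blah, blah, blah... all you need to know is that this stuff is hot on r/programming right now. Will Larson brings us an excellent article, Genetic Algorithms: Cool Name & Damn Simple. I will skip the hype at the beginning and get straight to the point.
N numbers that add up to exactly X
Sigh, that is awkward. Putting it this way is the most intuitive:
5 numbers that add up to exactly 200
Examples: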
lst = [40,40,40,40,40]
lst = [50,50,50,25,25]
lst = [200,0,0,0,0]
An individual is best understood as a single member of the population
>>> from random import randint
>>> def individual(length, min, max):
...     'Create a member of the population.'
...     return [ randint(min,max) for x in xrange(length) ]
...
>>> individual(5,0,100)
[79, 0, 20, 47, 40]
>>> individual(5,0,100)
[64, 1, 25, 84, 87]
And this can be understood as the population, right?
>>> def population(count, length, min, max):
...     """
...     Create a number of individuals (i.e. a population).
...
...     count: the number of individuals in the population
...     length: the number of values per individual
...     min: the min possible value in an individual's list of values
...     max: the max possible value in an individual's list of values
...
...     """
...     return [ individual(length, min, max) for x in xrange(count) ]
...
>>> population(3,5,0,100)
[[51, 55, 73, 0, 80], [3, 47, 18, 65, 55], [17, 64, 77, 43, 48]]
This one is called the fitness function. Natural selection, survival of the fittest. Note that it applies to an individual
>>> from operator import add
>>> def fitness(individual, target):
...     """
...     Determine the fitness of an individual. Lower is better.
...
...     individual: the individual to evaluate
...     target: the sum of numbers that individuals are aiming for
...     """
...     sum = reduce(add, individual, 0)
...     return abs(target-sum)
...
>>> x = individual(5,0,100)
>>> fitness(x,300)
65
The smaller the fitness, the better
Grading. Much like the fitness function above, except that it applies to a whole population
>>> def grade(pop, target):
...     'Find average fitness for a population.'
...     summed = reduce(add, (fitness(x, target) for x in pop), 0)
...     return summed / (len(pop) * 1.0)
...
>>> x = population(3,5,0,100)
>>> target = 300
>>> grade(x, target)
116
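Now for the core of the genetic algorithm: evolution.
Crossover builds a child from the first half of the father and the second half of the mother:
>>> father = [1,2,3,4,5,6]
>>> mother = [10,20,30,40,50,60]
>>> child = father[:3] + mother[3:]
>>> child
[1, 2, 3, 40, 50, 60]
Mutation changes one value of an individual at random, with a small probability:
>>> from random import random, randint
>>> chance_to_mutate = 0.01
>>> pop = population(3, 5, 0, 100)
>>> for i in pop:
...     if chance_to_mutate > random():
...         # randint includes both ends, so the last valid index is len(i)-1
...         place_to_modify = randint(0, len(i)-1)
...         i[place_to_modify] = randint(min(i), max(i))
...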
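As standalone functions, the two evolution steps might look like this in Python 3. This is a sketch with names of my own choosing (`crossover`, `mutate_in_place`), not code from the article:

```python
import random

def crossover(father, mother):
    """Child = first half of the father + second half of the mother."""
    half = len(father) // 2
    return father[:half] + mother[half:]

def mutate_in_place(individual, chance, rng=random):
    """With probability `chance`, overwrite one random position."""
    if chance > rng.random():
        pos = rng.randint(0, len(individual) - 1)  # both ends inclusive
        # Like the article's version, the new value stays within the
        # range already present in this individual
        individual[pos] = rng.randint(min(individual), max(individual))
    return individual

child = crossover([1, 2, 3, 4, 5, 6], [10, 20, 30, 40, 50, 60])
print(child)
```

Passing `chance=1.0` forces a mutation and `chance=0.0` disables it, which is convenient for trying the function out.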
Putting it all together, the code is:
def evolve(pop, target, retain=0.2, random_select=0.05, mutate=0.01):
    graded = [ (fitness(x, target), x) for x in pop]
    graded = [ x[1] for x in sorted(graded)]
    retain_length = int(len(graded)*retain)
    parents = graded[:retain_length]
    # randomly add other individuals to promote genetic diversity
    for individual in graded[retain_length:]:
        if random_select > random():
            parents.append(individual)
    # mutate some individuals
    for individual in parents:
        if mutate > random():
            pos_to_mutate = randint(0, len(individual)-1)
            # this mutation is not ideal, because it
            # restricts the range of possible values,
            # but the function is unaware of the min/max
            # values used to create the individuals,
            individual[pos_to_mutate] = randint(
                min(individual), max(individual))
    # crossover parents to create children
    parents_length = len(parents)
    desired_length = len(pop) - parents_length
    children = []
    while len(children) < desired_length:
        male = randint(0, parents_length-1)
        female = randint(0, parents_length-1)
        if male != female:
            male = parents[male]
            female = parents[female]
            half = len(male) / 2
            child = male[:half] + female[half:]
            children.append(child)
    parents.extend(children)
    return parents
>>> target = 371
>>> p_count = 100
>>> i_length = 5
>>> i_min = 0
>>> i_max = 100
>>> p = population(p_count, i_length, i_min, i_max)
>>> fitness_history = [grade(p, target),]
>>> for i in xrange(100):
...     p = evolve(p, target)
...     fitness_history.append(grade(p, target))
...
>>> for datum in fitness_history:
...     print datum
...
The final result is:
['76.06', '32.13', '26.34', '18.32', '15.08', '11.69', '14.05', '9.460', '4.950', '0.0', '0.0', '0.0', '0.0', '0.0', '0.800', '0.0', '0.239', '0.780', '0.0', '0.0', '0.0', '0.0', '1.48', '0.0', '0.0', '0.0', '0.0', '0.0', '0.0', '0.149', '0.239', '0.12', '0.0', '0.0', '0.0', '0.0', '0.0', '0.0', '0.0', '0.149', '0.0', '0.0', '0.0', '0.0', '0.0', '0.0', '4.200', '0.0', '2.049', '0.0', '0.200', '0.080', '0.0', '1.360', '0.0', '0.0', '0.0', '0.0', '1.399', '0.0', '0.0', '0.149', '1.389', '1.24', '0.0', '0.16', '0.0', '0.680', '0.0', '0.0', '1.78', '1.05', '0.0', '0.0', '0.0', '0.0', '1.860', '4.080', '3.009', '0.140', '0.0', '0.38', '0.0', '0.0', '0.0', '0.0', '0.0', '2.189', '0.0', '0.0', '3.200', '1.919', '0.0', '0.0', '4.950', '0.0', '0.0', '0.0', '0.0', '0.0', '0.0']
With a survival rate of 20% and a mutation rate of 1%, it took only 9 generations to reach a nearly perfect result.
"""
# Example usage
from genetic import *
target = 371
p_count = 100
i_length = 6
i_min = 0
i_max = 100
p = population(p_count, i_length, i_min, i_max)
fitness_history = [grade(p, target),]
for i in xrange(100):
    p = evolve(p, target)
    fitness_history.append(grade(p, target))
for datum in fitness_history:
    print datum
"""
from random import randint, random
from operator import add

def individual(length, min, max):
    'Create a member of the population.'
    return [ randint(min,max) for x in xrange(length) ]

def population(count, length, min, max):
    """
    Create a number of individuals (i.e. a population).
    count: the number of individuals in the population
    length: the number of values per individual
    min: the minimum possible value in an individual's list of values
    max: the maximum possible value in an individual's list of values
    """
    return [ individual(length, min, max) for x in xrange(count) ]

def fitness(individual, target):
    """
    Determine the fitness of an individual. Lower is better.
    individual: the individual to evaluate
    target: the target number individuals are aiming for
    """
    sum = reduce(add, individual, 0)
    return abs(target-sum)

def grade(pop, target):
    'Find average fitness for a population.'
    summed = reduce(add, (fitness(x, target) for x in pop))
    return summed / (len(pop) * 1.0)

def evolve(pop, target, retain=0.2, random_select=0.05, mutate=0.01):
    graded = [ (fitness(x, target), x) for x in pop]
    graded = [ x[1] for x in sorted(graded)]
    retain_length = int(len(graded)*retain)
    parents = graded[:retain_length]
    # randomly add other individuals to
    # promote genetic diversity
    for individual in graded[retain_length:]:
        if random_select > random():
            parents.append(individual)
    # mutate some individuals
    for individual in parents:
        if mutate > random():
            pos_to_mutate = randint(0, len(individual)-1)
            # this mutation is not ideal, because it
            # restricts the range of possible values,
            # but the function is unaware of the min/max
            # values used to create the individuals,
            individual[pos_to_mutate] = randint(
                min(individual), max(individual))
    # crossover parents to create children
    parents_length = len(parents)
    desired_length = len(pop) - parents_length
    children = []
    while len(children) < desired_length:
        male = randint(0, parents_length-1)
        female = randint(0, parents_length-1)
        if male != female:
            male = parents[male]
            female = parents[female]
            half = len(male) / 2
            child = male[:half] + female[half:]
            children.append(child)
    parents.extend(children)
    return parents
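The listing above is Python 2 (`xrange`, built-in `reduce`, the `print` statement, integer `/`). For anyone who wants to run the experiment on Python 3, here is a condensed port. It is my own adaptation and differs in small ways: it uses a seeded `Random` instance so runs are repeatable, `sum` instead of `reduce`, and `sample` to pick two distinct parents instead of retrying.

```python
from random import Random

rng = Random(0)  # fixed seed: an addition for this sketch, not in the original

def individual(length, lo, hi):
    return [rng.randint(lo, hi) for _ in range(length)]

def population(count, length, lo, hi):
    return [individual(length, lo, hi) for _ in range(count)]

def fitness(ind, target):
    return abs(target - sum(ind))  # lower is better

def grade(pop, target):
    return sum(fitness(ind, target) for ind in pop) / len(pop)

def evolve(pop, target, retain=0.2, random_select=0.05, mutate=0.01):
    graded = sorted(pop, key=lambda ind: fitness(ind, target))
    retain_length = int(len(graded) * retain)
    parents = graded[:retain_length]
    # keep a few weaker individuals for genetic diversity
    for ind in graded[retain_length:]:
        if random_select > rng.random():
            parents.append(ind)
    # mutate some parents in place
    for ind in parents:
        if mutate > rng.random():
            pos = rng.randint(0, len(ind) - 1)
            ind[pos] = rng.randint(min(ind), max(ind))
    # crossover: two distinct parents produce one child
    children = []
    while len(children) < len(pop) - len(parents):
        male, female = rng.sample(range(len(parents)), 2)
        half = len(parents[male]) // 2
        children.append(parents[male][:half] + parents[female][half:])
    return parents + children

target = 371
p = population(100, 5, 0, 100)
fitness_history = [grade(p, target)]
for _ in range(100):
    p = evolve(p, target)
    fitness_history.append(grade(p, target))
print(fitness_history[:10])
```

The exact numbers depend on the seed, so they will not match the history printed earlier, but the average distance to the target should drop sharply within the first few generations.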
print '10X98765432'[reduce(lambda x,y: x+y, map(lambda x:int(x[0])*x[1], zip('51000018880101001'[0:17], [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2]) )) % 11]
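Want to generate a well-formed 18-digit ID number for registering with an anti-addiction system? Try the one-liner Python script above. In 51000018880101001, the first 6 digits are the region code, the next 8 are the date of birth, and the last three are arbitrary, with an odd final digit for male and an even one for female. The 18th character is produced by the Python code above. Some buggy validators will even accept impossible dates such as 19880431. Hehe.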
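Unrolled into readable Python 3, the check-character calculation from the one-liner looks like this. The weights and the lookup string are taken from the code above; the function names and the validation helper are my own additions.

```python
WEIGHTS = [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2]
CHECK_CHARS = "10X98765432"

def check_char(first17):
    """Return the 18th character for a 17-digit prefix."""
    if len(first17) != 17 or not (first17.isascii() and first17.isdigit()):
        raise ValueError("expected exactly 17 digits")
    # Weighted sum of the digits, reduced modulo 11, then mapped via the table
    total = sum(int(d) * w for d, w in zip(first17, WEIGHTS))
    return CHECK_CHARS[total % 11]

def has_valid_check_char(id18):
    """Verify only the check character, not the region code or the date."""
    return (len(id18) == 18
            and id18[:17].isascii() and id18[:17].isdigit()
            and check_char(id18[:17]) == id18[17].upper())

print(check_char("51000018880101001"))
```

For the sample prefix the weighted sum is 142, and 142 % 11 == 10 selects the last character of the table, '2'. As the remark about 19880431 suggests, the check character says nothing about whether the date itself is real.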