Merged
32 commits
f29ed27  Add files via upload  (liupengyuan, Nov 13, 2017)
c08faaa  Create readme.md  (liupengyuan, Nov 13, 2017)
a40c76b  Add files via upload  (liupengyuan, Nov 13, 2017)
eec9e63  Add files via upload  (liupengyuan, Nov 13, 2017)
2301fee  Update 8.md  (liupengyuan, Nov 14, 2017)
d7df626  Update 8.md  (liupengyuan, Nov 14, 2017)
57d9a7e  Update 6.md  (liupengyuan, Nov 14, 2017)
bc274b8  Update 6.md  (liupengyuan, Nov 14, 2017)
cfd3ba8  Update 6.md  (liupengyuan, Nov 14, 2017)
43723be  Update 6.md  (liupengyuan, Nov 14, 2017)
4cc7666  Create 1  (zhushucheng, Nov 15, 2017)
2a0160a  Add files via upload  (liupengyuan, Nov 16, 2017)
a084777  Merge pull request #621 from zhushucheng/master  (liupengyuan, Nov 17, 2017)
950acab  Update 9.md  (liupengyuan, Nov 20, 2017)
0975f96  Update 9.md  (liupengyuan, Nov 20, 2017)
bb1d38c  Update 7.md  (liupengyuan, Nov 20, 2017)
6027106  Update 6.md  (liupengyuan, Nov 20, 2017)
8cf1d30  Add files via upload  (liupengyuan, Nov 23, 2017)
3c9def6  Update python爬虫入门.ipynb  (liupengyuan, Nov 23, 2017)
1fb421d  Update 7.md  (liupengyuan, Nov 23, 2017)
9fccec3  Add files via upload  (liupengyuan, Nov 25, 2017)
427b288  Add files via upload  (liupengyuan, Nov 25, 2017)
a9327ad  Create readme.md  (liupengyuan, Nov 25, 2017)
b3d24b4  Add files via upload  (liupengyuan, Nov 25, 2017)
569d28d  Add files via upload  (liupengyuan, Nov 25, 2017)
a509377  Add files via upload  (liupengyuan, Nov 25, 2017)
3345006  Add files via upload  (liupengyuan, Nov 25, 2017)
5c0b213  Add files via upload  (liupengyuan, Nov 25, 2017)
e4afac8  Delete python正则表达式基础快速教程.ipynb  (liupengyuan, Nov 27, 2017)
bfc8937  Create 基于电影评分的可视化分析  (maluyaoMoMo, Dec 10, 2017)
0d3cae4  Merge pull request #622 from maluyaoMoMo/patch-1  (liupengyuan, Dec 11, 2017)
1ebb443  Update 8.md  (liupengyuan, Dec 12, 2017)
35 changes: 12 additions & 23 deletions chapter2/6.md
@@ -18,26 +18,14 @@ while i < n:

```python
 # Forward-order word output, example 2: assume exactly 5 words will be entered
-word1 = None
-word2 = None
-word3 = None
-word4 = None
-word5 = None

-i = 0
-while i < 5:
-    word = input('Enter a word, then press Enter: ')
-    if word1 == None:
-        word1 = word
-    elif word2 == None:
-        word2 = word
-    elif word3 == None:
-        word3 = word
-    elif word4 == None:
-        word4 = word
-    else:
-        word5 = word
-    i += 1

+word1 = input('Enter a word, then press Enter: ')
+word2 = input('Enter a word, then press Enter: ')
+word3 = input('Enter a word, then press Enter: ')
+word4 = input('Enter a word, then press Enter: ')
+word5 = input('Enter a word, then press Enter: ')


print(word5)
print(word4)
@@ -157,6 +145,7 @@ while i < n:
     words.append(word)
     i += 1

+i = len(words)  # This line could in fact be omitted (readers can work out why), but omitting it is not recommended because it hurts readability
 while i > 0:
     i -= 1
     print(words[i])
@@ -410,8 +399,8 @@ for i in range(500):
outside_xs.append(x)
outside_ys.append(y)
 # Plot the points
-p.circle(inside_x, inside_y, size=3, color = 'red')  # circle draws circles; x, y are the coordinates, size is the marker size, color is the color
-p.circle(outside_x, outside_y, size=3, color = 'blue')
+p.circle(inside_xs, inside_ys, size=3, color = 'red')  # circle draws circles; x, y are the coordinates, size is the marker size, color is the color
+p.circle(outside_xs, outside_ys, size=3, color = 'blue')

 # Show the result
show(p)
@@ -680,8 +669,8 @@ for i in range(10):

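6.8 Exercises
- Redo the while-loop exercises from the previous chapters using for loops, writing them as functions wherever possible.
- Write a function that returns the sum of all the numbers in a list.
- Write a function that returns the minimum value in a list.
- Write a function that returns the maximum, minimum, and mean of a list (without using the built-in sum function), using [1,2,-1,55,100,899,-10,3,12.5,5.8] as an example.
- Write a function that returns the position of a given element/object in a list, or -1 if it is not in the list.
- Write a function that interleaves two lists of equal length to produce a new list. For example, given two lists a=[1,2,3,4] and b=[5,6,7,8], it produces [1,5,2,6,3,7,4,8].
- Write a function that computes the cosine of the angle between two vectors, with the vectors stored in lists. Call the function from the main program.
- Challenge exercise: to motivate students to learn Python, the Python instructor bought 100 identical Macbook Pros out of pocket and distributed them among three classes, with each class receiving at least 5. Use exhaustive enumeration to count how many distributions are possible.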
4 changes: 2 additions & 2 deletions chapter2/7.md
@@ -174,7 +174,7 @@ for i in range(x-1, -1, -1):
# Example code 9
line = '北京语言大学信息科学学院'
x = 4
-print(line[0:x] + line[x-1:0:-1] + line[0])
+print(line[0:x] + line[x-1::-1])   # print(line[0:x] + line[x-1:0:-1] + line[0])
```

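In this code, an operation of the form `sequence[m:n:i]` is called **sequence slicing**: it takes from the sequence all objects whose indices lie in [m,n), stepping by i, where i defaults to 1.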
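The slicing rule can be checked with a short sketch. It reuses the sample string and `x` from the example; the names `forward`, `backward`, and `truncated` are introduced here purely for illustration. With a negative step the stop index is still excluded, which is why the old expression had to append `line[0]` separately.

```python
# Sequence slicing: sequence[m:n:i] takes the indices in [m, n) with step i.
line = '北京语言大学信息科学学院'
x = 4

forward = line[0:x]           # indices 0, 1, 2, 3
backward = line[x-1::-1]      # indices 3, 2, 1, 0 (stop omitted, so index 0 is included)
truncated = line[x-1:0:-1]    # indices 3, 2, 1 (the stop index 0 is excluded)

print(forward + backward)
print(truncated)
```

Because `truncated` stops one character short, `truncated + line[0]` and `backward` give the same result; the second form is simply shorter.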
@@ -260,7 +260,7 @@ print(numbers)
numbers = tuple()
print(numbers)

-numbers = (1)
+numbers = (1)   # a number, not a tuple
print(numbers)

numbers = (1,)
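The comment added above can be verified directly. The following is a minimal sketch (the names `as_number` and `as_tuple` are hypothetical, not from the chapter) showing that parentheses alone only group an expression, while the trailing comma is what creates a one-element tuple.

```python
# Parentheses alone only group an expression; the trailing comma makes the tuple.
as_number = (1)
as_tuple = (1,)

print(type(as_number).__name__)                # int
print(type(as_tuple).__name__, len(as_tuple))  # tuple 1
```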
12 changes: 6 additions & 6 deletions chapter2/8.md
@@ -240,7 +240,7 @@ def get_ch_table(line):
# Main program
fh = open(r'd:\temp\idioms_correct.txt')
text = fh.read()
-chs = get_ch_table(text.replace(r'\n', ''))
+chs = get_ch_table(text.replace('\n', ''))

print(len(chs), chs)
```
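The difference between the two `replace` calls comes down to raw strings. A minimal sketch, assuming a made-up `sample` string in place of the real file contents: `r'\n'` is the two characters backslash and `n`, which never occur in text read from the file, so the old call removed nothing.

```python
sample = 'ab\ncd\n'                       # hypothetical file contents with real newlines

unchanged = sample.replace(r'\n', '')     # raw string: searches for backslash + 'n', finds none
cleaned = sample.replace('\n', '')        # searches for the newline character itself

print(len(r'\n'), len('\n'))              # 2 1
print(repr(unchanged))                    # 'ab\ncd\n'
print(repr(cleaned))                      # 'abcd'
```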
@@ -278,7 +278,7 @@ idiom = '千钧一发' # suppose this 'idiom' was the one drawn

fh = open(r'd:\temp\idioms_correct.txt')
text = fh.read()
-chs = get_ch_table(text.replace(r'\n', ''))
+chs = get_ch_table(text.replace('\n', ''))

guess_ch_table = [ch for ch in idiom]
while len(guess_ch_table) < 6:
@@ -319,7 +319,7 @@ poems = fh.read().split()
fh.close()

for guess in guesses:
-    if guess in poems:
+    if ''.join(guess) in poems:
         print('The answer is:', guess)
```
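Why the `''.join(guess)` is needed can be seen in a small sketch. It assumes that each `guess` is a tuple of characters, as `itertools.permutations` would produce; the diff does not show how `guesses` is built, so that part, the `poems` list, and the `characters` string are all hypothetical.

```python
from itertools import permutations

poems = ['床前明月光', '疑是地上霜']        # hypothetical list of verse lines
characters = '月光床明前'                   # hypothetical scrambled characters

# Assumption: each guess is a tuple of single characters.
guesses = permutations(characters, 5)

answers = []
for guess in guesses:
    # A tuple such as ('床', '前', ...) never equals a string, so join it first.
    if ''.join(guess) in poems:
        answers.append(''.join(guess))

print(answers)
```

Testing `guess in poems` directly would compare a tuple against strings and never match, which is the behaviour the one-line change corrects.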

@@ -337,9 +337,9 @@ for guess in guesses:
```python
 # Robot that turns selected characters into a line of verse

-def find_poem_sentence(poems, characters):
-    same_character_number = 0
+def find_poem_sentence(poems, characters):
     for poem in poems:
+        same_character_number = 0
         for ch in poem:
             if ch in characters:
                 same_character_number += 1
@@ -445,7 +445,7 @@ def idiom_robot(file_name):
     text = fh.read()
     idioms = text.split()
     idiom = random.choice(idioms)
-    chs = get_ch_table(text.replace(r'\n', ''))
+    chs = get_ch_table(text.replace('\n', ''))

     guess_ch_table = [ch for ch in idiom]
     while len(guess_ch_table) < 6:
2 changes: 1 addition & 1 deletion chapter2/9.md
@@ -10,7 +10,7 @@

9.1 Extracting specified lines of text as an experimental corpus

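-Please download the text file: `http://pan.baidu.com/s/1c1RukqW password: 3c32`, extract it, and save it to the directory `d:\temp\` under the file name `语料.txt`. Extraction takes a long time, so please be patient (fortunately it only has to be done once).
+Please download the text file: `http://pan.baidu.com/s/1c1RukqW password: 3c32`, extract it, and save it to the directory `d:\temp\` under the file name `语料.txt`. Extraction takes a long time, so please be patient (fortunately it only has to be done once). (If the download is slow, you can temporarily download http://yunpan.blcu.edu.cn:80/link/FC61CB2B791A1999439CEC52C1A30CE2 instead as a small test file of just over 100 KB.)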
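You can open `语料.txt` in a text editor to inspect the data format, but opening this 10 GB text file is slow. Instead, open PowerShell and type `Get-Content d:\temp\语料.txt -totalcount 10` to view the first 10 lines of the file.
For a single large text file (e.g. 10 GB), consider first saving its first n lines (e.g. n=5000) as a small file and running the statistics on that small file; once everything works, process the large file.
Of course, most of the time the statistics task will involve many files, possibly a fairly large corpus (tens of GB or more). In that case it is likewise best not to program against the full corpus directly, but to copy a few corpus files of the same format into a temporary directory and run the statistical experiments there. This is faster to run, and errors in the program are also easier to track down.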
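Saving the first n lines of a large file can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `save_first_lines` and the output file name in the usage note are made up here, and the `utf-8` default encoding is an assumption that must be changed to match the actual corpus.

```python
from itertools import islice

def save_first_lines(src_path, dst_path, n, encoding='utf-8'):
    """Copy the first n lines of src_path to dst_path without loading the whole file."""
    with open(src_path, encoding=encoding) as src, \
         open(dst_path, 'w', encoding=encoding) as dst:
        written = 0
        for line in islice(src, n):   # reads lazily and stops after n lines
            dst.write(line)
            written += 1
    return written                    # may be less than n if the file is shorter
```

A call such as `save_first_lines(r'd:\temp\语料.txt', r'd:\temp\语料_5000.txt', 5000)` would then produce the small trial file; because the source is read line by line, the size of the original file does not matter.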