Saturday, August 13, 2016

Development environment

Working through Exercise 13-1 (No. 2916) from Chapter 13 (Case Study: Data Structure Selection) of Think Python (Allen B. Downey, O'Reilly Media).

Exercise 13-1 (No. 2916)

Code (Emacs)

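#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import requests
import string


def get_words(filename):
    """Read a file and return its words, lowercased and without punctuation."""
    words = []
    # Read as UTF-8 so the result does not depend on the locale's default.
    with open(filename, encoding='utf-8') as f:
        for line in f:
            # Turn every ASCII punctuation character into a space so that
            # tokens such as "http://www" split into separate words.
            for ch in string.punctuation:
                line = line.replace(ch, ' ')
            # Split on whitespace, strip any leftover whitespace or
            # punctuation from each word, and lowercase it.
            words.extend(
                word.strip(string.whitespace + string.punctuation).lower()
                for word in line.split()
            )
    return words


if __name__ == '__main__':
    # Download the page and save its raw HTML as the input file.
    response = requests.get('https://sitekamimura.blogspot.jp')
    response.raise_for_status()
    filename = 'words.txt'
    with open(filename, 'w', encoding='utf-8') as f:
        print(response.text, file=f)
    words = get_words(filename)
    print(words[:10])
    print(len(words))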
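
The same cleanup can probably also be done in one pass per line with a translation table instead of looping over every punctuation character. The following is only a minimal sketch under that assumption: `split_words` and `PUNCT_TO_SPACE` are hypothetical names that do not appear in the script above, and the function works on a string rather than a file so it can be tried without downloading anything.

```python
import string

# Translation table mapping each ASCII punctuation character to a space.
# (PUNCT_TO_SPACE is a hypothetical name used only for this sketch.)
PUNCT_TO_SPACE = str.maketrans(
    string.punctuation, ' ' * len(string.punctuation)
)


def split_words(line):
    """Return the lowercase words of line, treating punctuation as spaces."""
    # translate() replaces all punctuation in a single pass;
    # split() then drops the surrounding whitespace.
    return [word.lower() for word in line.translate(PUNCT_TO_SPACE).split()]


if __name__ == '__main__':
    print(split_words("Hello, World! It's"))
```

Since an apostrophe counts as punctuation here, a word like "It's" is split into 'it' and 's', which should match what the replace-based loop does.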

Input/output (Terminal, IPython)

$ ./sample1.py
['doctype', 'html', 'html', 'class', 'v2', 'dir', 'ltr', 'xmlns', 'http', 'www']
49388
$
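
The first ten words in this output ('doctype', 'html', 'class', ...) come from the HTML markup itself, because the raw response is saved and split as-is. The exercise does not ask for anything more, but if only the text nodes of the page were wanted, one possible approach is to run the markup through the standard library's `html.parser` first. This is a rough sketch, not part of the original script: `TextExtractor` and `visible_text` are hypothetical names, and skipping `<script>` and `<style>` contents is an assumption about what should count as text.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside script/style

    def handle_starttag(self, tag, attrs):
        if tag in ('script', 'style'):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ('script', 'style') and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_text(markup):
    """Return the text nodes of markup joined by single spaces."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return ' '.join(parser.chunks)


if __name__ == '__main__':
    sample = '<p>Hello, <b>World</b>!</p><script>var x = 1;</script>'
    print(visible_text(sample).split())
```

The returned text could then be written to words.txt in place of the raw HTML, so that the word list reflects the page text rather than tag and attribute names.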
