cutesunshineriver

Life is short, use python.

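A while back I worked through the book "Head First Python" to pick up some Python, and recently I tried it out on a real task. For certain scenarios it really is a pleasure to use.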

I have a 100 MB CSV file (pipe-delimited) with roughly 1.2 million rows.
A single record looks like this:
211.136.115.11|www.baidu.com|20130228105956|119.75.218.77;119.75.217.56|0
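As a quick sketch of how such a record breaks into fields: the post does not document what each field means, so the helper name `split_record` and the interpretation of the fourth field as a ';'-separated list of IP addresses are assumptions made purely for illustration. Only the second field (the domain) matters for the counting task.

```python
import csv
import io

# The sample record shown above. The field meanings are not documented in
# the post, so the interpretations in the comments below are guesses.
SAMPLE = "211.136.115.11|www.baidu.com|20130228105956|119.75.218.77;119.75.217.56|0"


def split_record(text):
    """Split one '|'-delimited record into a list of fields (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text), delimiter="|")
    return next(reader)


fields = split_record(SAMPLE)
domain = fields[1]              # second field: the domain being counted
answers = fields[3].split(";")  # fourth field: looks like ';'-separated IPs (assumption)
print(len(fields), domain, answers)
```

The record splits into five fields, and `fields[1]` is the value the script tallies.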
I want to find the 10 most frequent domains in this file.
Here is the Python code in all its glory:
# coding=utf-8
import csv
import datetime

filename = "E:/220_01_20130228100039.txt"
rownum = 0  # current line number in the file
rank = 10  # how many top domains to report
result = dict()  # maps domain -> number of occurrences

start = datetime.datetime.now()
with open(filename, 'r', newline='', encoding="utf-8", errors='ignore') as csvfile:  # 'rU' mode was removed in Python 3.11
    reader = csv.reader(csvfile, delimiter="|")
    for line in reader:
        rownum += 1

        if len(line) < 1:  # blank line
            continue
        elif len(line) < 2:  # malformed record without a domain field
            print("Bad line number: " + str(rownum))
            continue
        elif line[1] not in result:  # first time this domain is seen
            result[line[1]] = 1
        else:
            result[line[1]] += 1
print("~~~~~~~~~~~~~~~~~~~~~~~~")
print("Number of domains: " + str(len(result)))
count = rank
result = sorted(result.items(), key=lambda d: d[1], reverse=True)  # most frequent first
for item in result:
    count -= 1
    print("Rank " + str(rank - count) + ": " + str(item))
    if count == 0:
        break
end = datetime.datetime.now()
print("Elapsed: " + str(end - start))

Done in 35 lines of code. In Java, I would guess it would take around 60.
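For what it's worth, the same tally can probably be written more compactly with the standard library's `collections.Counter`. The following is only a sketch, not the code the numbers in this post came from: the function name `top_domains` and the tiny in-memory sample are made up for illustration, and unlike the script above it silently skips malformed rows instead of printing their line numbers.

```python
import csv
import io
from collections import Counter


def top_domains(lines, n=10):
    """Return the n most frequent values of the second '|'-separated field."""
    counts = Counter()
    for row in csv.reader(lines, delimiter="|"):
        if len(row) >= 2:  # skip blank and malformed rows
            counts[row[1]] += 1
    return counts.most_common(n)


# Tiny made-up sample standing in for the real 1.2-million-row file.
sample = io.StringIO(
    "1.1.1.1|www.baidu.com|20130228105956|119.75.218.77|0\n"
    "\n"
    "1.1.1.2|www.qq.com|20130228105957|10.0.0.1|0\n"
    "1.1.1.3|www.baidu.com|20130228105958|119.75.217.56|0\n"
)
for position, (domain, hits) in enumerate(top_domains(sample, n=2), start=1):
    print(position, domain, hits)
```

To run it against a real file, pass an open file object instead of the sample, for example `open(filename, newline='', encoding="utf-8", errors='ignore')`.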

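Out of more than 50,000 distinct domains, the top ten break down as follows: 3 are QQ domains, 2 are Google domains, 2 are Baidu domains, 1 is an Apple domain, 1 is Sina Weibo, and, surprisingly, 1 belongs to Flurry, the mobile advertising service.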

When I have some free time, I might try writing this in shell.