
Introduction to Computing

— Computer Culture and Programming

(计算概论 — 计算机文化、程序设计)

by Hongfei Yan and Chong Chen (闫宏飞、陈翀)

2015/9/16

About This Book

This book is compiled from a number of textbooks and reference materials. It gives a fairly systematic introduction to computer culture and to programming, combining the two organically (the former takes up one third of the book, the latter two thirds) so that theory meets practice. The aim is for students to understand and master the basic concepts and principles of computers and information technology, to gain a big-picture view of the discipline of computing, to learn to process information with a computer, and to become proficient in programming in Python, laying a foundation for later courses. The material is clearly layered and progresses from the simple to the advanced, so it is useful both for study and for practical work.

The book is intended as a teaching reference and technical resource for first- and second-year undergraduates of all majors, and should also be of considerable value to researchers and developers working with computers.

Starting from the fall semester of 2015, the programming part of the course is taught in Python. The second half of this book, on C++ programming, has therefore been removed and awaits updating; readers interested in C++ programming can consult earlier editions of these notes.

Preface

Introduction to Computing is a basic computing course offered to lower-year science and engineering students at universities. The first third of the course covers computer culture; the remaining two thirds cover programming.

Having taught the course for two years without finding a suitable textbook, we compiled this book from various texts and reference materials, drawing on our teaching experience.

The editors

Yanyuan, Peking University, January 2009

Contents

Preface

Introduction to Computing

Chapter 1  Introduction
1.1  Computer Science
1.2  Moore's Law
1.3  Scope of Problems
1.4  The Power of Computer Science to Shape the Future

Part I  Computer Culture

Chapter 2  Computer Systems
2.1  Computer Introduction
2.1.1  The Turing Model
2.1.2  The von Neumann Model
2.1.3  Computer Components
2.1.4  History
2.2  A Tour of Computer Systems
2.2.1  Information is Bits + Context
2.2.2  Programs Are Translated by Other Programs into Different Forms
2.2.3  It Pays to Understand How Compilation Systems Work
2.2.4  Processors Read and Interpret Instructions Stored in Memory
2.2.5  Caches Matter
2.2.6  Storage Devices Form a Hierarchy
2.2.7  The Operating System Manages the Hardware
2.2.8  Systems Communicate With Other Systems Using Networks
2.2.9  Important Themes
2.2.10  Summary

Chapter 3  Representing Data and Numbers
3.1  Representing Data
3.1.1  Types of Data
3.1.2  Data Inside the Computer
3.1.3  Representing Data
3.1.4  Hexadecimal Notation
3.1.5  Octal Notation

Chapter 4  Programming Languages and Development Environments
4.1  Programming Languages
4.1.8  Practice Set
4.2  Development Environments (to be updated)

Part II  Programming

Chapter 5  Python Basics
5.1  Structure of a Program
5.2  Variables and Data Types
5.3  Constants

Chapter 6  Variables: A Deeper Look
6.1  Memory Organization
6.2  Variable Scope

Chapter 7  Algorithms
7.1  The Role of Algorithms in Computing
7.1.1  Algorithms
7.1.2  Algorithms as a Technology
7.2  The Concept of an Algorithm
7.3  The Three Basic Control Structures of Algorithms
7.4  Representing Algorithms
7.5  Some Basic Algorithms
7.6  Iteration and Recursion

Chapter 8  Programming
8.1  Simple Computation Problems
8.2  Simulation
8.3  Problems That Can Be Modeled
8.4  Dynamic Programming

Introduction (Beginner)
Elementary
Intermediate
Upper-Intermediate
Advanced

References

Chapter 1  Introduction

There is no precise record of when the term "computer culture" appeared and became widely accepted, but it was roughly in the late 1980s. The computer began as a device, grew into a discipline, and then developed into a "culture"; the scale of its influence on humanity is truly astonishing. Computer literacy means understanding what a computer is and how it can be used as a resource. Put simply, it is not just knowing how to use a computer; more importantly, it is knowing when to use one.

In today's world almost every profession is bound up with computers, but only certain occupations and disciplines study in depth how computers themselves are built, programmed, and used. The meanings of the terms describing the different research areas within computing keep shifting, and new disciplines keep emerging. The five major disciplines of computing are:

· Computer Engineering, a branch of electrical engineering, studies computer hardware and software and the connections between the two.

· Computer Science is the traditional name for the academic study of computers. It focuses mainly on computing techniques and on efficient algorithms for performing particular tasks. The discipline settles for us whether a problem is solvable within the realm of computing, how efficiently it can be solved if so, and how to build still more efficient programs. By now computer science has spawned many branches, each studying a different class of problems in depth.

· Software Engineering concentrates on methodologies and practices for developing high-quality software systems, and tries to contain and predict development costs and schedules.

· Information Systems studies the application of computers in broad organizational settings.

· Information Technology refers to computer-related management and maintenance.

The course Introduction to Computing is concerned with the discipline of computing. The larger organizations devoted to computer science include the Association for Computing Machinery (ACM) and the Institute of Electrical and Electronics Engineers (IEEE).

1.1 Computer Science

Computer science is a systematic discipline embracing a wide variety of topics related to computation and information processing, from the abstract, such as the analysis of algorithms and formal grammars, to the more concrete, such as programming languages, program design, software, and hardware. As a discipline it differs markedly from mathematics, computer programming, software engineering, and computer engineering, although it is often confused with them, and the fields overlap to varying degrees.

The questions computer science studies include:

· what computer programs can and cannot do (computability);

· how to make programs perform specific tasks more efficiently (algorithms and complexity theory);

· how programs store and retrieve different kinds of data (data structures and databases);

· how programs can appear more intelligent (artificial intelligence);

· how humans communicate with programs (human-computer interaction and user interfaces).

Most research in computer science is based on the "von Neumann computer" and the "Turing machine", the models of computation on which the great majority of real machines rest. As the founding statement of this line of work, the Church-Turing thesis holds that, although they may differ in time and space efficiency, all existing computing devices are equal in computational power. Although the thesis is generally regarded as a foundation of computer science, scientists also study other kinds of machines, such as parallel computers in practice and probabilistic computers, oracle machines, and quantum computers in theory. In this sense the computer is merely a tool for computation; as the famous computer scientist Dijkstra put it, "Computer science is no more about computers than astronomy is about telescopes."

Computer science is rooted in electrical engineering, mathematics, and linguistics, a synthesis of science, engineering, and art. It rose to become an independent discipline in the last three decades of the 20th century and developed its own methods and terminology.

Early on, although the University of Cambridge and other British universities had begun teaching computer science courses, it was regarded only as a branch of mathematics or engineering, not an independent discipline. Cambridge claims the world's first qualification in teaching computing. The world's first computer science department was established by Purdue University in the United States in 1962, and the first college of computer science by Northeastern University in 1980. Today most universities treat computer science as an independent department, while some combine it with engineering, applied mathematics, or other disciplines.

The highest honor in computer science is the Turing Award, established by the ACM and celebrated as the Nobel Prize of computer science. Its winners are among the field's most outstanding scientists and pioneers. The first Chinese winner was Andrew Chi-Chih Yao, who received this supreme honor in 2000 for his many "fundamental and significant" contributions to the theory of computation.

1.2 摩尔定律

http://en.wikipedia.org/wiki/Moore%27s_Law

Moore's law describes a long-term trend in the history of computing hardware. Since the invention of the integrated circuit in 1958, the number of transistors that can be placed inexpensively on an integrated circuit has increased exponentially, doubling approximately every two years. The trend was first observed by Intel co-founder Gordon E. Moore in a 1965 paper. It has continued for almost half a century and is not expected to stop for another decade at least, and perhaps much longer.

Figure 1-1 CPU transistor counts 1971-2008 and Moore's law: growth of transistor counts for Intel processors (dots) against Moore's law (logarithmic vertical scale)

Almost every measure of the capabilities of digital electronic devices is linked to Moore's law: processing speed, memory capacity, even the number and size of pixels in digital cameras. All of these are improving at (roughly) exponential rates as well. This has dramatically increased the usefulness of digital electronics in nearly every segment of the world economy. Moore's law describes this driving force of technological and social change in the late 20th and early 21st centuries.

http://baike.baidu.com/view/17904.htm

The first law of computing: Moore's law.

Its main "versions" can be summarized in three forms:

· The number of circuits integrated on a chip doubles every 18 months.

· Microprocessor performance doubles every 18 months, while the price falls by half.

· The computer performance that one dollar buys quadruples every 18 months.


Figure 1-2 Computer Speedup

Moore's Law: "The density of transistors on a chip doubles every 18 months, for the same cost" (1965)

That is, the density (or capacity) of semiconductor integrated circuits doubles every 18 months.
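Taking the doubling period at face value, a few lines of Python (the language used in the programming part of this book) show the kind of growth the law implies. This is an illustrative sketch only: the Intel 4004 starting point and the two-year doubling period are assumptions drawn from the excerpt above, not a model of real products.

# Illustrative only: project transistor counts under an assumed fixed doubling period.
def transistors(initial, months, doubling_months=24.0):
    """Projected count after `months` months, doubling every `doubling_months`."""
    return initial * 2 ** (months / doubling_months)

# Intel 4004 (1971): about 2,300 transistors. Project forward to 2008.
print(f"{transistors(2300, (2008 - 1971) * 12):,.0f}")  # ~850 million, the right order of magnitude for 2008 CPUs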

Moore's Law is still valid. The law has nothing to do with the speed of the processor; it concerns the number of transistors, which is still doubling every couple of years. Case in point: there are now multiple cores in the same space instead of one.

Gordon Moore, one of the founders of the CPU maker Intel, proposed "Moore's law" in 1965 and co-founded Intel in 1968. Moore was born in San Francisco, California, in 1929. He took a bachelor's degree in chemistry at the University of California, Berkeley, and a doctorate in chemistry and physics at the California Institute of Technology (Caltech). In the mid-1950s he worked at Shockley Semiconductor together with Robert Noyce, co-inventor of the integrated circuit. Later, Noyce, Moore, and six colleagues resigned en masse to found Fairchild Semiconductor, famous in the history of the semiconductor industry; Fairchild was the parent of today's Intel and AMD. In 1968 Moore and Noyce left Fairchild and founded Intel. At first Intel devoted itself to data storage, a field the computer industry had not yet developed; later it made a strategic shift to specialize in the "heart" of the microcomputer, the CPU.

1.3 Scope of Problems

What can you do with 1 computer?

What can you do with 100 computers?

What can you do with an entire data center?

http://en.wikipedia.org/wiki/Distributed_computing#Projects

Projects:

A variety of distributed computing projects have grown up in recent years. Many are run on a volunteer basis, and involve users donating their unused computational power to work on interesting computational problems. Examples of such projects include the Stanford University Chemistry Department Folding@home project, which is focused on simulations of protein folding to find disease cures and to understand biophysical systems; World Community Grid, an effort to create the world's largest public computing grid to tackle scientific research projects that benefit humanity, run and funded by IBM; SETI@home, which is focused on analyzing radio-telescope data to find evidence of intelligent signals from space, hosted by the Space Sciences Laboratory at the University of California, Berkeley (the Berkeley Open Infrastructure for Network Computing (BOINC), was originally developed to support this project); LHC@home, which is used to help design and tune the Large Hadron Collider, hosted by CERN in Geneva; and distributed.net, which is focused on finding optimal Golomb rulers and breaking various cryptographic ciphers.

http://folding.stanford.edu/English/Main

http://zh.wikipedia.org/wiki/Folding@home

http://www.stanford.edu/group/pandegroup/images/FAH-May2008.png

http://www.equn.com/folding/

How does Folding@home work?

Folding@home is a distributed computing project that studies protein folding, misfolding, aggregation, and the diseases that arise from them. It uses networked computation and massive distributed computing power to simulate the process of protein folding and to guide current research into folding-related diseases.

Figure 1-3 Folding@home

Figure 1-4 Shrek © Dreamworks Animation: rendering multiple frames of high-quality animation

Happy Feet © Kingdom Feature Productions; Lord of the Rings © New Line Cinema

Figure 1-5 Simulating several hundred or thousand characters

Indexing the web (Google)

Google (www.google.com) is a search engine invented by two Stanford PhD students, Larry Page and Sergey Brin, in September 1998, the year Google Inc. was founded. Google's web search technology grew out of information retrieval. Its "cached page" feature serves stored copies of web pages directly from Google's own servers.

Simulating an Internet-sized network for networking experiments (PlanetLab)

http://www.planet-lab.org/

PlanetLab is a global research network that supports the development of new network services. Since the beginning of 2003, more than 1,000 researchers at top academic institutions and industrial research labs have used PlanetLab to develop new technologies for distributed storage, network mapping, peer-to-peer systems, distributed hash tables, and query processing. PlanetLab currently consists of 1128 nodes at 511 sites.

Speeding up content delivery (Akamai)

Akamai of the United States is the world's largest CDN service provider; the enormous delivery capacity of its network can reach 15 Gbps at peak. Akamai is one of a small number of newer companies dedicated to removing Internet bottlenecks and raising download speeds, a "content delivery" company devoted to speeding up network traffic and one of the most prominent young firms in Boston's high-tech district. It delivers Internet content, streaming media, and applications for enterprises worldwide (at the time this was written it managed more than 8,000 servers for businesses in 15 countries). The company was founded in 1998 by Daniel Lewin together with researchers from MIT; his MIT master's thesis formed the core of Akamai's original "FreeFlow" technology.

According to American Airlines, Daniel Lewin, aged 31, died aboard one of the hijacked aircraft that struck the World Trade Center in New York on September 11, 2001.

1.4 The Power of Computer Science to Shape the Future

From Big Data ("Big Data: Transforming Public Health"), by Viktor Mayer-Schönberger and Kenneth Cukier

A new flu virus appeared in 2009. Combining features of the viruses that cause bird flu and swine flu, the new strain, H1N1, spread rapidly within a few weeks. Public health agencies around the world feared a deadly pandemic was coming. Some commentators even warned of an outbreak on the scale of the 1918 Spanish flu, which affected half a billion people and killed tens of millions. Worse, no vaccine against the new virus had been developed. All public health experts could do was slow its spread, and to do that they first had to know where it had appeared.

In the United States, as in every other country, doctors are required to report new flu cases to the Centers for Disease Control and Prevention (CDC). But because people may be ill for days before they finally go to a doctor, and because relaying the information back to the CDC takes time, reports of new cases typically lag by one to two weeks. Moreover, the CDC tabulates the data only once a week. For a fast-spreading disease, a two-week information lag can be fatal; it left public health agencies effectively blind at the most critical stage of the outbreak.

A few weeks before the H1N1 outbreak, engineers at the Internet giant Google published a striking paper in Nature that startled public health officials and computer scientists alike. It explained how Google could predict the spread of winter flu, not only nationwide across the United States but down to specific regions and states. Google did it by examining what people searched for online, an approach that had previously been ignored. Google keeps years of search records and receives more than three billion queries worldwide every day, a data resource vast enough to support the work.

Having found that the terms people search for can indicate whether they have the flu, Google compared the 50 million search terms Americans used most frequently against CDC data on the spread of seasonal flu from 2003 to 2008. Other companies had tried to identify such correlated terms, but none had Google's volume of data, processing power, and statistical expertise. The Google engineers guessed that certain searches, such as "medicine for cough and fever", were aimed at flu information, but picking out those terms was not the point, and they did not know in advance which terms mattered most; crucially, the system they built did not depend on that kind of semantic understanding. It looked only at the correlation between how often particular terms were searched and the spread of flu over time and space. To test the candidate terms, Google processed a total of 450 million different mathematical models. After comparing the resulting predictions against actual flu cases recorded by the CDC in 2007 and 2008, they found a combination of 45 search terms that, used in a single mathematical model, produced predictions correlating with the official figures at 97 percent. Like the CDC, they could tell where the flu was spreading from, and they could do so in near real time, not one or two weeks after the outbreak.

So when H1N1 broke out in 2009, Google proved a more effective and timely indicator than the habitually delayed official statistics, and public health officials gained extremely valuable information. Remarkably, Google's method required no mouth swabs and no calls to doctors' offices: it was built on big data. This is a new capability unique to our time: analyzing vast amounts of data in unprecedented ways to obtain products and services of great value, or deep insights. With this technology and these data reserves, the world will have a better prediction tool with which to contain the next flu outbreak.

Data is restructuring business and traffic is rewriting the future; old ways of thinking are fading away, dissolving into data and code.

Jack Ma's trading platform alone turns over a trillion yuan a year, comparable to the GDP of 17 provinces; as a consequence, it is said, shops will sit unrented, and half of all bookstores, clothing shops, shoe shops, and boutiques will close.

China Unicom and China Mobile slept on, unable to believe that one Pony Ma, with a single application, WeChat, could within a few months all but wipe out the two giants' revenue from calls and text messages.

As of September 2013, Alibaba's many finance businesses already had the essential "core" of a bank: Yu'e Bao, Ali micro-loans, and Alipay had indirectly implemented the three core banking functions of deposits, loans, and remittance. The financial "disruptor" Alibaba was again showing its destructive power. Rumor had it that Alibaba's proposed Internet bank would be registered with one billion yuan of capital to provide small and micro financial services covering deposits, loans, and remittance, and that the application materials for the banking license had been submitted to the regulator. It was further rumored that on September 6, 2013 the central bank heard Alibaba's plan for an Internet merchant bank, to which Alibaba responded that the bank might be established in Wenzhou.

Internet banks are challenging traditional brick-and-mortar banks. Taobao, for example, uses sellers' transaction records to judge their creditworthiness and lend on that basis, extending loans to the 80 percent of borrowers in the long tail, something traditional banking finds very hard to do.

Part I  Computer Culture

The main purpose of Part I is to introduce the basic concepts and principles of computers and information technology, so that readers gain a big-picture understanding of the discipline of computing.

Chapter 2  Computer Systems

2.1 Computer Introduction

Most of this section is drawn from Chapter 1 of the book below. The material between horizontal rules was added by the editors.

Foundations of Computer Science, 2nd edition, by Behrouz Forouzan and Firouz Mosharraf, Cengage Learning Business Press, December 5, 2007

http://www.cengage.co.uk/forouzan/

The phrase computer science has a very broad meaning today. However, in this book, we define the phrase as "issues related to the computer". This introductory chapter first tries to find out what a computer is, and then investigates other issues directly related to computers. We look first at the Turing model as a mathematical and philosophical definition of computation. We then show how today's computers are based on the von Neumann model. The chapter ends with a brief history of this culture-changing device...the computer.

Objectives

After studying this chapter, the students should be able to:

· Define the Turing model of a computer.

· Define the von Neumann model of a computer.

· Describe the three components of a computer: hardware, data, and software.

· List topics related to computer hardware.

· List topics related to data.

· List topics related to software.

· Discuss some social and ethical issues related to the use of computers.

· Give a short history of computers.

2.1.1 TURING MODEL

The idea of a universal computational device was first described by Alan Turing in 1937. He proposed that all computation could be performed by a special kind of machine, now called a Turing machine. Although Turing presented a mathematical description of such a machine, he was more interested in the philosophical definition of computation than in building the actual machine. He based the model on the actions that people perform when involved in computation. He abstracted these actions into a model for a computational machine that has really changed the world.

Perceptual knowledge (感性认识)

The components of a computer (video):

http://net.pku.edu.cn/~course/cs101/2008/video/computer_components.flv

Introduction to Computer Hardware

http://net.pku.edu.cn/~course/cs101/2008/video/intro2computer_hardware.flv

Install http://net.pku.edu.cn/~course/cs101/2008/video/flvplayer_setup.exe if your computer cannot play the videos.

Figure 2-1 Motherboard (the board that carries the main components and adapters and interconnects them)

The motherboard (main board, system board) is the core component of a PC. It carries or connects the CPU, the chipset, the cache, the ROM BIOS chip, the CMOS chip, RAM, the buses, floppy and hard disk interfaces, serial and parallel ports, USB ports, expansion slots, the DC power connector, a rechargeable battery, and assorted cables.

In the figure, from top to bottom and left to right: memory modules and the data connectors for disks and optical drives; the CPU fan (the heat sink and CPU usually sit beneath it); the brown AGP slot, which takes only a graphics card; and the white PCI slots, which accept graphics, network, sound, and other cards.

Figure 2-2 CPU = ALU + control unit

Figure 2-3 Alan Turing, founder of computer science and artificial intelligence

http://www.builder.com.cn/2008/0331/788473.shtml

Turing was a mathematical and computer-science genius of a kind rarely seen, who lived only 42 years. His early death was as astonishing and as hard to believe as his overflowing talent. Though his life was short, his legendary career, rich creativity, and deep, brilliant thought make him shine like a star, lighting the way for those who explore the future in the vast space of science.

Since the 1960s, computer technology has advanced at tremendous speed, and the information industry has gradually become one of the industries with the greatest influence on human society. The theoretical foundation supporting that technology and industry is computer science. As everyone knows, the Nobel Prize is the world's most prestigious award, but it rewards only pioneering contributions in physics, chemistry, literature, medicine, economics, and the promotion of world peace. The Turing Award is the highest honor in computer science, known as the "Nobel Prize of computing". It was established both to advance computer science and to commemorate a genius mathematician and founder of the discipline, Alan Turing.

"Ultra": Britain's super-secret of the Second World War

http://news.163.com/07/0605/17/3G88T9VA00011T1U.html

In Bodyguard of Lies, the British journalist Anthony Cave Brown describes the decisive battle between the British and German armies in North Africa in the Second World War: "From the moment the battle of El Alamein began, the defeat of Rommel's army was sealed... Every major military move Rommel made was exposed by 'Ultra'. Rommel became the most hopeless of generals. Montgomery read, through 'Ultra', the whole series of cipher telegrams he sent to Hitler... and sometimes Montgomery saw Hitler's replies to Rommel even earlier than Rommel himself..."

The mysterious Bletchley Park

In a stretch of woodland outside London stands a remarkable estate: Bletchley Park. It is a Victorian mansion, yet oddly, around the ornate main house cluster a number of small huts that look entirely out of place. What sort of establishment was this?

It was the home of Britain's codebreaking organization. The huts had been thrown up in haste because the workload had grown and the mansion could no longer hold all the people and equipment.

Gathered on this secretive estate were many outstanding minds. They wore their hair long and dressed carelessly, in shabby tweed jackets or wrinkled corduroy trousers, and their behavior could seem eccentric. Among them were mathematicians and linguists, chess masters and crossword experts, electrical engineers and radio specialists, bank clerks and a museum curator.

The place was wrapped in secrecy: apart from those who worked there, only Britain's national leaders and the most senior intelligence officers could visit. Everyone else, however exalted their rank, was refused entry.

The staff had a single task: to use advanced machinery to break the enciphered telegrams of the German forces. The intelligence issued from here all carried one code name, "Ultra". "Ultra" meant intelligence from Bletchley Park.

It was Ultra from Bletchley Park that benefited Montgomery so greatly at El Alamein, serving as his able "assistant".

The Enigma cipher machine

To understand Ultra, we must go back several years before the Second World War, to a special cipher used by Nazi Germany.

After seizing power in Germany, the Nazis adopted a new military cipher unlike that used by any other country at the time, one produced by a machine. The machine was aptly named the "Enigma".

In June 1938, Colonel Menzies, deputy head of Britain's Secret Intelligence Service (MI6), received a report from Major Gibson, one of his agents in Eastern Europe: a Polish Jew who refused to give his real name had approached Gibson through the British embassy in Warsaw, claiming to have worked as a technician and theoretical engineer in the secret Berlin factory that built the Enigma machines. He had later been expelled from Germany because he was Jewish. He now offered to build Britain, from memory, a copy of the newest military Enigma; in return he asked for ten thousand pounds, British passports for himself and his relatives, and permission for them to live in France.

Colonel Menzies reported the offer to the British intelligence authorities. After a month of investigation and vetting, they concluded that the man's story was credible and decided to accept his terms.

The man was secretly moved to France, where British intelligence arranged a highly discreet residence and the conditions he needed to rebuild the machine. Working from memory, he soon produced a copy of the Enigma. In the words of Britain's cryptanalysts, "it was a perfect cipher machine, a miracle of reconstruction".

The reproduced Enigma looked much like an old-fashioned office typewriter. At the front was an ordinary keyboard, but above, where a typewriter's keys would strike, was a flat panel on which another letter glowed. When the operator pressed a key, say the letter "A", a different letter, say "P", lit up at the top of the machine. As the cipher clerk pressed "A", current flowed along a winding, intricate path through four rotors, struck a reflector, and returned through the rotors along a different path, lighting the letter "P". As the rotors stepped, the electrical path changed completely, and changing the rotors or the wiring meant generating entirely new sets of encodings.

A message enciphered in this way was transmitted to an operator who had an identical machine. He set his rotors and plugs to the same positions as the sending machine, typed in the ciphertext, and the process ran in reverse, accurately recovering the original text.

Because the machine's encoding was so complex, without an Enigma even the best mathematicians would have needed a very long time to break a message, and for a battlefield that changed by the hour such stale intelligence had lost much of its value.

Moreover, the Enigma's setting procedures were extremely complicated and changed frequently; without knowing the endlessly varying settings, even possession of a machine was of little use.

The copied machine did at first help the British greatly. But the good fortune did not last: only a year later, in the summer of 1939, the Germans built a newer, more complex machine, and British intelligence again had to find a way through a fresh riddle.

Just as the British were struggling with the new German machine, Polish military intelligence, for strategic reasons, handed over to the British the fruits of years of work, along with replica machines. To counter the German threat the Poles had begun studying the Nazi cipher machines very early, and their results had surpassed Britain's. Besides the Enigma replicas, they gave the British the "bomba" machines that could determine the key settings and unlock the cipher.

Less than a week after the Polish Enigma replicas and the bomba drawings reached England, the German army crossed the Polish border. When the news arrived at Bletchley Park, the experts stood in silence. The British cryptanalyst Knox walked slowly to the window, eyes wet, and murmured: "Poland, like a warrior who hands his sword to his allies before he falls. Magnificent!"

The codebreakers Knox and Turing

Building on the foundations laid by the inventive Poles, British intelligence began the final assault on the secrets of the German machine. Two key figures accelerated the unraveling of the Nazi enigma: one was Knox, the other Turing.

Knox was a tall, thin, middle-aged man with thick glasses, a mathematician. In the First World War he joined the Admiralty's cryptanalysis bureau and, together with other scholars, broke almost all of Germany's diplomatic and military codes of the time. Most famously, he broke the German navy's three-letter flag code in a flash of inspiration while taking a bath. After the war he stayed on in the cryptanalysis bureau under the Foreign Office's Government Code and Cypher School. Almost all British codebreakers of the day considered Knox a cipher expert of the first rank, a rare codebreaking genius.

Turing, Knox's assistant, was a stocky, solidly built young man. He had graduated from Cambridge, where the mathematical talent he displayed astonished the master and the staff and students of the mathematics department. This odd young man was forever full of marvelous ideas and designs. After entering the Government Code and Cypher School he devoted himself to developing machinery for the work, and there his genius found full expression.

Through the joint efforts of Knox and Turing, a "universal machine" was at last built. More than two meters tall and shaped rather like an old-fashioned keyhole, it was in fact one of the earliest mechanical data-processing machines, and with it Enigma traffic could be decrypted. As more data accumulated and the operators gained experience, the machine's efficiency grew and grew.

"Ultra" gives Britain the initiative

One bright, clear day in May 1940, Churchill, newly installed as Prime Minister, was busy in his office. Menzies, by now head of MI6, walked up to the Prime Minister's desk and handed him a slip of paper.

Churchill glanced at it: it gave details of German air force personnel movements and the allocation of supplies to German troops in Denmark. The intelligence was of no great value, and he tossed the slip onto the desk.

But when he looked up and saw Menzies standing before him, something suddenly dawned on him. He picked the paper up again, read it carefully, then raised his head and asked: "Is this it? 'Ultra'?"

Menzies stood there smiling and said nothing. No answer was needed; the delight he could not conceal said it all.

That small slip of paper meant something extraordinary: it was among the first Enigma intelligence that Bletchley had broken after years of effort. From that day on, Ultra was a trump card for Churchill and the Allies throughout the Second World War.

Ultra appeared just as the Battle of Britain was at its height, and the battle gave it a grand stage on which to show its power. The Royal Air Force pilots fighting over England did not know that Ultra supported them like a huge invisible arm. Often, no sooner had Goering issued an order than Bletchley intercepted and broke it and passed it to Fighter Command. Before the German aircraft had taken off from their bases in France, the British commanders knew how many were coming and what they would bomb, and could arrange their defenses accordingly.

Throughout the Second World War, Ultra was Britain's most secret, most important, and most reliable source of intelligence. To keep the channel safe, British intelligence imposed a series of extremely strict security measures from the very beginning. Bletchley Park was an absolutely secret place; apart from the war cabinet and a handful of senior military decision-makers, no one knew what went on inside. The war cabinet ruled that Ultra intelligence could be passed to British field commanders only by word of mouth and must never appear in writing on the battlefield, lest the Germans capture an Ultra document. Apart from a few senior generals, commanders did not know where the intelligence came from; they knew only that it was absolutely reliable.

In addition, to prevent the Germans from deducing, from the effectiveness of British countermeasures, that their cipher had been broken, all Ultra intelligence was disguised as coming from other sources: spies, German traitors, captured German documents, or carelessness by Nazi personnel.

The hundreds of specialists at Bletchley Park were unsung heroes in the truest sense. Almost none were professional soldiers, and rank, title, and power meant little to them. Yet out of patriotism, hatred of Nazi atrocities, and devotion to their work, they let slip not a word about Ultra, not only during the war but for thirty years afterwards. As Churchill put it, they were "the geese that laid the golden eggs and never cackled". Only when the British government declared the Ultra secrecy period over did they begin to tell people what they had done in those years.

In late October 1942, Rommel's supplies were running critically short; without prompt replenishment his forces could not hold out. Hitler pressed his staff to dispatch supply ships at once and informed Rommel by telegram. By the time Rommel received the telegram, Bletchley Park had already decoded it.

Clearly, if Rommel received those supplies he might be able to stand his ground, so the shipments had to be stopped at all costs. This time, however, the Germans sent five transports by different routes, and the sea was blanketed in fog; if all five ships were attacked, Ultra risked exposure.

Shortly after midnight on October 26, Winterbotham, who supervised Ultra security procedures, explained the dilemma to Churchill over a secure telephone line: which mattered more, defeating Rommel or protecting Ultra?

Churchill hesitated a long while before finally ordering the ships sunk. It was one of the few occasions in the war on which he willingly risked exposing Ultra.

Within an hour the Royal Air Force had its orders to sink the transports. At first light on the 27th, twenty British bombers took off from Luqa and Hal Far and, in the fog off Tobruk, caught up with the first supply ship, the Proserpina, which was under heavy escort. Six of the twenty aircraft were lost in the attack, but the Proserpina was sunk. Later, by the light of flares, RAF aircraft found the tanker Tripolino in the fog northwest of Tobruk and sank her too. Her companion, the tanker Ostia, was torpedoed at dawn on the 28th. At dawn the same day, RAF aircraft found the Zara a hundred kilometers north of Tobruk and sank her with torpedoes as well. Her consort, the Brioni, struggled into Tobruk, but was sunk by American aircraft before her petrol could be unloaded.

When Rommel learned of this he flew into a rage. Just as his troops had entered the fiercest fighting of the battle, the British had in a single night written off almost all his supplies. So extreme a coincidence aroused his deep suspicion, and he sent a long telegram to headquarters in Germany demanding an investigation of every possible source of the leak, to establish exactly how, in fog at sea, the British had found the transports. But to the end of the war the Germans never worked out where the problem lay.

--------

http://zhidao.baidu.com/link?url=1HK4__HXhXm1rsWwApxU_JRCg0tJBwTIJSLP376kXnW4aNsHbZUqGuULvA56ynmURmQl8MgFr7cx-xzFKzO6Q_

Among the world's great scientific awards, the Nobel Prize stands highest, and the honor it brings a scientist is supreme. Yet for more than a century people have kept asking: why is there no Nobel Prize in mathematics? The simple answer is that Nobel's will directed that the prizes go to those who made the most important discoveries in physics, chemistry, and physiology or medicine, to the author of outstanding literary work, and to those who contributed most to world peace. The answer is simple, but what led Nobel to decide not to reward mathematicians has itself become something of an unsolved puzzle.

Historians increasingly believe that Nobel's neglect of mathematics reflected his times and his view of science. Nobel ended his formal schooling at sixteen and never attended university, receiving afterwards only private tuition from the distinguished Russian organic chemist Zinin; indeed it was Zinin who, in 1855, drew Nobel's attention to nitroglycerin. Nobel was a quintessential 19th-century inventor of genius whose inventions seemed to flow from sharp intuition and extraordinary creativity rather than from any higher mathematics; his mathematical knowledge probably went no further than arithmetic and proportion. In the second half of the 19th century, moreover, chemical research generally did not require higher mathematics; mathematics entered chemistry only after Nobel's death. Nobel could hardly have foreseen or imagined the enormous role mathematics would play in driving science, so his failure to establish a mathematics prize is not hard to understand.

Some foreign scholars suggest the omission may be connected with a setback in Nobel's love life. Nobel had a companion thirteen years his junior, the Viennese woman Sophie Hess, and he later discovered that she had grown close to a mathematician. Nobel is said never to have gotten over her running off with the mathematician, and he remained a bachelor to the end of his life. Perhaps it was this that led him to leave mathematics out when he set down the statutes of the Nobel Foundation.

Although no one knows the exact reason Nobel established no mathematics prize, it is undeniable that, prize or no prize, the pace of mathematical research and development has never slackened since the 20th century began.

Counting the "Chinese" who have won the Nobel Prize: who says Mo Yan was the first Chinese laureate?

http://hi.baidu.com/world/item/01723b3e55141ec16c15e96a

"Chinese-American" (美籍华人) is an extremely comical term, coined, one suspects, out of a deep-seated inferiority complex. By the same logic, today's Americans would all be "English-Americans", "French-Americans", "Irish-Americans", "Vietnamese-Americans"...

The term "Chinese-American" gets used most often around the Nobel Prizes. So let us sort out which laureates were merely of Chinese descent and which were actually Chinese nationals.

1. First, several "pseudo-Chinese"

Samuel C. C. Ting (Physics, 1976): born in the United States and therefore a US citizen from birth.

Yuan T. Lee (Chemistry, 1986): naturalized in the United States in 1974 and renounced US citizenship in 1994; when he won he was an American.

Steven Chu (Physics, 1997): born in the United States, a US citizen from birth.

Daniel C. Tsui (Physics, 1998): the date of his naturalization is unclear, but it was certainly before 1998.

For these four, the Nobel Prize website lists USA. Calm down; this is nothing for Chinese people to be proud of.

2. Next, one "I am not Chinese"

After Roger Y. Tsien won his Nobel Prize, some mainland media kept stressing that "Qian Yongjian is Qian Xuesen's nephew". Only then did I learn that the Nobel Prize attaches weight to nephews; should we not, then, also emphasize the citizenship of every laureate's uncles, aunts, and grandfathers?

Some patriots rejoiced, taking Tsien to be "the pride of the Chinese nation". Vexingly, Tsien, having fully inherited the honesty and directness of the American people, said very seriously, "I am not Chinese, I am American", bitterly disappointing China's patriots, who all but accused him of forgetting his ancestors.

3. Then, one "used to be Chinese"

Gao Xingjian won the 2000 Nobel Prize in Literature for the novel Soul Mountain; the Nobel website lists France. He had become a French citizen in 1997.

Gao never enjoyed the "Chinese-American" treatment. When news of his prize arrived, some people at home, far from cheering, said his writing was too poor to deserve it. Most strange.

It got to the point where even the French could not watch any longer and asked why you Chinese were slandering our Frenchman.

Note that Gao won for fiction written in Chinese, something Chinese people might well have been "proud" of.

4. One dual national

Pearl S. Buck won the 1938 Nobel Prize in Literature for her novel The Good Earth.

At the time she held both American and Chinese citizenship. But in her acceptance speech she said plainly that she was receiving the prize "also for my country, the United States of America", so this "Chinese slot" cannot be claimed either.

We can still take some pride. Buck arrived in China when she was barely four months old and lived there for nearly forty years; one may say she left her best years in China. After her death, in keeping with her wishes, her gravestone was inscribed with only the three Chinese characters of her Chinese name, 赛珍珠.

Toward such a lady, who loved China, we could well feel proud. Yet histories of Chinese literature seldom mention her, just as they once erased Hu Shi, Lin Yutang, Liang Shiqiu, and many others.

5. One Tibetan

In 1989 the 14th Dalai Lama (Tenzin Gyatso) received the Nobel Peace Prize.

First a declaration: we firmly oppose the Dalai Lama's schemes to split the country. Now the analysis: he fled to India in 1959 and set up a Tibetan government-in-exile, but India only provides aid to the exiled Tibetans and does not grant them Indian citizenship, so the Dalai Lama's nationality has never changed.

The Nobel website lists Tibet; Westerners have always listed Tibet separately from China. But on China's consistent position that Tibet is part of China, the Dalai Lama is a Chinese Nobel laureate. Whether to take pride in that is another matter.

6. Two Chinese from Taiwan Province

In 1957 Tsung-Dao Lee and Chen-Ning Yang won the Nobel Prize in Physics. They became US citizens in 1962 and 1964 respectively, so when they won they were without question Chinese, and the Nobel website lists China.

Why would the ever-proud Chinese deliberately pass up this chance for pride? Because Lee and Yang were nationals of the Republic of China and, as the argument went, had been trained in Taiwan; to boast of them then would be to boost Taiwan's spirits and deflate the mainland's.

7. One Chinese from the mainland

In 2012 Mo Yan won the Nobel Prize in Literature. The whole nation rejoiced!

After all these years of waiting, finally a laureate of impeccably home-grown origins!

So here is the tally: if we accept that Taiwan and Tibet are both parts of China, then we must say that four Chinese have won Nobel Prizes.

--------

http://zh.wikipedia.org/wiki/图灵

Turing is regarded as the father of computer science. He entered King's College, Cambridge, in 1931 and after graduating went to Princeton University for his doctorate. When the Second World War broke out he returned to Cambridge, and he later helped the military break Germany's famous Enigma cipher system, helping the Allies to victory.

Turing made many contributions to the development of artificial intelligence. In his paper "Computing Machinery and Intelligence", which opens with the question "Can machines think?", he proposed an experimental method for judging whether a machine possesses intelligence, now called the Turing test. Competitions based on the test are still held every year.

In addition, Turing's famous Turing machine model laid the foundation for the logical working of the modern computer.

http://net.pku.edu.cn/~course/cs101/2008/video/alan_turing.flv

A short video describing the life and unfortunate death of Alan Turing.

http://zh.wikipedia.org/wiki/姚期智

Andrew Chi-Chih Yao (姚期智) is a Chinese-American computer scientist and the winner of the 2000 Turing Award, so far the only Chinese, and the only Asian, recipient of the prize. He is currently a professor at the Institute for Theoretical Computer Science at Tsinghua University.

The Association for Computing Machinery (ACM) awarded him that year's Turing Award for his many contributions to the theory of computation, including pseudorandom number generation, cryptography, and communication complexity.

http://zhidao.baidu.com/question/580709714.html

From 2000 to 2010, a decade passed in the blink of an eye. In the first spring of the 21st century, the 2000 Turing Award was conferred on the Chinese computer scientist Andrew Yao. The news was exhilarating: for the first time in its 35 years, the award known as "the Nobel Prize of the computing world" had gone to a scholar of Asian descent, and Yao was the first Chinese to receive the honor.

Yao was born in Shanghai and grew up in Taiwan, his first twenty years steeped in traditional Chinese culture. In 1967, at 21, he graduated from National Taiwan University and went to Harvard to study physics, taking his PhD in 1972. Yao loved scientific research deeply and was drawn along by strong curiosity: "I rather like novel things; when a new direction appears I like to go and look, and give it a try." While doing postdoctoral research at the University of California, Yao saw that the young field of computer science had the power to shape the society of the future, and he keenly sensed the enormous room this important new discipline had to grow. So the young man made one of the most important decisions of his life: to abandon the physics he had studied for eight years and turn to computer science. Two years later he duly received a PhD in computer science from the University of Illinois.

With a keen scientific mind, Yao kept exploring new territory, teaching and researching at MIT, Stanford, and UC Berkeley. From 1986 to 2004 he was a professor of computer science at Princeton University and became a top scientist in the theory of computation. Yao is the kind of scientist who treats hard work as the highest pleasure. He holds that research has a beauty of its own: the "happiest moment", when a creative idea bursts forth, is a researcher's greatest joy.

His nearly one hundred papers revolutionized our understanding of how information can be stored efficiently. As one of the world's foremost theorists, Yao won honor after honor in pseudorandom number generation, cryptography, communication complexity, and other fields. He is a member of the US National Academy of Sciences and the American Academy of Arts and Sciences, a foreign member of the Chinese Academy of Sciences, and an academician of Taiwan's Academia Sinica. He received SIAM's George Pólya Prize and the inaugural Knuth Prize, named for the master algorithm designer Donald Knuth, and the ACM awarded him the 2000 Turing Award.

Life, like a waltz, always circles back to the homeland one's heart is tied to. After many years abroad, Yao still cared about China; reckoning that China's "Turing road" was only a third traveled, he hoped "to do what little I can for China and my compatriots". In 2004 he decided to return to mainland China at 57 and open a new stage for his research. He resolutely resigned his tenured professorship at Princeton, sold his house in America, and joined Tsinghua University's Institute for Advanced Study as a full professor. "What I have learned now has the chance to take root in China, where I was born, and I have the conditions to train world-class researchers in this field for China. I find that profoundly meaningful."

Only a year and a half after arriving at Tsinghua, Yao launched the "software science experimental class", whose ambition is to train international leaders in computer science. What he prized most was Tsinghua's many excellent, high-potential students. "One purpose of my return is that within a short time, in China, at least in my own field, we can create a first-rate research environment," he said firmly. "We want to build a 'superhighway' of computing, so that from undergraduates through graduate students to professors, working in China offers better opportunities, and more honor, than anywhere else in the world."

In just a few years, Yao's Tsinghua team achieved a good deal in theoretical computer science. Besides publishing China's first papers in frontier international venues such as the Proceedings of the National Academy of Sciences, Tsinghua's computer science department had three papers accepted at FOCS 2006, the field's top conference, a first for mainland scholars, and one of the three won the FOCS 2006 best paper award.

On March 29, 2007, Yao founded the Institute for Theoretical Computer Science at Tsinghua University, setting out from Tsinghua to build up China's research community in theoretical computer science and to make its mark internationally. As Yao put it with feeling: "At home, the discipline I specialize in is still developing rather slowly. Yet we have so many talented people; to be able to teach them this subject and guide them in this direction is the happiest of things." Principal achievement: the 2000 Turing Award.

Data processors

Figure 1.1 A single-purpose computing machine

Before discussing the Turing model, let us define a computer as a data processor. Using this definition, a computer acts as a black box that accepts input data, processes the data, and creates output data (Figure 1.1). Although this model can define the functionality of a computer today, it is too general. In this model, a pocket calculator is also a computer (which it is, in a literal sense).

Another problem with this model is that it does not specify the type of processing, or whether more than one type of processing is possible. In other words, it is not clear how many types or sets of operations a machine based on this model can perform. Is it a specific-purpose machine or a general-purpose machine?

This model could represent a specific-purpose computer (or processor) that is designed to do a single job, such as controlling the temperature of a building or controlling the fuel usage in a car. However, computers, as the term is used today, are general-purpose machines. They can do many different types of tasks. This implies that we need to change this model into the Turing model to be able to reflect the actual computers of today.

Programmable data processors

The Turing model is a better model for a general-purpose computer. This model adds an extra element to the specific computing machine: the program. A program is a set of instructions that tells the computer what to do with data. Figure 1.2 shows the Turing model.

In the Turing model, the output data depends on the combination of two factors: the input data and the program. With the same input, we can generate different outputs if we change the program. Similarly, with the same program, we can generate different outputs if we change the input data. Finally, if the input data and the program remain the same, the output should be the same. Let us look at three cases.

Figure 1.2 A computer based on the Turing model: programmable data processor

Figure 1.3 shows the same sorting program with different input data. Although the program is the same, the outputs are different, because different input data is processed.

Figure 1.3 The same program, different data

Figure 1.4 shows the same input data with different programs. Each program makes the computer perform different operations on the input data. The first program sorts the data, the second adds the data, and the third finds the smallest number.

Figure 1.4 The same input, different program

We expect the same result each time if both input data and the program are the same, of course. In other words, when the same program is run with the same input data, we expect the same output.
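The point is easy to demonstrate in Python (the language taught in the programming part of this book). The sketch below is illustrative rather than from the original text: run() plays the role of the programmable data processor, and the three one-line programs stand in for the sort, add, and find-smallest programs of Figures 1.3 and 1.4.

# A programmable data processor: the output depends on program AND data.
def run(program, data):
    return program(data)

sort_prog = sorted                       # "sort the data"
add_prog = sum                           # "add the data"
min_prog = min                           # "find the smallest number"

print(run(sort_prog, [3, 12, 8, 22]))    # same program, different data ...
print(run(sort_prog, [14, 6, 8, 12]))    # ... gives different outputs
print(run(add_prog, [14, 6, 8, 12]))     # same data, different programs ...
print(run(min_prog, [14, 6, 8, 12]))     # ... also gives different outputs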

--------

A bank may place copies of an account database in two different cities, say New York and San Francisco.

A query is always forwarded to the nearest copy.

Assume a customer in San Francisco wants to add $100 to his account, which currently contains $1000.

At the same time, a bank employee in New York initiates an update by which the customer’s account is to be increased with 1 percent interest.

Problem: we sometimes need to guarantee that concurrent updates on a replicated database are seen in the same order everywhere:

• P1 adds $100 to an account (initial value: $1000)

• P2 increments account by 1%

• There are two replicas

Result, in the absence of proper synchronization: replica #1 ← $1111, while replica #2 ← $1110.
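The two results follow directly from the order of the updates, as this illustrative Python fragment (not part of the original example) shows:

# The same two updates, applied in different orders on the two replicas.
deposit = lambda balance: balance + 100      # P1: add $100
interest = lambda balance: balance * 1.01    # P2: add 1% interest

replica1 = interest(deposit(1000))   # deposit first:  (1000 + 100) * 1.01
replica2 = deposit(interest(1000))   # interest first: 1000 * 1.01 + 100
print(replica1, replica2)            # 1111.0 1110.0 -- the replicas disagree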

http://duanple.blog.163.com/blog/static/709717672011440267333/

Implementing the state machine model

A simple way to implement a distributed system is to have a set of clients send commands to a central server. The server can be viewed as a deterministic state machine that executes client commands in some order. The state machine has a current state; given an input command it produces an output and a new state. For example, the clients of a distributed banking system might be tellers, and the state-machine state would consist of all customers' account balances. A withdrawal is implemented by a state-machine command that decreases an account's balance (if and only if the balance is at least the amount withdrawn), producing the old and new balances as output.

A system built around a single central server fails when that server fails. We therefore replace it with a set of servers, each of which independently implements the same state machine. Because the state machine is deterministic, if all the servers execute the same sequence of commands, they produce the same states and outputs, and a client that issues a command can use the output any server produces for it.
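A minimal Python sketch of such a deterministic state machine follows; the class and the command format are invented for illustration. Because apply() is deterministic, replicas fed the same command log necessarily end in the same state.

# A deterministic state machine: identical command logs yield identical states.
class BankStateMachine:
    def __init__(self):
        self.balances = {}

    def apply(self, command):
        op, account, amount = command
        old = self.balances.get(account, 0)
        if op == "deposit":
            self.balances[account] = old + amount
        elif op == "withdraw" and old >= amount:   # only if funds suffice
            self.balances[account] = old - amount
        return old, self.balances.get(account, 0)  # output: old and new balance

log = [("deposit", "alice", 1000), ("withdraw", "alice", 300)]
server1, server2 = BankStateMachine(), BankStateMachine()
for cmd in log:          # every server executes the same commands in the same order,
    server1.apply(cmd)
    server2.apply(cmd)
print(server1.balances == server2.balances)   # so the replicas agree: True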

--------

The universal Turing machine

A universal Turing machine, a machine that can do any computation if the appropriate program is provided, was the first description of a modern computer. It can be proved that a very powerful computer and a universal Turing machine can compute the same thing. We need only provide the data and the program -- the description of how to do the computation -- to either machine. In fact, a universal Turing machine is capable of computing anything that is computable.
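A toy simulator makes the idea concrete: the machine below is driven entirely by a rule table mapping (state, symbol) to (symbol to write, head move, next state), so swapping the table (the "program") changes what the same machine computes. The particular table here, which simply inverts a string of bits, is a made-up Python example.

# A toy Turing machine; `rules` maps (state, symbol) -> (write, move, new state).
def run_tm(tape, rules, state="start", blank=" "):
    tape, head = list(tape), 0
    while state != "halt":
        symbol = tape[head] if head < len(tape) else blank
        write, move, state = rules[(state, symbol)]
        if head < len(tape):
            tape[head] = write
        else:
            tape.append(write)
        head += 1 if move == "R" else -1
    return "".join(tape).strip()

invert = {                                  # a "program": flip every bit
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", " "): (" ", "R", "halt"),     # hit blank: write blank, halt
}
print(run_tm("10110", invert))  # 01001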

A computer is a machine that manipulates data according to a list of instructions.

2.1.2 VON NEUMANN MODEL

http://baike.baidu.com/view/46087.htm

John von Neumann (1903-1957), the "father of the modern electronic computer", was a Hungarian-American physicist, mathematician, and inventor, designer of the EDVAC, the world's first general-purpose computer in the modern sense. He was born on December 28, 1903 in Budapest, Hungary; his father was a banker, the family was well off, and great attention was paid to the children's education. Von Neumann was precocious and widely interested, and read with an extraordinary memory: at six, it is said, he could converse with his father in ancient Greek, and over his lifetime he mastered seven languages. German was his strongest, yet when thinking through ideas in German he could translate them into English at reading speed. He could quickly repeat the contents of books and papers he had read word for word, and could still do so years later. From 1911 to 1921, at the Lutheran Gymnasium in Budapest, von Neumann showed outstanding promise and was highly regarded by his teachers; under the individual guidance of his teacher Fekete he co-authored and published his first mathematics paper before he turned 18. From 1921 to 1923 he studied at the Swiss Federal Institute of Technology in Zurich, and in 1926, at only 22, he received his doctorate in mathematics from the University of Budapest with distinction. From 1927 to 1929 he was a lecturer in mathematics at the University of Berlin and the University of Hamburg. In 1930 he accepted a visiting professorship at Princeton University and crossed to the United States; in 1931, not yet 30, he became one of Princeton's first tenured professors, and in 1933 he moved to the Institute for Advanced Study as one of its first six professors, where he worked for the rest of his life. Von Neumann held honorary doctorates from Princeton, the University of Pennsylvania, Harvard, Istanbul University, the University of Maryland, Columbia, and the Technische Hochschule in Munich, among others. He was a member of the US National Academy of Sciences, Peru's national academy of natural sciences, and Italy's Accademia Nazionale dei Lincei. In 1954 he was appointed to the US Atomic Energy Commission, and from 1951 to 1953 he served as president of the American Mathematical Society.

In the summer of 1954 von Neumann was found to have cancer. He died in Washington on February 8, 1957, at the age of 53.

Put simply, the essence of his contribution lies in two ideas: binary representation and the stored program.

Computers built on the Turing universal machine store data in their memory. Around 1944-1945, John von Neumann proposed that, since program and data are logically the same, programs should also be stored in the memory of a computer.

Four subsystems

Computers built on the von Neumann model divide the computer hardware into four subsystems: memory, arithmetic logic unit, control unit, and input/output (Figure 1.5).

Figure 1.5 von Neumann model

Memory is the storage area. This is where programs and data are stored during processing. We discuss the reasons for storing programs and data later in the chapter.

The arithmetic logic unit (ALU) is where calculation and logical operations take place. For a computer to act as a data processor, it must be able to do arithmetic operations on data (such as adding a list of numbers). It should also be able to do logical operations on data.

The control unit controls the operations of the memory, ALU, and the input/output subsystems.

The input subsystem accepts input data and the program from outside the computer, while the output subsystem sends results of processing to the outside world. The definition of the input/output subsystem is very broad: it also includes secondary storage devices such as disk or tape that store data and programs for processing. When a disk stores data that results from processing, it is considered an output device; when data is read from the disk, it is considered an input device.

The stored program concept

The von Neumann model states that the program must be stored in memory. This is totally different from the architecture of early computers, in which only the data was stored in memory: the programs for their tasks were implemented by manipulating a set of switches or by changing the wiring system.

The memory of modern computers hosts both a program and its corresponding data. This implies that both the data and programs should have the same format, because they are stored in memory. In fact, they are stored as binary patterns in memory -- a sequence of 0s and 1s.

Sequential execution of instructions

A program in the von Neumann model is made of a finite number of instructions. In this model, the control unit fetches one instruction from memory, decodes it, and then executes it. In other words, the instructions are executed one after another. Of course, one instruction may request the control unit to jump to some previous or following instructions, but this does not mean that the instructions are not executed sequentially. Sequential execution of a program was the initial requirement of a computer based on the von Neumann model. Today's computers execute programs in the order that is most efficient.
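The following Python sketch caricatures this model: a single memory holds both the instructions and the data, and the control loop fetches, decodes, and executes one instruction at a time. The five-instruction machine language is invented for illustration.

# A toy stored-program machine: program and data share one memory.
memory = [
    ("LOAD", 6),     # 0: acc = memory[6]
    ("ADD", 7),      # 1: acc = acc + memory[7]
    ("STORE", 8),    # 2: memory[8] = acc
    ("PRINT", 8),    # 3: print memory[8]
    ("HALT", None),  # 4: stop
    None,            # 5: unused
    40, 2, 0,        # 6-8: the data lives in the same memory as the program
]

pc, acc = 0, 0
while True:
    op, arg = memory[pc]                       # fetch
    pc += 1                                    # default: sequential execution
    if op == "LOAD":    acc = memory[arg]      # decode and execute
    elif op == "ADD":   acc += memory[arg]
    elif op == "STORE": memory[arg] = acc
    elif op == "PRINT": print(memory[arg])     # prints 42
    elif op == "HALT":  break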

2.1.3 Computer components

We can think of a computer as being made up of three components: computer hardware, data, and computer software.

Computer hardware

Computer hardware today has four components under the von Neumann model, although we can have different types of memory, different types of input/output subsystems, and so on.

Data

The von Neumann model clearly defines a computer as a data processing machine that accepts the input data, processes it, and outputs the result.

· Storing data

The von Neumann model does not define how data must be stored in a computer. If a computer is an electronic device, the best way to store data is in the form of an electrical signal, specifically its presence or absence. This implies that a computer can store data in one of two states.

Obviously, the data we use in daily life is not just in one of two states. For example, our numbering system uses digits that can take one of ten states (0 to 9). We cannot (as yet) store this type of information in a computer; it needs to be changed to another system that uses only two states (0 and 1). We also need to be able to process other types of data (text, image, audio, and video). These also cannot be stored in a computer directly, but need to be changed to the appropriate form (0s and 1s).
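A couple of lines of Python make the reduction visible; this snippet is illustrative only.

# Everything a computer stores must first become a pattern of 0s and 1s.
print(bin(65))                   # '0b1000001': the integer 65 as bits
print(format(ord("A"), "08b"))   # '01000001': the character 'A', the same pattern
print("hi".encode("ascii"))      # b'hi': text becomes a sequence of bytes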

In Chapter 3, we will learn how to store different types of data as a binary pattern, a sequence of 0s and 1s. In Chapter 4, we show how data is manipulated, as a binary pattern, inside a computer.

· Organizing data

Although data should be stored in only one form inside a computer, a binary pattern, data outside a computer can take many forms. In addition, computers (and the notion of data processing) have created a new field of study known as data organization, which asks the question: can we organize our data into different entities and formats before storing it inside a computer? Today, data is not treated as a flat sequence of information. Instead, data is organized into small units, small units are organized into larger units, and so on. We will look at data from this point of view in Chapters 11-14.

Computer software

Computer software is a general term used to describe a collection of computer programs, procedures, and documentation that perform some tasks on a computer system. The term includes application software, such as word processors, which performs productive tasks for users; system software, such as operating systems, which interfaces with hardware to provide the necessary services for application software; and middleware, which controls and coordinates distributed systems. Software also includes websites, programs, video games, and so on, written in programming languages such as C and C++.

The main feature of the Turing or von Neumann models is the concept of the program. Although early computers did not store the program in the computer’s memory, they did use the concept of programs. Programming those early computers meant changing the wiring systems or turning a set of switches on or off. Programming was therefore a task done by an operator or engineer before the actual data processing began.

· Programs must be stored

In the von Neumann model programs are stored in the computer’s memory. Not only do we need memory to hold data, but we also need memory to hold the program (Figure 1.6).

Figure 1.6 Program and data in memory

· A sequence of instructions

Another requirement of the model is that the program must consist of a sequence of instructions. Each instruction operates on one or more data items. Thus, an instruction can change the effect of a previous instruction. For example, Figure 1.7 shows a program that inputs two numbers, adds them, and prints the result. This program consists of four individual instructions.

Figure 1.7 A program made of instructions
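In Python, the program of Figure 1.7 comes down to four instructions; the prompts below are added for illustration.

# Figure 1.7 as Python: four instructions, each operating on data items.
a = int(input("first number: "))    # 1. input the first number
b = int(input("second number: "))   # 2. input the second number
total = a + b                       # 3. add them
print(total)                        # 4. print the result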

We might ask why a program must be composed of instructions. The answer is reusability. Today, computers do millions of tasks. If the program for each task was an independent entity without anything in common with other programs, programming would be difficult. The Turing and von Neumann models make programming easier by defining the different instructions that can be used by computers. A programmer can then combine these instructions to make any number of programs. Each program can be a different combination of different instructions.

· Algorithms

The requirement for a program to consist of a sequence of instructions made programming possible, but it brought another dimension to using a computer. A programmer must not only learn the task performed by each instruction, but also learn how to combine these instructions to do a particular task. Looking at this issue differently, a programmer must first solve the problem in a step-by-step manner, then try to find the appropriate instruction (or series of instructions) to implement those steps. This step-by-step solution is called an algorithm. Algorithms play a very important role in computer science and are discussed in Chapter 8.
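For instance, the step-by-step solution for "find the largest number in a list" can be written down first and only then turned into instructions. The Python below is an illustrative sketch, not an excerpt from Chapter 8.

# Algorithm: 1. take the first number as the largest seen so far;
#            2. compare every remaining number against it;
#            3. whenever one is bigger, remember that one instead;
#            4. when the numbers run out, the remembered one is the answer.
def largest(numbers):
    best = numbers[0]
    for n in numbers[1:]:
        if n > best:
            best = n
    return best

print(largest([17, 4, 23, 8]))  # 23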

· Languages

At the beginning of the computer age there was only one computer language, machine language. Programmers wrote instructions (using binary patterns) to solve a problem. However, as programs became larger, writing long programs using these patterns became tedious. Computer scientists came up with the idea of using symbols to represent binary patterns, just as people use symbols (words) for commands in daily life. Of course, the symbols used in daily life are different than those used in computers. So the concept of computer languages was born. A natural language such as English is rich and has many rules to combine words correctly: a computer language, on the other hand, has a more limited number of symbols and also a limited number of words. We will study computer languages in Chapter 9.

· Software engineering

Something that was not defined in the von Neumann model is software engineering, which is the design and writing of structured programs. Today it is not acceptable just to write a program that does a task: the program must follow strict rules and principles. We discuss these principles, collectively known as software engineering, in Chapter 10.

· Operating systems

During the evolution of computers, scientists became aware that there was a series of instructions common to all programs. For example, instructions to tell a computer where to receive data and where to send data are needed by almost all programs. It is more efficient to write these instructions only once for the use of all programs. Thus the concept of the operating system emerged. An operating system originally worked as a manager to facilitate access to the computer’s components by a program, although today operating systems do much more. We will learn about them in Chapter 7.

2.1.4 History

In this section we briefly review the history of computing and computers. We divide this history into three periods.

Mechanical machines (before 1930)

During this period, several computing machines were invented that bear little resemblance to the modern concept of a computer.

Figure 2-2 A geared adding machine

In 1645, the French mathematician and philosopher Blaise Pascal invented the Pascaline, a geared mechanical calculator for addition and subtraction operations. In the 20th century, when Niklaus Wirth invented a structured programming language, he called it Pascal to honor the inventor of the first mechanical calculator.

In 1673, the German mathematician Gottfried Leibniz invented a more sophisticated mechanical calculator that could do multiplication and division as well as addition and subtraction. It was called the Leibniz Wheel.

The first machine that used the idea of storage and programming was the Jacquard loom, invented by Joseph-Marie Jacquard at the beginning of the 19th century. The loom used punched cards (like a stored program) to control the raising of the warp threads in the manufacture of textiles.

The first computer in the modern sense

In the early 1820s, the British mathematician Charles Babbage designed the Difference Engine, the first machine that could carry out mathematical computation automatically; limited by the technology of his day, it was never completed, and Babbage is honored as a "father of the computer". The Difference Engine could do more than simple arithmetic operations: it could evaluate polynomials too. Later, Babbage designed a machine called the Analytical Engine that, to some extent, parallels the idea of modern computers. It had four components: a mill (corresponding to a modern ALU), a store (memory), an operator (control unit), and output (input/output).

Figure 2-3 The Difference Engine designed by the British mathematician Charles Babbage

In 1890, Herman Hollerith, working at the US Census Bureau, designed and built a programmable machine that could automatically read, tally, and sort data stored on punched cards.

The birth of electronic computers (1930–1950)

Between 1930 and 1950, several computers were invented by scientists who could be considered the pioneers of the electronic computer industry.

· Early electronic computers

The early electronic computers of this period did not store the program in memory—all were programmed externally. Five computers were prominent during these years: ABC, Z1, Mark I, Colossus, and ENIAC.

The first large general-purpose electronic digital computer

In 1945, ENIAC (Electronic Numerical Integrator and Computer) was born at the University of Pennsylvania. ENIAC used nearly 18,000 vacuum tubes, weighed 30 tons, consumed 150 kW of power, was 30 m long, 1 m wide, and 2.4 m high, and could perform 5,000 additions per second.

Figure 2-4 ENIAC

Computers based on the von Neumann model

The first computer based on von Neumann’s ideas was made in 1950 at the University of Pennsylvania and was called EDVAC. At the same time, a similar computer called EDSAC was built by Maurice Wilkes at Cambridge University in England.

Toward the modern computer

Alan Turing (1912-1954) proposed the Turing machine in 1936 as a graduate student, laying the theoretical foundation of the computer.

ACM Turing Award: the "Nobel Prize of computing"

John von Neumann (1903-1957) published a paper in 1946 on representing logical operations with numbers; the von Neumann architecture has been adopted by virtually all modern computers.

Computer generations (1950–present)

Computers built after 1950 more or less follow the von Neumann model. They have become faster, smaller, and cheaper, but the principle is almost the same. Historians divide this period into generations, with each generation witnessing some major change in hardware or software (but not in the model).

The first generation (roughly 1950–1959) is characterized by the emergence of commercial computers.

Second-generation computers (roughly 1959–1965) used transistors instead of vacuum tubes. Two high-level programming languages, FORTRAN and COBOL, were invented, which made programming easier. These two languages separated the programming task from the computer operation task. A civil engineer, for example, could write a FORTRAN program to solve a problem without being involved in the electronic details of computer architecture.

Third generation

The invention of the integrated circuit reduced the cost and size of computers even further. Minicomputers appeared on the market. Canned programs, popularly known as software packages, became available. This generation lasted roughly from 1965 to 1975.

The fourth generation (approximately 1975–1985) saw the appearance of microcomputers. The first desktop computer, the Altair 8800, became available in 1975. This generation also saw the emergence of computer networks.

Fifth generation

This open-ended generation started in 1985. It has witnessed the appearance of laptop and palmtop computers, improvements in secondary storage media (CD-ROM, DVD and so on), the use of multimedia, and the phenomenon of virtual reality.

Computers based on the von Neumann model have improved mainly in hardware and software rather than in the model itself, as Table 2-1 shows. The first generation, vacuum-tube computers, began in the late 1940s; the second generation, transistor computers, in the late 1950s; the third generation, integrated-circuit computers, in the mid-1960s; and the fourth generation, microprocessor computers, in the early 1970s.

In 2008, China's first supercomputer exceeding one hundred trillion operations per second, the Dawning 5000A, rolled off the line at the Dawning industrial base in Tianjin's high-tech zone, making China the second country after the United States able to build hundred-teraflops-class high-performance computers on its own. Its speed exceeds 160 trillion operations per second, its memory exceeds 100 TB, and its storage exceeds 700 TB.

Performance: a peak speed of 230 trillion floating-point operations per second (230 TFLOPS); 7.5 TFLOPS per cabinet at 20 kW per cabinet, so a hundred-teraflops system needs only about 14 cabinets and roughly 15 square meters of floor space.

Table 2-1 The modern von Neumann machine

Capability: it can compute, within 30 seconds, 200 securities indices over ten years of trading data for the more than 1,000 stocks of the Shanghai Stock Exchange. Within 3 minutes it can simultaneously complete four 36-hour weather forecasts needed for the 2008 Olympics, covering China's periphery, most of northern China, the area around Beijing, and Beijing itself, including wind direction, wind speed, temperature, and humidity at 1 km resolution, precise to each Olympic venue.

Figure 2-5 The Dawning 5000A

2.2 A Tour of Computer Systems

This section is taken from the chapter "A Tour of Computer Systems" in the book below. The material between horizontal rules was added by the editors.

Computer Systems: A Programmer's Perspective (CS:APP)

by Randal E. Bryant and David R. O'Hallaron, Prentice Hall, 2003

http://csapp.cs.cmu.edu/

The Chinese edition is available at:

http://net.pku.edu.cn/~course/cs101/2008/resource/CSAP_cn.pdf

A computer system consists of hardware and systems software that work together to run application programs. Specific implementations of systems change over time, but the underlying concepts do not. All computer systems have similar hardware and software components that perform similar functions. This book is written for programmers who want to get better at their craft by understanding how these components work and how they affect the correctness and performance of their programs.

You are poised for an exciting journey. If you dedicate yourself to learning the concepts in this book, then you will be on your way to becoming a rare “power programmer,” enlightened by an understanding of the underlying computer system and its impact on your application programs.

You are going to learn practical skills such as how to avoid strange numerical errors caused by the way that computers represent numbers. You will learn how to optimize your C code by using clever tricks that exploit the designs of modern processors and memory systems. You will learn how the compiler implements procedure calls and how to use this knowledge to avoid the security holes from buffer overflow bugs that plague network and Internet software. You will learn how to recognize and avoid the nasty errors during linking that confound the average programmer. You will learn how to write your own Unix shell, your own dynamic storage allocation package, and even your own Web server!

In their classic text on the C programming language [40], Kernighan and Ritchie introduce readers to C using the hello program shown in Figure 1.1. Although hello is a very simple program, every major part of the system must work in

___________________________________________________________code/intro/hello.c

1 #include <stdio.h>

2

3 int main()

4 {

5 printf("hello, world\n");

6 }

___________________________________________________________code/intro/hello.c

Figure 1.1: The hello program.

concert in order for it to run to completion. In a sense, the goal of this book is to help you understand what happens and why, when you run hello on your system.

We begin our study of systems by tracing the lifetime of the hello program, from the time it is created by a programmer, until it runs on a system, prints its simple message, and terminates. As we follow the lifetime of the program, we will briefly introduce the key concepts, terminology, and components that come into play. Later chapters will expand on these ideas.

2.2.1 Information is Bits + Context

Our hello program begins life as a source program (or source file) that the programmer creates with an editor and saves in a text file called hello.c. The source program is a sequence of bits, each with a value of 0 or 1, organized in 8-bit chunks called bytes. Each byte represents some text character in the program.

Most modern systems represent text characters using the ASCII standard that represents each character with a unique byte-sized integer value. For example, Figure 1.2 shows the ASCII representation of the hello.c program.

#   i   n   c   l   u   d   e   SP  <   s   t   d   i   o   .
35  105 110 99  108 117 100 101 32  60  115 116 100 105 111 46

h   >   \n  \n  i   n   t   SP  m   a   i   n   (   )   \n  {
104 62  10  10  105 110 116 32  109 97  105 110 40  41  10  123

\n  SP  SP  SP  SP  p   r   i   n   t   f   (   "   h   e   l
10  32  32  32  32  112 114 105 110 116 102 40  34  104 101 108

l   o   ,   SP  w   o   r   l   d   \   n   "   )   ;   \n  }
108 111 44  32  119 111 114 108 100 92  110 34  41  59  10  125

(SP denotes the space character, ASCII 32.)
Figure 1.2: The ASCII text representation of hello.c.

The hello.c program is stored in a file as a sequence of bytes. Each byte has an integer value that corresponds to some character. For example, the first byte has the integer value 35, which corresponds to the character ‘#’. The second byte has the integer value 105, which corresponds to the character ‘i’, and so on. Notice that each text line is terminated by the invisible newline character ‘\n’, which is represented by the integer value 10. Files such as hello.c that consist exclusively of ASCII characters are known as text files. All other files are known as binary files.
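The idea of Figure 1.2 can be reproduced in a few lines of Python; this snippet simply prints each byte of the first line of hello.c as an integer and as a character.

# Every byte of a text file is an integer with a character reading.
source = "#include <stdio.h>\n"
for byte in source.encode("ascii"):
    print(byte, repr(chr(byte)))   # 35 '#', 105 'i', 110 'n', ..., 10 '\n'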

The representation of hello.c illustrates a fundamental idea: All information in a system — including disk files, programs stored in memory, user data stored in memory, and data transferred across a network— is represented as a bunch of bits. The only thing that distinguishes different data objects is the context in which we view them. For example, in different contexts, the same sequence of bytes might represent an integer, floating-point number, character string, or machine instruction.

As programmers, we need to understand machine representations of numbers because they are not the same as integers and real numbers. They are finite approximations that can behave in unexpected ways. This fundamental idea is explored in detail in Chapter 2.
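Both points are easy to see in Python. The first lines below reinterpret one 32-bit pattern in two contexts (the integer chosen happens to share its bit pattern with the float value of pi); the last lines show that machine reals are only finite approximations.

import struct

# Context: the same four bytes, read as an integer and then as a float.
raw = struct.pack("<i", 1078530011)   # pack a 32-bit integer into 4 bytes
print(struct.unpack("<f", raw)[0])    # ~3.1415927 when the same bytes are read as a float

# Approximation: reals are rounded to the nearest representable number.
print(0.1 + 0.2)                      # 0.30000000000000004, not 0.3
print(0.1 + 0.2 == 0.3)               # False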

Aside: The C programming language.

C was developed from 1969 to 1973 by Dennis Ritchie of Bell Laboratories. The American National Standards Institute (ANSI) ratified the ANSI C standard in 1989. The standard defines the C language and a set of library functions known as the C standard library. Kernighan and Ritchie describe ANSI C in their classic book, which is known affectionately as “K&R” [40]. In Ritchie’s words [64], C is “quirky, flawed, and an enormous success.” So why the success?

· C was closely tied with the Unix operating system. C was developed from the beginning as the system programming language for Unix. Most of the Unix kernel, and all of its supporting tools and libraries, were written in C. As Unix became popular in universities in the late 1970s and early 1980s, many people were exposed to C and found that they liked it. Since Unix was written almost entirely in C, it could be easily ported to new machines, which created an even wider audience for both C and Unix.

· C is a small, simple language. The design was controlled by a single person, rather than a committee, and the result was a clean, consistent design with little baggage. The K&R book describes the complete language and standard library, with numerous examples and exercises, in only 261 pages. The simplicity of C made it relatively easy to learn and to port to different computers.

· C was designed for a practical purpose. C was designed to implement the Unix operating system. Later, other people found that they could write the programs they wanted, without the language getting in the way.

C is the language of choice for system-level programming, and there is a huge installed base of application-level programs as well. However, it is not perfect for all programmers and all situations. C pointers are a common source of confusion and programming errors. C also lacks explicit support for useful abstractions such as classes, objects, and exceptions. Newer languages such as C++ and Java address these issues for application-level programs.

End Aside.

2.2.2 Programs Are Translated by Other Programs into Different Forms

The hello program begins life as a high-level C program because it can be read and understood by human beings in that form. However, in order to run hello.c on the system, the individual C statements must be translated by other programs into a sequence of low-level machine-language instructions. These instructions are then packaged in a form called an executable object program and stored as a binary disk file. Object programs are also referred to as executable object files.

On a Unix system, the translation from source file to object file is performed by a compiler driver:

unix> gcc -o hello hello.c

Here, the GCC compiler driver reads the source file hello.c and translates it into an executable object file hello. The translation is performed in the sequence of four phases shown in Figure 1.3. The programs that perform the four phases (preprocessor, compiler, assembler, and linker) are known collectively as the compilation system.

Figure 1.3: The compilation system.

· Preprocessing phase. The preprocessor (cpp) modifies the original C program according to directives that begin with the # character. For example, the #include command in line 1 of hello.c tells the preprocessor to read the contents of the system header file stdio.h and insert it directly into the program text. The result is another C program, typically with the .i suffix.

· Compilation phase. The compiler (cc1) translates the text file hello.i into the text file hello.s, which contains an assembly-language program. Each statement in an assembly-language program exactly describes one low-level machine-language instruction in a standard text form. Assembly language is useful because it provides a common output language for different compilers for different high-level languages. For example, C compilers and Fortran compilers both generate output files in the same assembly language.

· Assembly phase. Next, the assembler (as) translates hello.s into machine-language instructions, packages them in a form known as a relocatable object program, and stores the result in the object file hello.o. The hello.o file is a binary file whose bytes encode machine language instructions rather than characters. If we were to view hello.o with a text editor, it would appear to be gibberish.

· Linking phase. Notice that our hello program calls the printf function, which is part of the standard C library provided by every C compiler. The printf function resides in a separate precompiled object file called printf.o, which must somehow be merged with our hello.o program. The linker (ld) handles this merging. The result is the hello file, which is an executable object file (or simply executable) that is ready to be loaded into memory and executed by the system.
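The phases can be run one at a time with gcc's standard -E, -S, and -c flags, which stop after preprocessing, compilation, and assembly respectively. The driver below is a sketch in Python (the course language) and assumes gcc and hello.c are available on the system.

import subprocess

for cmd in (
    ["gcc", "-E", "hello.c", "-o", "hello.i"],  # preprocess: expand #include and friends
    ["gcc", "-S", "hello.i", "-o", "hello.s"],  # compile: C -> assembly
    ["gcc", "-c", "hello.s", "-o", "hello.o"],  # assemble: assembly -> object file
    ["gcc", "hello.o", "-o", "hello"],          # link: merge with printf.o etc.
):
    subprocess.run(cmd, check=True)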

Aside: The GNU project.

GCC is one of many useful tools developed by the GNU (short for GNU’s Not Unix) project. The GNU project is a tax-exempt charity started by Richard Stallman in 1984, with the ambitious goal of developing a complete Unix-like system whose source code is unencumbered by restrictions on how it can be modified or distributed. As of 2002, the GNU project has developed an environment with all the major components of a Unix operating system, except for the kernel, which was developed separately by the Linux project. The GNU environment includes the EMACS editor, GCC compiler, GDB debugger, assembler, linker, utilities for manipulating binaries, and other components.

The GNU project is a remarkable achievement, and yet it is often overlooked. The modern open-source movement (commonly associated with Linux) owes its intellectual origins to the GNU project's notion of free software ("free" as in "free speech", not "free beer"). Further, Linux owes much of its popularity to the GNU tools, which provide the environment for the Linux kernel.

End Aside.

2.2.3 It Pays to Understand How Compilation Systems Work

For simple programs such as hello.c, we can rely on the compilation system to produce correct and efficient machine code. However, there are some important reasons why programmers need to understand how compilation systems work:

· Optimizing program performance. Modern compilers are sophisticated tools that usually produce good code. As programmers, we do not need to know the inner workings of the compiler in order to write efficient code. However, in order to make good coding decisions in our C programs, we do need a basic understanding of assembly language and how the compiler translates different C statements into assembly language. For example, is a switch statement always more efficient than a sequence of if-then-else statements? Just how expensive is a function call? Is a while loop more efficient than a do loop? Are pointer references more efficient than array indexes? Why does our loop run so much faster if we sum into a local variable instead of an argument that is passed by reference? Why do two functionally equivalent loops have such different running times?

In Chapter 3, we will introduce the Intel IA32 machine language and describe how compilers translate different C constructs into that language. In Chapter 5 you will learn how to tune the performance of your C programs by making simple transformations to the C code that help the compiler do its job. And in Chapter 6 you will learn about the hierarchical nature of the memory system,