IT MODULE 1

History of computing hardware

Computing hardware has been an important component of the process of calculation and data storage since it became useful for numerical values to be processed and shared. The earliest computing hardware was probably some form of tally stick; later record keeping aids include Phoenician clay shapes which represented counts of items, probably livestock or grains, in containers. Something similar is found in early Minoan excavations. These seem to have been used by the merchants, accountants, and government officials of the time.

Devices to aid computation have changed from simple recording and counting devices to the abacus, the slide rule, analog computers, and more recent electronic computers. Even today, an experienced abacus user using a device hundreds of years old can sometimes complete basic calculations more quickly than an unskilled person using an electronic calculator — though for more complex calculations, computers out-perform even the most skilled human.

This article covers major developments in the history of computing hardware, and attempts to put them in context. For a detailed timeline of events, see the computing timeline article. The history of computing article is a related overview and treats methods intended for pen and paper, with or without the aid of tables.

Earliest devices

Humanity has used devices to aid in computation for millennia. One example is the checkered cloth of the counting house, which served as a simple aid for enumerating stacks of coins by height. A more arithmetic-oriented machine is the abacus. The earliest form of abacus, the dust abacus, is thought to have been invented in Babylonia. The Egyptian bead and wire abacus dates from 500 BC.

In 1623 Wilhelm Schickard built the first mechanical calculator and thus became the father of the computing era. Since his machine used techniques such as cogs and gears first developed for clocks, it was also called a 'calculating clock'. It was put to practical use by his friend Johannes Kepler, who revolutionized astronomy.

Machines by Blaise Pascal (the Pascaline, 1642) and Gottfried Wilhelm von Leibniz (1671) followed; an original calculator by Pascal (1640) is preserved in the Zwinger Museum. Around 1820, Charles Xavier Thomas created the first successful, mass-produced mechanical calculator, the Thomas Arithmometer, which could add, subtract, multiply, and divide. It was mainly based on Leibniz's work. Mechanical calculators, like the base-ten Addiator, the comptometer, the Monroe, the Curta and the Addo-X, remained in use until the 1970s.

Leibniz also described the binary numeral system, a central ingredient of all modern computers. However, up to the 1940s, many subsequent designs (including Charles Babbage's machines of the 1800s and even ENIAC of 1945) were based on the harder-to-implement decimal system.

John Napier noted that multiplication and division of numbers can be performed by addition and subtraction, respectively, of logarithms of those numbers. Since these real numbers can be represented as distances or intervals on a line, the slide rule allowed multiplication and division operations to be carried out significantly faster than was previously possible. Slide rules were used by generations of engineers and other mathematically inclined professional workers, until the invention of the pocket calculator. The engineers in the Apollo program to send a man to the moon made many of their calculations on slide rules, which were accurate to 3 or 4 significant figures.
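
The identity Napier exploited, log(ab) = log(a) + log(b), is easy to check in a few lines of Python (a modern sketch for illustration only; the numbers are arbitrary):

    import math

    # Napier's insight: multiplication can be replaced by addition of logarithms
    # (and division by subtraction), which is what a slide rule does physically
    # by adding two distances proportional to log(a) and log(b).
    a, b = 37.5, 2.4

    product_direct = a * b
    product_via_logs = math.exp(math.log(a) + math.log(b))

    print(product_direct)     # 90.0
    print(product_via_logs)   # ~90.0, limited here only by floating-point precision;
                              # reading a slide rule gives about 3 or 4 significant figures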

While producing the first logarithmic tables Napier needed to perform many multiplications and it was at this point that he designed Napier's bones.

1801: punched card technology

Punched card system of a music machine. Also referred to as Book music, a one-stop European medium for organs

Punched card system of a 19th Century loom

As early as 1725 Basile Bouchon used a perforated paper loop in a loom to establish the pattern to be reproduced on cloth, and in 1726 his co-worker Jean-Baptiste Falcon improved on his design by using perforated paper cards attached to one another for efficiency in adapting and changing the program. The Bouchon-Falcon loom was semi-automatic and required manual feed of the program.

In 1801, Joseph-Marie Jacquard developed a loom in which the pattern being woven was controlled by punched cards. The series of cards could be changed without changing the mechanical design of the loom. This was a landmark point in programmability.

Herman Hollerith invented a tabulating machine using punched cards in the 1880s.

In 1833, Charles Babbage moved on from developing his difference engine to developing a more complete design, the analytical engine, which would draw directly on Jacquard's punched cards for its programming.[1]

In 1890, the United States Census Bureau used punched cards and sorting machines designed by Herman Hollerith, to handle the flood of data from the decennial census mandated by the Constitution. Hollerith's company eventually became the core of IBM. IBM developed punched card technology into a powerful tool for business data-processing and produced an extensive line of specialized unit record equipment. By 1950, the IBM card had become ubiquitous in industry and government. The warning printed on most cards intended for circulation as documents (checks, for example), "Do not fold, spindle or mutilate," became a motto for the post-World War II era.[1]

Leslie Comrie's articles on punched card methods and W.J. Eckert's publication of Punched Card Methods in Scientific Computation in 1940 described techniques which were sufficiently advanced to solve differential equations and to perform multiplication and division using floating point representations, all on punched cards and unit record machines. The Thomas J. Watson Astronomical Computing Bureau at Columbia University performed astronomical calculations representing the state of the art in computing.

In many computer installations, punched cards were used until (and after) the end of the 1970s. For example, science and engineering students at many universities around the world would submit their programming assignments to the local computer centre in the form of a stack of cards, one card per program line, and then had to wait for the program to be queued for processing, compiled, and executed. In due course a printout of any results, marked with the submitter's identification, would be placed in an output tray outside the computer center. In many cases these results would comprise solely a printout of error messages regarding program syntax etc., necessitating another edit-compile-run cycle.[2]

Punched cards are still used and manufactured in the current century, and their distinctive dimensions (and 80-column capacity) can still be recognized in forms, records, and programs around the world.

In 1835 Charles Babbage described his analytical engine. It was the plan of a general-purpose programmable computer, employing punch cards for input and a steam engine for power. One crucial invention was to use gears for the function served by the beads of an abacus. In a real sense, computers all contain automatic abacuses (technically called the arithmetic logic unit or floating-point unit).

His initial idea was to use punch-cards to control a machine that could calculate and print logarithmic tables with huge precision (a specific purpose machine). Babbage's idea soon developed into a general-purpose programmable computer, his analytical engine.

While his design was sound and the plans were probably correct, or at least debuggable, the project was slowed by various problems. Babbage was a difficult man to work with and argued with anyone who didn't respect his ideas. All the parts for his machine had to be made by hand, and small errors in each item could accumulate into large discrepancies in a machine with thousands of parts, so the parts had to be made to much tighter tolerances than was usual at the time. The project dissolved in disputes with the artisan who built the parts and ended with the depletion of government funding.

Ada Lovelace, Lord Byron's daughter, translated and added notes to the "Sketch of the Analytical Engine" by Federico Luigi, Conte Menabrea. She has become closely associated with Babbage. Some claim she is the world's first computer programmer; however, this claim and the value of her other contributions are disputed by many.

A reconstruction of the Difference Engine II, an earlier, more limited design, has been operational since 1991 at the London Science Museum. With a few trivial changes, it works as Babbage designed it and shows that Babbage was right in theory.

The museum used computer-operated machine tools to construct the necessary parts, following tolerances which a machinist of the period would have been able to achieve. Some feel that the technology of the time was unable to produce parts of sufficient precision, though this appears to be false. The failure of Babbage to complete the engine can be chiefly attributed to difficulties not only related to politics and financing, but also to his desire to develop an increasingly sophisticated computer. Today, many in the computer field term this sort of obsession creeping featuritis.

The US Government used Herman Hollerith's tabulating machine for taking the 1890 U.S. Census.

Following in the footsteps of Babbage, although unaware of his earlier work, was Percy Ludgate, an accountant from Dublin, Ireland. He independently designed a programmable mechanical computer, which he described in a work that was published in 1909.

1930s–1960s: desktop calculators

Curta calculator

By the 1900s earlier mechanical calculators, cash registers, accounting machines, and so on were redesigned to use electric motors, with gear position as the representation for the state of a variable. Companies like Friden, Marchant Calculator and Monroe made desktop mechanical calculators from the 1930s that could add, subtract, multiply and divide. The word "computer" was a job title assigned to people who used these calculators to perform mathematical calculations. During the Manhattan project, future Nobel laureate Richard Feynman was the supervisor of the roomful of human computers, many of them women mathematicians, who understood the differential equations which were being solved for the war effort. Even the renowned Stanisław Ulam was pressed into service to translate the mathematics into computable approximations for the hydrogen bomb, after the war.

In 1948, the Curta was introduced. This was a small, portable, mechanical calculator that was about the size of a pepper grinder. Over time, during the 1950s and 1960s a variety of different brands of mechanical calculator appeared on the market.


The first all-electronic desktop calculator was the British ANITA Mk.VII, which used a Nixie tube display and 177 subminiature thyratron tubes. In June 1963, Friden introduced the four-function EC-130. It had an all-transistor design, 13-digit capacity on a 5-inch CRT, and introduced reverse Polish notation (RPN) to the calculator market at a price of $2200. The model EC-132 added square root and reciprocal functions. In 1965, Wang Laboratories produced the LOCI-2, a 10-digit transistorized desktop calculator that used a Nixie tube display and could compute logarithms.

With the development of integrated circuits and microprocessors, the expensive, large calculators were replaced with smaller electronic devices.

Pre-1940 analog computers

Cambridge differential analyzer, 1938

Before World War II, mechanical and electrical analog computers were considered the 'state of the art', and many thought they were the future of computing. Analog computers use continuously varying amounts of physical quantities, such as voltages or currents, or the rotational speed of shafts, to represent the quantities being processed. An ingenious example of such a machine was the Water integrator built in 1928; an electrical example is the Mallock machine built in 1941. Unlike modern digital computers, analog computers are not very flexible, and need to be reconfigured (i.e., reprogrammed) manually to switch them from working on one problem to another. Analog computers had an advantage over early digital computers in that they could be used to solve complex problems while the earliest attempts at digital computers were quite limited. But as digital computers have become faster and used larger memory (e.g., RAM or internal store), they have almost entirely displaced analog computers, and computer programming, or coding has arisen as another human profession.

Since computers were rare in this era, the solutions were often hard-coded into paper forms such as graphs and nomograms, which could then allow analog solutions to problems, such as the distribution of pressures and temperatures in a heating system.

Some of the most widely deployed analog computers included devices for aiming weapons, such as the Norden bombsight and Fire-control systems for naval vessels. Some of these stayed in use for decades after WWII. One example is the Mark I Fire Control Computer, deployed by the United States Navy on a variety of ships from destroyers to battleships.

The art of analog computing reached its zenith with the differential analyzer, invented by Vannevar Bush in 1930. Fewer than a dozen of these devices were ever built; the most powerful was constructed at the University of Pennsylvania's Moore School of Electrical Engineering, where the ENIAC was built. Digital electronic computers like the ENIAC spelled the end for most analog computing machines, but hybrid analog computers, controlled by digital electronics, remained in substantial use into the 1950s and 1960s, and later in some specialized applications.

Early digital computers

The era of modern computing began with a flurry of development before and during World War II, as electronic circuits, relays, capacitors, and vacuum tubes replaced mechanical equivalents and digital calculations replaced analog calculations. Machines such as the Atanasoff–Berry Computer, the Z3, the Colossus, and ENIAC were built by hand using circuits containing relays or valves (vacuum tubes), and often used punched cards or punched paper tape for input and as the main (non-volatile) storage medium.

In later systems, temporary or working storage was provided by acoustic delay lines (which use the propagation time of sound through a medium such as liquid mercury or wire to briefly store data) or by Williams tubes (which use the ability of a television picture tube to store and retrieve data). By 1954, magnetic core memory was rapidly displacing most other forms of temporary storage, and dominated the field through the mid-1970s.

In this era, a number of different machines were produced with steadily advancing capabilities. At the beginning of this period, nothing remotely resembling a modern computer existed, except in the long-lost plans of Charles Babbage and the mathematical musings of Alan Turing and others. At the end of the era, devices like the EDSAC had been built, and are universally agreed to be digital computers. Defining a single point in the series as the "first computer" misses many subtleties.

Alan Turing's 1936 paper proved enormously influential in computing and computer science in two ways. Its main purpose was to prove that there were problems (namely the halting problem) that could not be solved by any sequential process. In doing so, Turing provided a definition of a universal computer, a construct that came to be called a Turing machine, a purely theoretical device that formalizes the concept of algorithm execution, replacing Kurt Gödel's more cumbersome universal language based on arithmetics. Except for the limitations imposed by their finite memory stores, modern computers are said to be Turing-complete, which is to say, they have algorithm execution capability equivalent to a universal Turing machine. This limited type of Turing completeness is sometimes viewed as a threshold capability separating general-purpose computers from their special-purpose predecessors.
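
To make the notion of a Turing machine concrete, here is a minimal simulator sketched in Python; the tape representation, the transition-table format, and the unary-increment example machine are all invented for illustration and are not from Turing's paper:

    # Minimal Turing-machine simulator (illustrative sketch only).
    # transitions maps (state, symbol) -> (symbol_to_write, head_move, next_state).
    def run(tape, transitions, state="start", halt="halt", blank="_"):
        cells = dict(enumerate(tape))      # sparse tape; unwritten cells read as blank
        head = 0
        while state != halt:
            symbol = cells.get(head, blank)
            write, move, state = transitions[(state, symbol)]
            cells[head] = write
            head += 1 if move == "R" else -1
        return "".join(cells[i] for i in sorted(cells))

    # Example machine: append one '1' to a block of 1s (unary increment).
    increment = {
        ("start", "1"): ("1", "R", "start"),   # scan right over existing 1s
        ("start", "_"): ("1", "R", "halt"),    # write a final 1, then halt
    }
    print(run("111", increment))               # prints 1111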

For a computing machine to be a practical general-purpose computer, there must be some convenient read-write mechanism, punched tape, for example. For full versatility, the Von Neumann architecture uses the same memory both to store programs and data; virtually all contemporary computers use this architecture (or some variant). While it is theoretically possible to implement a full computer entirely mechanically (as Babbage's design showed), electronics made possible the speed and later the miniaturization that characterize modern computers.
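
The idea of keeping programs and data in a single store can be sketched in a few lines of Python; the instruction format and the opcodes below are purely hypothetical, chosen only to show one unified memory holding both:

    # Toy von Neumann-style machine: instructions and data share one memory list.
    memory = [
        ("LOAD", 5),     # 0: acc = memory[5]
        ("ADD", 6),      # 1: acc = acc + memory[6]
        ("STORE", 7),    # 2: memory[7] = acc
        ("HALT", None),  # 3: stop
        0,               # 4: unused
        40,              # 5: data operand
        2,               # 6: data operand
        0,               # 7: result is written here
    ]

    acc, pc = 0, 0
    while True:
        op, arg = memory[pc]     # fetch the next instruction from the same store as the data
        pc += 1
        if op == "LOAD":
            acc = memory[arg]
        elif op == "ADD":
            acc += memory[arg]
        elif op == "STORE":
            memory[arg] = acc
        elif op == "HALT":
            break

    print(memory[7])             # prints 42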


There were three parallel streams of computer development in the World War II era, and two were either largely ignored or were deliberately kept secret. The first was the German work of Konrad Zuse. The second was the secret development of the Colossus computer in the UK. Neither of these had much influence on the various computing projects in the United States. After the war, British and American computing researchers cooperated on some of the most important steps towards a practical computing device.

Konrad Zuse's Z-series: the first program-controlled computers

A reproduction of Zuse's Z1 computer.

Working in isolation in Germany, Konrad Zuse started construction in 1936 of his first Z-series calculators featuring memory and (initially limited) programmability. Zuse's purely mechanical, but already binary Z1, finished in 1938, never worked reliably due to problems with the precision of parts.

Zuse's subsequent machine, the Z3, was finished in 1941. It was based on telephone relays and did work satisfactorily. The Z3 thus became the first functional program-controlled, all-purpose, digital computer. In many ways it was quite similar to modern machines, pioneering numerous advances, such as floating point numbers. Replacement of the hard-to-implement decimal system (used in Charles Babbage's earlier design) by the simpler binary system meant that Zuse's machines were easier to build and potentially more reliable, given the technologies available at that time. This is sometimes viewed as the main reason why Zuse succeeded where Babbage failed.

Programs were fed into Z3 on punched films. Conditional jumps were missing, but since the 1990s it has been proved theoretically that Z3 was still a universal computer (ignoring its physical storage size limitations). In two 1936 patent applications, Konrad Zuse also anticipated that machine instructions could be stored in the same storage used for data – the key insight of what became known as the Von Neumann architecture and was first implemented in the later British EDSAC design (1949). Zuse also claimed to have designed the first higher-level programming language, (Plankalkül), in 1945 (which was published in 1948) although it was implemented for the first time in 2000 by the Free University of Berlin – five years after Zuse died.

Zuse suffered setbacks during World War II when some of his machines were destroyed in the course of Allied bombing campaigns. Apparently his work remained largely unknown to engineers in the UK and US until much later, although at least IBM was aware of it as it financed his post-war startup company in 1946 in return for an option on Zuse's patents.

American developments


In 1937, Claude Shannon produced his master's thesis at MIT that implemented Boolean algebra using electronic relays and switches for the first time in history. Entitled A Symbolic Analysis of Relay and Switching Circuits, Shannon's thesis essentially founded practical digital circuit design.

In November of 1937, George Stibitz, then working at Bell Labs, completed a relay-based computer he dubbed the "Model K" (for "kitchen", where he had assembled it), which calculated using binary addition. Bell Labs authorized a full research program in late 1938 with Stibitz at the helm. Their Complex Number Calculator, completed January 8, 1940, was able to calculate complex numbers. In a demonstration to the American Mathematical Society conference at Dartmouth College on September 11, 1940, Stibitz was able to send the Complex Number Calculator remote commands over telephone lines by a teletype. It was the first computing machine ever used remotely, in this case over a phone line. Some participants in the conference who witnessed the demonstration were John Von Neumann, John Mauchly, and Norbert Wiener, who wrote about it in his memoirs.

In 1939, John Vincent Atanasoff and Clifford E. Berry of Iowa State University developed the Atanasoff–Berry Computer (ABC), a special purpose digital electronic calculator for solving systems of linear equations. (The original goal was to solve 29 simultaneous equations of 29 unknowns each, but due to errors in the card puncher mechanism the completed machine could only solve a few equations.) The design used over 300 vacuum tubes for high speed and employed capacitors fixed in a mechanically rotating drum for memory. Though the ABC machine was not programmable, it was the first to use electronic circuits. ENIAC co-inventor John Mauchly examined the ABC in June 1941, and its influence on the design of the later ENIAC machine is a matter of contention among computer historians. The ABC was largely forgotten until it became the focus of the lawsuit Honeywell v. Sperry Rand, the ruling of which invalidated the ENIAC patent (and several others) as, among many reasons, having been anticipated by Atanasoff's work.

In 1939, development began at IBM's Endicott laboratories on the Harvard Mark I. Known officially as the Automatic Sequence Controlled Calculator, the Mark I was a general purpose electro-mechanical computer built with IBM financing and with assistance from IBM personnel, under the direction of Harvard mathematician Howard Aiken. Its design was influenced by Babbage's Analytical Engine, using decimal arithmetic and storage wheels and rotary switches in addition to electromagnetic relays. It was programmable via punched paper tape, and contained several calculation units working in parallel. Later versions contained several paper tape readers and the machine could switch between readers based on a condition. Nevertheless, the machine was not quite Turing-complete. The Mark I was moved to Harvard University and began operation in May 1944.

Colossus


Colossus was used to break German ciphers during World War II.

During World War II, the British at Bletchley Park achieved a number of successes at breaking encrypted German military communications. The German encryption machine, Enigma, was attacked with the help of electro-mechanical machines called bombes. The bombe, designed by Alan Turing and Gordon Welchman, after the Polish cryptographic bomba (1938), ruled out possible Enigma settings by performing chains of logical deductions implemented electrically. Most possibilities led to a contradiction, and the few remaining could be tested by hand.

The Germans also developed a series of teleprinter encryption systems, quite different from Enigma. The Lorenz SZ 40/42 machine was used for high-level Army communications, termed "Tunny" by the British. The first intercepts of Lorenz messages began in 1941. As part of an attack on Tunny, Professor Max Newman and his colleagues helped specify the Colossus. The Mk I Colossus was built between March and December 1943 by Tommy Flowers and his colleagues at the Post Office Research Station at Dollis Hill in London and then shipped to Bletchley Park.

Colossus was the first totally electronic computing device. The Colossus used a large number of valves (vacuum tubes). It had paper-tape input and was capable of being configured to perform a variety of boolean logical operations on its data, but it was not Turing-complete. Nine Mk II Colossi were built (the Mk I was converted to a Mk II, making ten machines in total). Details of their existence, design, and use were kept secret well into the 1970s. Winston Churchill personally issued an order for their destruction into pieces no larger than a man's hand. Due to this secrecy the Colossi were not included in many histories of computing. A reconstructed copy of one of the Colossus machines is now on display at Bletchley Park.

ENIAC

ENIAC performed ballistics trajectory calculations with 160 kW of power.

The US-built ENIAC (Electronic Numerical Integrator and Computer), often called the first electronic general-purpose computer because Konrad Zuse's earlier Z3 was electric but not electronic, publicly validated the use of electronics for large-scale computing. This was crucial for the development of modern computing, initially because of the enormous speed advantage, but ultimately because of the potential for miniaturization. Built under the direction of John Mauchly and J. Presper Eckert, it was 1,000 times faster than its contemporaries. ENIAC's development and construction lasted from 1943 to full operation at the end of 1945. When its design was proposed, many researchers believed that the thousands of delicate valves (i.e. vacuum tubes) would burn out often enough that the ENIAC would be so frequently down for repairs as to be useless. It was, however, capable of up to thousands of operations per second for hours at a time between valve failures.

ENIAC was unambiguously a Turing-complete device. A "program" on the ENIAC, however, was defined by the states of its patch cables and switches, a far cry from the stored program electronic machines that evolved from it. To program it meant to rewire it. At the time, however, unaided calculation was seen as enough of a triumph to view the solution of a single problem as the object of a program. (Improvements completed in 1948 made it possible to execute stored programs set in function table memory, which made programming less a "one-off" effort, and more systematic.)

Adapting ideas developed by Eckert and Mauchly after recognizing the limitations of ENIAC, John von Neumann wrote a widely-circulated report describing a computer design (the EDVAC design) in which the programs and working data were both stored in a single, unified store. This basic design, which became known as the von Neumann architecture, would serve as the basis for the development of the first really flexible, general-purpose digital computers.

First-generation von Neumann machine and the other works

"Baby" at the Museum of Science and Industry in Manchester (MSIM), England

The first working von Neumann machine was the Manchester "Baby" or Small-Scale Experimental Machine, built at the University of Manchester in 1948; it was followed in 1949 by the Manchester Mark I computer which functioned as a complete system using the Williams tube and magnetic drum for memory, and also introduced index registers. The other contender for the title "first digital stored program computer" had been EDSAC, designed and constructed at the University of Cambridge. Operational less than one year after the Manchester "Baby", it was also capable of tackling real problems. EDSAC was actually inspired by plans for EDVAC (Electronic Discrete Variable Automatic Computer), the successor to ENIAC; these plans were already in place by the time ENIAC was successfully operational. Unlike ENIAC, which used parallel processing, EDVAC used a single processing unit. This design was simpler and was the first to be implemented in each succeeding wave of miniaturization, and increased reliability. Some view Manchester Mark I / EDSAC / EDVAC as the "Eves" from which nearly all current computers derive their architecture.

The first universal programmable computer in the Soviet Union was created by a team of scientists under the direction of Sergei Alekseyevich Lebedev from the Kiev Institute of Electrotechnology, Soviet Union (now Ukraine). The computer MESM (МЭСМ, Small Electronic Calculating Machine) became operational in 1950. It had about 6,000 vacuum tubes and consumed 25 kW of power. It could perform approximately 3,000 operations per second. Another early machine was CSIRAC, an Australian design that ran its first test program in 1949.

In October 1947, the directors of J. Lyons & Company, a British catering company famous for its teashops but with strong interests in new office management techniques, decided to take an active role in promoting the commercial development of computers. By 1951 the LEO I computer was operational and ran the world's first regular routine office computer job.

Manchester University's machine became the prototype for the Ferranti Mark I. The first Ferranti Mark I machine was delivered to the University in February, 1951 and at least nine others were sold between 1951 and 1957.

UNIVAC I, above, the first commercial electronic computer in the United States (third in the world), achieved 1900 operations per second in a smaller and more efficient package than ENIAC.

In June 1951, the UNIVAC I (Universal Automatic Computer) was delivered to the U.S. Census Bureau. Although manufactured by Remington Rand, the machine often was mistakenly referred to as the "IBM UNIVAC". Remington Rand eventually sold 46 machines at more than $1 million each. UNIVAC was the first 'mass produced' computer; all predecessors had been 'one-off' units. It used 5,200 vacuum tubes and consumed 125 kW of power. It used a mercury delay line capable of storing 1,000 words of 11 decimal digits plus sign (72-bit words) for memory. Unlike IBM machines it was not equipped with a punch card reader but with a 1930s-style metal magnetic tape input, making it incompatible with some existing commercial data stores. High speed punched paper tape and modern-style magnetic tapes were used for input/output by other computers of the era.

In November 1951, the J. Lyons company began weekly operation of a bakery valuations job on the LEO (Lyons Electronic Office). This was the first business application to go live on a stored program computer.

In 1952, IBM publicly announced the IBM 701 Electronic Data Processing Machine, the first in its successful 700/7000 series and its first IBM mainframe computer. The IBM 704, introduced in 1954, used magnetic core memory, which became the standard for large machines. The first implemented high-level general purpose programming language, Fortran, was also being developed at IBM for the 704 during 1955 and 1956 and released in early 1957. (Konrad Zuse's 1945 design of the high-level language Plankalkül was not implemented at that time.)

IBM introduced a smaller, more affordable computer in 1954 that proved very popular. The IBM 650 weighed over 900 kg, the attached power supply weighed around 1350 kg, and both were held in separate cabinets of roughly 1.5 meters by 0.9 meters by 1.8 meters. It cost $500,000 or could be leased for $3,500 a month. Its drum memory was originally only 2000 ten-digit words, and required arcane programming for efficient computing. Memory limitations such as this were to dominate programming for decades afterward, until the evolution of a programming model which was more sympathetic to software development.

In 1955, Maurice Wilkes invented microprogramming, which was later widely used in the CPUs and floating-point units of mainframe and other computers, such as the IBM 360 series. Microprogramming allows the base instruction set to be defined or extended by built-in programs (now sometimes called firmware, microcode, or millicode).

In 1956, IBM sold its first magnetic disk system, RAMAC (Random Access Method of Accounting and Control). It used 50 24-inch metal disks, with 100 tracks per side. It could store 5 megabytes of data and cost $10,000 per megabyte. (As of 2006, magnetic storage, in the form of hard disks, costs less than one tenth of a cent per megabyte).

Post-1960: third generation and beyond

Main article: History of computing hardware (1960s-present)

The explosion in the use of computers began with 'Third Generation' computers. These relied on Jack St. Clair Kilby's and Robert Noyce's independent invention of the integrated circuit (or microchip), which later led to the invention of the microprocessor, by Ted Hoff and Federico Faggin at Intel.

During the 1960s there was considerable overlap between second and third generation technologies. As late as 1975, Sperry Univac continued the manufacture of second-generation machines such as the UNIVAC 494.

The microprocessor led to the development of the microcomputer, small, low-cost computers that could be owned by individuals and small businesses. Microcomputers, the first of which appeared in the 1970s, became ubiquitous in the 1980s and beyond. Steve Wozniak, co-founder of Apple Computer, is credited with developing the first mass-market home computers. However, his first computer, the Apple I, came out some time after the KIM-1 and Altair 8800, and the first Apple computer with graphic and sound capabilities came out well after the Commodore PET. Computing has evolved with microcomputer architectures, with features added from their larger brethren, now dominant in most market segments.

An indication of the rapidity of development of this field can be inferred from the seminal article by Burks, Goldstine, and von Neumann, documented in the Datamation September–October 1962 issue, which had been written as a preliminary version 15 years earlier. (See the references below.) By the time that anyone had time to write anything down, it was obsolete.


Microprocessors

A microprocessor is a programmable digital electronic component that incorporates the functions of a central processing unit (CPU) on a single semiconducting integrated circuit (IC). The microprocessor was born by reducing the word size of the CPU from 32 bits to 4 bits, so that the transistors of its logic circuits would fit onto a single part. One or more microprocessors typically serve as the CPU in a computer system, embedded system, or handheld device.

Microprocessors made possible the advent of the microcomputer in the mid-1970s. Before this period, electronic CPUs were typically made from bulky discrete switching devices (and later small-scale integrated circuits) containing the equivalent of only a few transistors. By integrating the processor onto one or a very few large-scale integrated circuit packages (containing the equivalent of thousands or millions of discrete transistors), the cost of processor power was greatly reduced. Since its advent in the mid-1970s, the microprocessor has become the most prevalent implementation of the CPU, nearly completely replacing all other forms. See History of computing hardware for pre-electronic and early electronic computers.

The evolution of microprocessors has been known to follow Moore's Law when it comes to steadily increasing performance over the years. This law suggests that the complexity of an integrated circuit, with respect to minimum component cost, doubles every 24 months. This dictum has generally proven true since the early 1970s. From their humble beginnings as the drivers for calculators, the continued increase in power has led to the dominance of microprocessors over every other form of computer; every system from the largest mainframes to the smallest handheld computers now uses a microprocessor at its core.
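
A rough back-of-the-envelope illustration of that doubling rule, in Python (the 1971 starting figure of roughly 2,300 transistors for the Intel 4004 is a commonly quoted number and is assumed here; the results are order-of-magnitude estimates only):

    # Moore's Law as a rule of thumb: complexity doubles roughly every 24 months.
    transistors_1971 = 2300                      # approximate Intel 4004 count (assumed)
    for year in (1981, 1991, 2001):
        doublings = (year - 1971) / 2            # one doubling every two years
        estimate = transistors_1971 * 2 ** doublings
        print(year, f"~{estimate:,.0f} transistors")
    # 1981 ~73,600   1991 ~2,355,200   2001 ~75,366,400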

History

The 4004 with cover removed (left) and as actually used (right).

Three projects arguably delivered a complete microprocessor at about the same time, namely Intel's 4004, Texas Instruments' TMS 1000, and Garrett AiResearch's Central Air Data Computer.

In 1968, Garrett AiResearch, with designers Ray Holt and Steve Geller, was invited to produce a digital computer to compete with electromechanical systems then under development for the main flight control computer in the US Navy's new F-14 Tomcat fighter. The design was complete by 1970, and used a MOS-based chipset as the core CPU. The design was significantly (approximately 20 times) smaller and much more reliable than the mechanical systems it competed against, and was used in all of the early Tomcat models. The system contained "a 20-bit, pipelined, parallel multi-microprocessor". However, the system was considered so advanced that the Navy refused to allow publication of the design until 1997. For this reason the CADC, and the MP944 chipset it used, are fairly unknown even today. See First Microprocessor Chip Set.

TI developed the 4-bit TMS 1000 and stressed pre-programmed embedded applications, introducing a version called the TMS1802NC on September 17, 1971, which implemented a calculator on a chip. The Intel chip was the 4-bit 4004, released on November 15, 1971, developed by Federico Faggin and Marcian Hoff.

TI filed for the patent on the microprocessor. Gary Boone was awarded U.S. Patent 3,757,306 for the single-chip microprocessor architecture on September 4, 1973. It may never be known which company actually had the first working microprocessor running on the lab bench. In both 1971 and 1976, Intel and TI entered into broad patent cross-licensing agreements, with Intel paying royalties to TI for the microprocessor patent. A nice history of these events is contained in court documentation from a legal dispute between Cyrix and Intel, with TI as intervenor and owner of the microprocessor patent.

Interestingly, a third party (Gilbert Hyatt) was awarded a patent which might cover the "microprocessor", based on a claimed invention pre-dating both TI and Intel and describing a "microcontroller". According to a rebuttal and a commentary, the patent was later invalidated, but not before substantial royalties were paid out.

A computer-on-a-chip is a variation of a microprocessor which combines the microprocessor core (CPU), some memory, and I/O (input/output) lines, all on one chip. The computer-on-a-chip patent, called the "microcomputer patent" at the time, U.S. Patent 4,074,351, was awarded to Gary Boone and Michael J. Cochran of TI. Aside from this patent, the standard meaning of microcomputer is a computer using one or more microprocessors as its CPU(s), while the concept defined in the patent is perhaps more akin to a microcontroller.

According to A History of Modern Computing (MIT Press), pp. 220–21, Intel entered into a contract with Computer Terminals Corporation, later called Datapoint, of San Antonio, TX, for a chip for a terminal they were designing. Datapoint later decided not to use the chip, and Intel marketed it as the 8008 in April 1972. This was the world's first 8-bit microprocessor. It was the basis for the famous "Mark-8" computer kit advertised in the magazine Radio-Electronics in 1974. The 8008 and its successor, the world-famous 8080, opened up the microprocessor component marketplace.

Notable 8-bit designs

The 4004 was later followed in 1972 by the 8008, the world's first 8-bit microprocessor. These processors are the precursors to the very successful Intel 8080 (1974), Zilog Z80 (1976), and derivative Intel 8-bit processors. The competing Motorola 6800 was released August 1974. Its architecture was cloned and improved in the MOS Technology 6502 in 1975, rivaling the Z80 in popularity during the 1980s.


Both the Z80 and 6502 concentrated on low overall cost, through a combination of small packaging, simple computer bus requirements, and the inclusion of circuitry that would normally have to be provided in a separate chip (for instance, the Z80 included a memory controller). It was these features that allowed the home computer "revolution" to take off in the early 1980s, eventually delivering machines that sold for US$99.

The Western Design Center, Inc. (WDC) introduced the CMOS 65C02 in 1982 and licensed the design to several companies; it became the core of the Apple IIc and IIe personal computers, medical implantable-grade pacemakers and defibrillators, and automotive, industrial, and consumer devices. WDC pioneered the licensing of microprocessor technology, which was later followed by ARM and other microprocessor Intellectual Property (IP) providers in the 1990s.

Motorola trumped the entire 8-bit world by introducing the MC6809 in 1978, arguably one of the most powerful, orthogonal, and clean 8-bit microprocessor designs ever fielded – and also one of the most complex hardwired logic designs that ever made it into production for any microprocessor. Microcoding replaced hardwired logic at about this point in time for all designs more powerful than the MC6809 – specifically because the design requirements were getting too complex for hardwired logic.

Another early 8-bit microprocessor was the Signetics 2650, which enjoyed a brief flurry of interest due to its innovative and powerful instruction set architecture.

A seminal microprocessor in the world of spaceflight was RCA's RCA 1802 (aka CDP1802, RCA COSMAC), introduced in 1976, which was used in NASA's Voyager and Viking space probes of the 1970s, and onboard the Galileo probe to Jupiter (launched 1989, arrived 1995). The RCA COSMAC was the first to implement CMOS technology. The CDP1802 was used because it could be run at very low power, and because its production process (silicon on sapphire) ensured much better protection against cosmic radiation and electrostatic discharge than that of any other processor of the era. Thus, the 1802 is said to be the first radiation-hardened microprocessor.

16-bit designs

The first multi-chip 16-bit microprocessor was the National Semiconductor IMP-16, introduced in early 1973. An 8-bit version of the chipset was introduced in 1974 as the IMP-8. During the same year, National introduced the first 16-bit single-chip microprocessor, the National Semiconductor PACE, which was later followed by an NMOS version, the INS8900.

Other early multi-chip 16-bit microprocessors include one used by Digital Equipment Corporation (DEC) in the LSI-11 OEM board set and the packaged PDP 11/03 minicomputer, and the Fairchild Semiconductor MicroFlame 9440, both of which were introduced in the 1975 to 1976 timeframe.


The first single-chip 16-bit microprocessor was TI's TMS 9900, which was also compatible with their TI-990 line of minicomputers. The 9900 was used in the TI 990/4 minicomputer, the TI-99/4A home computer, and the TM990 line of OEM microcomputer boards. The chip was packaged in a large ceramic 64-pin DIP package, while most 8-bit microprocessors such as the Intel 8080 used the more common, smaller, and less expensive plastic 40-pin DIP. A follow-on chip, the TMS 9980, was designed to compete with the Intel 8080, had the full TI 990 16-bit instruction set, used a plastic 40-pin package, and moved data 8 bits at a time, but could only address 16 KiB. A third chip, the TMS 9995, was a new design. The family later expanded to include the 99105 and 99110.

The Western Design Center, Inc. (WDC) introduced the CMOS 65816 16-bit upgrade of the WDC CMOS 65C02 in 1984. The 65816 16-bit microprocessor was the core of the Apple IIgs and later the Super Nintendo Entertainment System, making it one of the most popular 16-bit designs of all time.

Intel followed a different path, having no minicomputers to emulate, and instead "upsized" their 8080 design into the 16-bit Intel 8086, the first member of the x86 family which powers most modern PC type computers. Intel introduced the 8086 as a cost effective way of porting software from the 8080 lines, and succeeded in winning much business on that premise. The 8088, a version of the 8086 that used an external 8-bit data bus, was the microprocessor in the first IBM PC, the model 5150. Following up their 8086 and 8088, Intel released the 80186, 80286 and, in 1985, the 32-bit 80386, cementing their PC market dominance with the processor family's backwards compatibility.

The integrated microprocessor memory management unit (MMU) was developed by Childs et al. of Intel, and awarded US patent number 4,442,484.

32-bit designs

Upper interconnect layers on an Intel 80486DX2 die.

16-bit designs were in the market only briefly when full 32-bit implementations started to appear.

The most famous of the 32-bit designs is the MC68000, introduced in 1979. The 68K, as it was widely known, had 32-bit registers but used 16-bit internal data paths and a 16-bit external data bus to reduce pin count, and supported only 24-bit addresses. Motorola generally described it as a 16-bit processor, though it clearly has a 32-bit architecture. The combination of high speed, large (16 mebibytes) memory space and fairly low cost made it the most popular CPU design of its class. The Apple Lisa and Macintosh designs made use of the 68000, as did a host of other designs in the mid-1980s, including the Atari ST and Commodore Amiga.


The world's first single-chip fully 32-bit microprocessor, with 32-bit data paths, 32-bit buses, and 32-bit addresses, was the AT&T Bell Labs BELLMAC-32A, with first samples in 1980 and general production in 1982. After the divestiture of AT&T in 1984, it was renamed the WE 32000 (WE for Western Electric), and had two follow-on generations, the WE 32100 and WE 32200. These microprocessors were used in the AT&T 3B5 and 3B15 minicomputers; in the 3B2, the world's first desktop supermicrocomputer; in the "Companion", the world's first 32-bit laptop computer; and in "Alexander", the world's first book-sized supermicrocomputer, featuring ROM-pack memory cartridges similar to today's gaming consoles. All these systems ran the UNIX System V operating system.

Intel's first 32-bit microprocessor was the iAPX 432, which was introduced in 1981 but was not a commercial success. It had an advanced capability-based object-oriented architecture, but poor performance compared to other competing architectures such as the Motorola 68000.

Motorola's success with the 68000 led to the MC68010, which added virtual memory support. The MC68020, introduced in 1985, added full 32-bit data and address buses. The 68020 became hugely popular in the Unix supermicrocomputer market, and many small companies (e.g., Altos, Charles River Data Systems) produced desktop-size systems. Motorola followed this with the MC68030, which added the MMU into the chip, and the 68K family became the processor for everything that wasn't running DOS. The continued success led to the MC68040, which included an FPU for better math performance. A 68050 failed to achieve its performance goals and was not released, and the follow-up MC68060 was released into a market saturated by much faster RISC designs. The 68K family faded from the desktop in the early 1990s.

Other large companies designed the 68020 and its follow-ons into embedded equipment. At one point, there were more 68020s in embedded equipment than there were Intel Pentiums in PCs. The ColdFire processor cores are derivatives of the venerable 68020.

During this time (early to mid 1980s), National Semiconductor introduced a very similar 16-bit pinout, 32-bit internal microprocessor called the NS 16032 (later renamed 32016), the full 32-bit version named the NS 32032, and a line of 32-bit industrial OEM microcomputers. By the mid-1980s, Sequent introduced the first symmetric multiprocessor (SMP) server-class computer using the NS 32032. This was one of the design's few wins, and it disappeared in the late 1980s.

The MIPS R2000 (1984) and R3000 (1989) were highly successful 32-bit RISC microprocessors. They were used in high-end workstations and servers by SGI, among others.

Other designs included the interesting Zilog Z8000, which arrived too late to market to stand a chance and disappeared quickly.


In the late 1980s, "microprocessor wars" started killing off some of the microprocessors. Apparently, with only one major design win, Sequent, the NS 32032 just faded out of existence, and Sequent switched to Intel microprocessors.

From 1985 to 2003, the 32-bit x86 architectures became increasingly dominant in desktop, laptop, and server markets, and these microprocessors became faster and more capable. Intel had licensed early versions of the architecture to other companies, but declined to license the Pentium, so AMD and Cyrix built later versions of the architecture based on their own designs. During this span, these processors increased in complexity (transistor count) and capability (instructions/second) by at least a factor of 1000.

64-bit designs in personal computers

While 64-bit microprocessor designs have been in use in several markets since the early 1990s, the early 2000s have seen the introduction of 64-bit microchips targeted at the PC market.

With AMD's introduction of the first 64-bit IA-32 backwards-compatible architecture, AMD64, in September 2003, followed by Intel's own x86-64 chips, the 64-bit desktop era began. Both processors can run 32-bit legacy applications as well as the new 64-bit software. With 64-bit versions of Windows XP, Linux, and (to a certain extent) Mac OS X running natively, the software too is geared to utilise the full power of such processors. The move to 64 bits is more than just an increase in register size from IA-32, as it also doubles the number of general-purpose registers for the aging CISC designs.

The move to 64 bits by PowerPC processors had been intended since the processors' design in the early 1990s and was not a major cause of incompatibility. Existing integer registers are extended, as are all related data pathways, but, as was the case with IA-32, both floating point and vector units had been operating at or above 64 bits for several years. Unlike what happened when IA-32 was extended to x86-64, no new general-purpose registers were added in 64-bit PowerPC, so any performance gained when using the 64-bit mode for applications making no use of the larger address space is minimal.

Multicore designs

AMD Athlon 64 X2 3600 dual-core processor

Main article: Multi-core (computing)

A different approach to improving a computer's performance is to add extra processors, as in symmetric multiprocessing designs, which have been popular in servers and workstations since the early 1990s. Keeping up with Moore's Law is becoming increasingly challenging as chip-making technologies approach their physical limits. In response, microprocessor manufacturers look for other ways to improve performance, in order to hold on to the momentum of constant upgrades in the market.

A multi-core processor is simply a single chip containing more than one microprocessor core, effectively multiplying the potential performance by the number of cores (as long as the operating system and software are designed to take advantage of more than one processor). Some components, such as the bus interface and second-level cache, may be shared between cores. Because the cores are physically very close, they can interface at much faster clock speeds compared to discrete multiprocessor systems, improving overall system performance.
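
As a small illustration that software must be written to exploit extra cores, the sketch below splits a deliberately naive CPU-bound task across a pool of worker processes using Python's standard multiprocessing module; the prime-counting workload and chunk sizes are arbitrary choices for demonstration:

    # Dividing CPU-bound work across cores with a process pool.
    from multiprocessing import Pool
    import time

    def count_primes(limit):
        # Deliberately naive, CPU-bound work.
        return sum(n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))
                   for n in range(limit))

    if __name__ == "__main__":
        chunks = [200_000] * 4

        start = time.perf_counter()
        serial = [count_primes(c) for c in chunks]       # one core does everything
        print("serial:  ", round(time.perf_counter() - start, 2), "s")

        start = time.perf_counter()
        with Pool(4) as pool:                            # work spread over up to 4 cores
            parallel = pool.map(count_primes, chunks)
        print("parallel:", round(time.perf_counter() - start, 2), "s")

        # Same answers; typically less wall-clock time on a multi-core machine.
        assert serial == parallel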

In 2005, the first mass-market dual-core processors were announced and as of 2007 dual-core processors are widely used in servers, workstations and PCs while quad-core processors are now available for high-end applications in both the home and professional environments.

RISC

In the mid-1980s to early-1990s, a crop of new high-performance RISC (reduced instruction set computer) microprocessors appeared, which were initially used in special purpose machines and Unix workstations, but have since become almost universal in all roles except the Intel-standard desktop.

The first commercial design was released by MIPS Technologies, the 32-bit R2000 (the R1000 was not released). The R3000 made the design truly practical, and the R4000 introduced the world's first 64-bit design. Competing projects would result in the IBM POWER and Sun SPARC systems. Soon every major vendor was releasing a RISC design, including the AT&T CRISP, AMD 29000, Intel i860 and Intel i960, Motorola 88000, DEC Alpha and the HP-PA.

Market forces have "weeded out" many of these designs, leaving the PowerPC as the main desktop RISC processor, with the SPARC being used in Sun designs only. MIPS continues to supply some SGI systems, but is primarily used as an embedded design, notably in Cisco routers. The rest of the original crop of designs have either disappeared or are about to. Other companies have attacked niches in the market, notably ARM, originally intended for home computer use but since focused on the embedded processor market. Today RISC designs based on the MIPS, ARM or PowerPC core power the vast majority of computing devices.

As of 2006, several 64-bit architectures are still produced. These include x86-64, MIPS, SPARC, Power Architecture, and Itanium.

Special-purpose designs


A 4-bit, 2-register computer with six assembly-language instructions, made entirely of 74-series chips.

Though the term "microprocessor" has traditionally referred to a single- or multi-chip CPU or system-on-a-chip (SoC), several types of specialized processing devices have followed from the technology. The most common examples are microcontrollers, digital signal processors (DSP) and graphics processing units (GPU). Many examples of these are either not programmable, or have limited programming facilities. For example, in general GPUs through the 1990s were mostly non-programmable and have only recently gained limited facilities like programmable vertex shaders. There is no universal consensus on what defines a "microprocessor", but it is usually safe to assume that the term refers to a general-purpose CPU of some sort and not a special-purpose processor unless specifically noted.

The RCA 1802 had what is called a static design, meaning that the clock frequency could be made arbitrarily low, even to 0 Hz, a total stop condition. This let the Voyager/Viking/Galileo spacecraft use minimum electric power for long uneventful stretches of a voyage. Timers and/or sensors would awaken/speed up the processor in time for important tasks, such as navigation updates, attitude control, data acquisition, and radio communication.

Market statistics

In 2003, about $44 billion (USD) worth of microprocessors were manufactured and sold.[1] Although about half of that money was spent on CPUs used in desktop or laptop personal computers, those account for only about 0.2% of all CPUs sold.

Binary Number System

The binary numeral system, or base-2 number system, is a numeral system that represents numeric values using two symbols, usually 0 and 1. More specifically, the usual base-2 system is a positional notation with a radix of 2. Owing to its straightforward implementation in electronic circuitry, the binary system is used internally by virtually all modern computers.
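
A short Python sketch of what "positional notation with a radix of 2" means in practice (the example value 10011 is arbitrary):

    # Each binary digit d_i contributes d_i * 2**i, counting positions from the right.
    bits = "10011"                    # binary representation of decimal 19

    value = sum(int(d) * 2 ** i for i, d in enumerate(reversed(bits)))
    print(value)                      # 19, computed from the positional definition
    print(int(bits, 2))               # 19, using Python's built-in radix-2 conversion
    print(bin(19))                    # '0b10011', converting back the other way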

[] History

The ancient Indian mathematician Pingala presented the first known description of a binary numeral system in his work on prosody, several centuries BC. [1]

A full set of 8 trigrams and 64 hexagrams, analogous to the 3-bit and 6-bit binary numerals, were known to the ancient Chinese in the classic text I Ching. Similar sets of binary combinations have also been used in traditional African divination systems such as Ifá as well as in medieval Western geomancy.


An ordered binary arrangement of the hexagrams of the I Ching, representing the decimal sequence from 0 to 63, and a method for generating the same, was developed by the Chinese scholar and philosopher Shao Yong in the 11th century. However, there is no evidence that Shao understood binary computation.

In 1605 Francis Bacon discussed a system by which letters of the alphabet could be reduced to sequences of binary digits, which could then be encoded as scarcely visible variations in the font in any random text. Importantly for the general theory of binary encoding, he added that this method could be used with any objects at all: "provided those objects be capable of a twofold difference onely; as by Bells, by Trumpets, by Lights and Torches, by the report of Muskets, and any instruments of like nature."[2] (See Bacon's cipher.)

The modern binary number system was fully documented by Gottfried Leibniz in the late 17th century, in his article Explication de l'Arithmétique Binaire (published in 1703). Leibniz's system used 0 and 1, like the modern binary numeral system.

In 1854, British mathematician George Boole published a landmark paper detailing a system of logic that would become known as Boolean algebra. His logical system proved instrumental in the development of the binary system, particularly in its implementation in electronic circuitry.

In 1937, Claude Shannon produced his master's thesis at MIT that implemented Boolean algebra and binary arithmetic using electronic relays and switches for the first time in history. Entitled A Symbolic Analysis of Relay and Switching Circuits, Shannon's thesis essentially founded practical digital circuit design.

In November of 1937, George Stibitz, then working at Bell Labs, completed a relay-based computer he dubbed the "Model K" (for "Kitchen", where he had assembled it), which calculated using binary addition. Bell Labs thus authorized a full research program in late 1938 with Stibitz at the helm. Their Complex Number Computer, completed January 8, 1940, was able to calculate complex numbers. In a demonstration to the American Mathematical Society conference at Dartmouth College on September 11, 1940, Stibitz was able to send the Complex Number Calculator remote commands over telephone lines by a teletype. It was the first computing machine ever used remotely over a phone line. Some participants of the conference who witnessed the demonstration were John von Neumann, John Mauchly, and Norbert Wiener, who wrote about it in his memoirs.

[] Representation

A binary number can be represented by any sequence of bits (binary digits), which in turn may be represented by any mechanism capable of being in two mutually exclusive states. The following sequences of symbols could all be interpreted as the same binary numeric value:


0 1 1 0 1 1 0 0 0 1 1 0 1 1 1 1 0 1 1 1 0 1 1 0 0 1 1 0 0 1 0 1
| - - | - - | | | - - | - - - - | - - - | - - | | - - | | - | -
x o o x o o x x x o o x o o o o x o o o x o o x x o o x x o x o
a z z a z z a a a z z a z z z z a z z z a z z a a z z a a z z z

A binary clock might use LEDs to express binary values. In this clock, each column of LEDs shows a binary-coded decimal numeral of the traditional sexagesimal time.

The numeric value represented in each case is dependent upon the value assigned to each symbol. In a computer, the numeric values may be represented by two different voltages; on a magnetic disk, magnetic polarities may be used. A "positive", "yes", or "on" state is not necessarily equivalent to the numerical value of one; it depends on the architecture in use.

In keeping with customary representation of numerals using Arabic numerals, binary numbers are commonly written using the symbols 0 and 1. When written, binary numerals are often subscripted, prefixed or suffixed in order to indicate their base, or radix. The following notations are equivalent:

100101 binary (explicit statement of format)
100101b (a suffix indicating binary format)
100101B (a suffix indicating binary format)
bin 100101 (a prefix indicating binary format)
100101₂ (a subscript indicating base-2 (binary) notation)
%100101 (a prefix indicating binary format)
0b100101 (a prefix indicating binary format, common in programming languages)

When spoken, binary numerals are usually pronounced by pronouncing each individual digit, in order to distinguish them from decimal numbers. For example, the binary numeral "100" is pronounced "one zero zero", rather than "one hundred", to make its binary nature explicit, and for purposes of correctness. Since the binary numeral "100" is equal to the decimal value four, it would be confusing, and numerically incorrect, to refer to the numeral as "one hundred".

[] Counting in binary

Counting in binary is similar to counting in any other number system. Beginning with a single digit, counting proceeds through each symbol, in increasing order. Decimal counting uses the symbols 0 through 9, while binary only uses the symbols 0 and 1.

When the symbols for the first digit are exhausted, the next-higher digit (to the left) is incremented, and counting starts over at 0. In decimal, counting proceeds like so:

000, 001, 002, ... 007, 008, 009, (rightmost digit starts over, and next digit is incremented) 010, 011, 012, ...


   ... 090, 091, 092, ... 097, 098, 099, (rightmost two digits start over, and next digit is incremented) 100, 101, 102, ...

After a digit reaches 9, an increment resets it to 0 but also causes an increment of the next digit to the left. In binary, counting is the same except that only the two symbols 0 and 1 are used. Thus after a digit reaches 1 in binary, an increment resets it to 0 but also causes an increment of the next digit to the left:

000, 001, (rightmost digit starts over, and next digit is incremented) 010, 011, (rightmost two digits start over, and next digit is incremented) 100, 101, ...

[] Binary simplified

One can think about binary by comparing it with our usual numbers. We use a base-ten system. This means that the value of each position in a numerical value can be represented by one of ten possible symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9. We are all familiar with these and with how the decimal system works using these ten symbols. When counting, we start with the symbol 0 and proceed up to 9. We call this the "ones", or "units", place.

The "ones" place, with those digits, might be thought of as a multiplication problem. 5 can be thought of as 5 × 100 (10 to the zeroeth power, which equals 5 × 1, since any number to the zero power is one). As we move to the left of the ones place, we increase the power of 10 by one. Thus, to represent 50 in this same manner, it can be thought of as 5 × 101, or 5 × 10.

When we run out of symbols in the decimal numeral system, we "move to the left" one place and use a "1" to represent the "tens" place. Then we reset the symbol in the "ones" place back to the first symbol, zero.

Binary is a base two system which works just like our decimal system, however with only two symbols which can be used to represent numerical values: 0 and 1. We begin in the "ones" place with 0, then go up to 1. Now we are out of symbols, so to represent a higher value, we must place a "1" in the "twos" place, since we don't have a symbol we can use in the binary system for 2, like we do in the decimal system.

In the binary numeral system, the value represented as 10 is (1 × 2¹) + (0 × 2⁰). Thus, it equals "2" in our decimal system.

Binary-to-decimal equivalence: to see the actual algorithm used in computing the conversion, see the conversion guide below.


Here is another way of thinking about it: When you run out of symbols, for example 11111, add a "1" on the left end and reset all the numerals on the right to "0", producing 100000. This also works for symbols in the middle. Say the number is 100111. If you add one to it, you move the leftmost repeating "1" one space to the left (from the "fours" place to the "eights" place) and reset all the numerals on the right to "0", producing 101000.

[] Binary arithmetic

Arithmetic in binary is much like arithmetic in other numeral systems. Addition, subtraction, multiplication, and division can be performed on binary numerals.

[] Addition

The circuit diagram for a binary half adder, which adds two bits together, producing sum and carry bits.

The simplest arithmetic operation in binary is addition. Adding two single-digit binary numbers is relatively simple:

0 + 0 = 0 0 + 1 = 1 1 + 0 = 1 1 + 1 = 10 (carry:1)

Adding two "1" values produces the value "10" (spoken as "one-zero"), equivalent to the decimal value 2. This is similar to what happens in decimal when certain single-digit numbers are added together; if the result equals or exceeds the value of the radix (10), the digit to the left is incremented:

5 + 5 = 10 7 + 9 = 16

This is known as carrying in most numeral systems. When the result of an addition exceeds the value of the radix, the procedure is to "carry the one" to the left, adding it to the next positional value. Carrying works the same way in binary:

  1 1 1 1 1   (carried digits)
    0 1 1 0 1
+   1 0 1 1 1
-------------
= 1 0 0 1 0 0

In this example, two numerals are being added together: 01101₂ (13 decimal) and 10111₂ (23 decimal). The top row shows the carry bits used. Starting in the rightmost column, 1 + 1 = 10₂. The 1 is carried to the left, and the 0 is written at the bottom of the rightmost column. The second column from the right is added: 1 + 0 + 1 = 10₂ again; the 1 is carried, and 0 is written at the bottom. The third column: 1 + 1 + 1 = 11₂. This time, a 1 is carried, and a 1 is written in the bottom row. Proceeding like this gives the final answer 100100₂ (36 decimal).

When computers must add two numbers, the rule that x XOR y = (x + y) mod 2 for any two bits x and y allows the sum bit to be computed very quickly (the corresponding carry bit is simply x AND y).
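As a minimal sketch of this rule (and of the half adder pictured above), the following Python fragment computes the sum and carry bits of two single-bit inputs; the function name and the loop are illustrative choices, not part of the original text.

def half_adder(x, y):
    """Add two single bits: the sum bit is x XOR y, the carry bit is x AND y."""
    sum_bit = x ^ y      # 0+0=0, 0+1=1, 1+0=1, 1+1=0 (with a carry)
    carry_bit = x & y    # only 1 + 1 produces a carry
    return sum_bit, carry_bit

for x in (0, 1):
    for y in (0, 1):
        s, c = half_adder(x, y)
        print(x, "+", y, "=", c, s)   # reproduces the four single-bit additions listed above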

[] Subtraction

Subtraction works in much the same way:

0 − 0 = 0 0 − 1 = 1 (with borrow) 1 − 0 = 1 1 − 1 = 0

One binary numeral can be subtracted from another as follows:

    *   * * *     (starred columns are borrowed from)
  1 1 0 1 1 1 0
−     1 0 1 1 1
----------------
= 1 0 1 0 1 1 1

Subtracting a positive number is equivalent to adding a negative number of equal absolute value; computers typically use two's complement notation to represent negative values. This notation eliminates the need for a separate "subtract" operation. The subtraction can be summarized with this formula:

A - B = A + not B + 1

For further details, see two's complement.
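A minimal sketch of this identity, assuming an 8-bit register width chosen purely for the illustration (the helper name is hypothetical):

WIDTH = 8                      # assumed register width for this example
MASK = (1 << WIDTH) - 1        # 0b11111111

def subtract_via_twos_complement(a, b):
    """Compute a - b using only addition and bitwise NOT, within WIDTH bits."""
    return (a + ((~b) & MASK) + 1) & MASK

print(subtract_via_twos_complement(0b1101110, 0b0010111))  # 87, i.e. 0b1010111, matching the example above
print(subtract_via_twos_complement(13, 23))                # 246, the 8-bit two's-complement encoding of -10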

[] Multiplication

Multiplication in binary is similar to its decimal counterpart. Two numbers A and B can be multiplied by partial products: for each digit in B, the product of that digit and A is calculated and written on a new line, shifted leftward so that its rightmost digit lines up with the digit in B that was used. The sum of all these partial products gives the final result.

Since there are only two digits in binary, there are only two possible outcomes of each partial multiplication:

If the digit in B is 0, the partial product is also 0


If the digit in B is 1, the partial product is equal to A

For example, the binary numbers 1011 and 1010 are multiplied as follows:

        1 0 1 1   (A)
      × 1 0 1 0   (B)
      ---------
        0 0 0 0   ← Corresponds to a zero in B
+     1 0 1 1     ← Corresponds to a one in B
+   0 0 0 0
+ 1 0 1 1
---------------
= 1 1 0 1 1 1 0

See also Booth's multiplication algorithm.
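The partial-product procedure can be written directly as a shift-and-add loop. This is a sketch for non-negative integers (it is not Booth's algorithm), with an illustrative function name:

def multiply_binary(a, b):
    """Sum shifted copies of a, one for each 1 bit of b (the partial products above)."""
    result = 0
    shift = 0
    while b:
        if b & 1:                 # this digit of B is 1: the partial product is A, shifted
            result += a << shift
        b >>= 1                   # move on to the next digit of B
        shift += 1
    return result

print(bin(multiply_binary(0b1011, 0b1010)))   # 0b1101110, i.e. 11 × 10 = 110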

[] Division

Binary division is again similar to its decimal counterpart:

        __________
1 0 1 | 1 1 0 1 1

Here, the divisor is 101₂, or 5 decimal, while the dividend is 11011₂, or 27 decimal. The procedure is the same as that of decimal long division; here, the divisor 101₂ goes into the first three digits 110₂ of the dividend one time, so a "1" is written on the top line. This result is multiplied by the divisor, and subtracted from the first three digits of the dividend; the next digit (a "1") is included to obtain a new three-digit sequence:

            1
        __________
1 0 1 | 1 1 0 1 1
      − 1 0 1
        -----
          0 1 1

The procedure is then repeated with the new sequence, continuing until the digits in the dividend have been exhausted:

            1 0 1
        __________
1 0 1 | 1 1 0 1 1
      − 1 0 1
        -----
          0 1 1
        − 0 0 0
          -----
            1 1 1
          − 1 0 1
            -----
              1 0


Thus, the quotient of 11011₂ divided by 101₂ is 101₂, as shown on the top line, while the remainder, shown on the bottom line, is 10₂. In decimal, 27 divided by 5 is 5, with a remainder of 2.
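The same long-division procedure can be sketched in code: the divisor is compared against successive prefixes of the dividend, producing one quotient bit per step. The function name and structure are illustrative.

def divide_binary(dividend, divisor):
    """Binary long division for non-negative integers; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for i in range(dividend.bit_length() - 1, -1, -1):
        remainder = (remainder << 1) | ((dividend >> i) & 1)   # bring down the next bit
        quotient <<= 1
        if remainder >= divisor:        # the divisor "goes into" the current prefix
            remainder -= divisor
            quotient |= 1
    return quotient, remainder

q, r = divide_binary(0b11011, 0b101)    # 27 ÷ 5
print(bin(q), bin(r))                   # 0b101 0b10, i.e. quotient 5, remainder 2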

[] Bitwise operations

Main article: bitwise operation

Though not directly related to the numerical interpretation of binary symbols, sequences of bits may be manipulated using Boolean logical operators. When a string of binary symbols is manipulated in this way, it is called a bitwise operation; the logical operators AND, OR, and XOR may be performed on corresponding bits in two binary numerals provided as input. The logical NOT operation may be performed on individual bits in a single binary numeral provided as input. Sometimes, such operations may be used as arithmetic short-cuts, and may have other computational benefits as well. For example, an arithmetic shift left of a binary number is the equivalent of multiplication by a (positive, integral) power of 2.
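A short sketch of these operators, with arbitrarily chosen operands (Python integers are unbounded, so NOT needs a mask to stay within a fixed width):

a, b = 0b0110, 0b0101     # arbitrary example operands

print(bin(a & b))         # 0b100    AND: bits set in both
print(bin(a | b))         # 0b111    OR:  bits set in either
print(bin(a ^ b))         # 0b11     XOR: bits set in exactly one
print(bin(~a & 0b1111))   # 0b1001   NOT, within an assumed 4-bit width
print(bin(a << 2))        # 0b11000  shift left by 2 = multiplication by 2**2
print(a << 2 == a * 4)    # True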

[] Conversion to and from other numeral systems

[] Decimal

To convert from a base-10 integer numeral to its base-2 (binary) equivalent, the number is divided by two, and the remainder is the least-significant bit. The (integer) result is again divided by two, its remainder is the next most significant bit. This process repeats until the result of further division becomes zero.

For example, 118₁₀, in binary, is:

Operation       Remainder
118 ÷ 2 = 59    0
59 ÷ 2 = 29     1
29 ÷ 2 = 14     1
14 ÷ 2 = 7      0
7 ÷ 2 = 3       1
3 ÷ 2 = 1       1
1 ÷ 2 = 0       1

Reading the sequence of remainders from the bottom up gives the binary numeral 1110110₂.

This method works for conversion from any base, but there are better methods for bases which are powers of two, such as octal and hexadecimal given below.
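A sketch of the repeated-division procedure just described; the function name is an illustrative choice. It reproduces the table above for 118.

def to_binary(n):
    """Convert a non-negative integer to a binary digit string by repeated
    division by two, collecting the remainders (least significant first)."""
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        n, remainder = divmod(n, 2)
        bits.append(str(remainder))
    return "".join(reversed(bits))   # the remainders, read from the bottom up

print(to_binary(118))   # 1110110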

To convert from base-2 to base-10 is the reverse algorithm. Starting from the left, double the result and add the next digit until there are no more. For example, to convert 110010101101₂ to decimal:

Result                  Remaining digits
0                       110010101101
0 × 2 + 1 = 1           10010101101
1 × 2 + 1 = 3           0010101101
3 × 2 + 0 = 6           010101101
6 × 2 + 0 = 12          10101101
12 × 2 + 1 = 25         0101101
25 × 2 + 0 = 50         101101
50 × 2 + 1 = 101        01101
101 × 2 + 0 = 202       1101
202 × 2 + 1 = 405       101
405 × 2 + 1 = 811       01
811 × 2 + 0 = 1622      1
1622 × 2 + 1 = 3245     (none)

The result is 3245₁₀.
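The doubling procedure in the table is essentially Horner's method. A minimal sketch, assuming the input is a string of 0s and 1s:

def from_binary(digits):
    """Start from 0 and, for each digit, double the running result and add the digit."""
    result = 0
    for d in digits:
        result = result * 2 + int(d)
    return result

print(from_binary("110010101101"))   # 3245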

The fractional parts of a number are converted with similar methods. They are again based on the equivalence of shifting with doubling or halving.

In a fractional binary number such as 0.11010110101₂, the first digit is ½, the second ¼, and so on. So if there is a 1 in the first place after the radix point, then the number is at least ½, and vice versa. Double that number is at least 1. This suggests the algorithm: repeatedly double the number to be converted, record whether the result is at least 1, and then throw away the integer part.

For example, (1/3)₁₀, in binary, is:

Converting              Result
1/3                     0.
1/3 × 2 = 2/3 < 1       0.0
2/3 × 2 = 1 1/3 ≥ 1     0.01
1/3 × 2 = 2/3 < 1       0.010
2/3 × 2 = 1 1/3 ≥ 1     0.0101

Thus the repeating decimal fraction 0.333... is equivalent to the repeating binary fraction 0.0101...₂.
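A sketch of this doubling algorithm using exact rational arithmetic so the repetition stays visible; the use of Fraction and the digit limit are choices made for the illustration.

from fractions import Fraction

def fraction_to_binary(x, max_digits=12):
    """Repeatedly double x; emit 1 when the result reaches 1 (then drop the
    integer part), otherwise emit 0. Stops after max_digits digits."""
    digits = []
    for _ in range(max_digits):
        x *= 2
        if x >= 1:
            digits.append("1")
            x -= 1
        else:
            digits.append("0")
    return "0." + "".join(digits)

print(fraction_to_binary(Fraction(1, 3)))   # 0.010101010101, repeating as shown above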

Or for example, 0.1₁₀, in binary, is:

Converting           Result
0.1                  0.
0.1 × 2 = 0.2 < 1    0.0
0.2 × 2 = 0.4 < 1    0.00
0.4 × 2 = 0.8 < 1    0.000
0.8 × 2 = 1.6 ≥ 1    0.0001
0.6 × 2 = 1.2 ≥ 1    0.00011
0.2 × 2 = 0.4 < 1    0.000110
0.4 × 2 = 0.8 < 1    0.0001100
0.8 × 2 = 1.6 ≥ 1    0.00011001
0.6 × 2 = 1.2 ≥ 1    0.000110011
0.2 × 2 = 0.4 < 1    0.0001100110

This is also a repeating binary fraction 0.000110011... . It may come as a surprise that terminating decimal fractions can have repeating expansions in binary. It is for this reason that many are surprised to discover that 0.1 + ... + 0.1, (10 additions) differs from 1 in floating point arithmetic. In fact, the only binary fractions with terminating expansions are of the form of an integer divided by a power of 2, which 1/10 is not.
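This is easy to observe directly; a minimal check (the printed digits assume the common IEEE 754 double-precision format):

total = 0.0
for _ in range(10):
    total += 0.1          # 0.1 has no terminating binary expansion

print(total)              # 0.9999999999999999 rather than 1.0
print(total == 1.0)       # False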

The final conversion is from binary to decimal fractions. The only difficulty arises with repeating fractions, but otherwise the method is to shift the fraction to an integer, convert it as above, and then divide by the appropriate power of two in the decimal base. For example:

x                = 1100.101110011100...₂
x × 2⁶           = 1100101110.0111001110...₂
x × 2            = 11001.0111001110...₂
x × 2⁶ − x × 2   = x × 111110₂ = 1100101110₂ − 11001₂ = 1100010101₂

x = 1100010101₂ / 111110₂ = (789/62)₁₀

Another way of converting from binary to decimal, often quicker for a person familiar with hexadecimal, is to do so indirectly—first converting (x in binary) into (x in hexadecimal) and then converting (x in hexadecimal) into (x in decimal).

[] Hexadecimal

Binary may be converted to and from hexadecimal somewhat more easily. This is due to the fact that the radix of the hexadecimal system (16) is a power of the radix of the binary system (2). More specifically, 16 = 2⁴, so it takes four digits of binary to represent one digit of hexadecimal.

The following table shows each hexadecimal digit along with the equivalent decimal value and four-digit binary sequence:

Hex   Dec   Binary
0     0     0000
1     1     0001
2     2     0010
3     3     0011
4     4     0100
5     5     0101
6     6     0110
7     7     0111
8     8     1000
9     9     1001
A     10    1010
B     11    1011
C     12    1100
D     13    1101
E     14    1110
F     15    1111

To convert a hexadecimal number into its binary equivalent, simply substitute the corresponding binary digits:

3A₁₆ = 0011 1010₂
E7₁₆ = 1110 0111₂

To convert a binary number into its hexadecimal equivalent, divide it into groups of four bits. If the number of bits isn't a multiple of four, simply insert extra 0 bits at the left (called padding). For example:

1010010₂ = 0101 0010 grouped with padding = 52₁₆
11011101₂ = 1101 1101 grouped = DD₁₆

To convert a hexadecimal number into its decimal equivalent, multiply the decimal equivalent of each hexadecimal digit by the corresponding power of 16 and add the resulting values:

C0E7₁₆ = (12 × 16³) + (0 × 16²) + (14 × 16¹) + (7 × 16⁰) = (12 × 4096) + (0 × 256) + (14 × 16) + (7 × 1) = 49,383₁₀
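These groupings and positional sums can be checked with Python's built-in base conversions; a small sketch:

value = int("C0E7", 16)            # hexadecimal string to integer
print(value)                       # 49383
print(format(value, "b"))          # 1100000011100111 (each hex digit becomes 4 bits)
print(format(0b01010010, "X"))     # 52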

[] Octal

Binary is also easily converted to the octal numeral system, since octal uses a radix of 8, which is a power of two (namely, 2³, so it takes exactly three binary digits to represent an octal digit). The correspondence between octal and binary numerals is the same as for the first eight digits of hexadecimal in the table above. Binary 000 is equivalent to the octal digit 0, binary 111 is equivalent to octal 7, and so on.

Octal   Binary
0       000
1       001
2       010
3       011
4       100
5       101
6       110
7       111

Converting from octal to binary proceeds in the same fashion as it does for hexadecimal:

65₈ = 110 101₂
17₈ = 001 111₂

And from binary to octal:

101100₂ = 101 100 grouped = 54₈
10011₂ = 010 011 grouped with padding = 23₈

And from octal to decimal:

65₈ = (6 × 8¹) + (5 × 8⁰) = (6 × 8) + (5 × 1) = 53₁₀
127₈ = (1 × 8²) + (2 × 8¹) + (7 × 8⁰) = (1 × 64) + (2 × 8) + (7 × 1) = 87₁₀
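And, under the same built-in-conversion assumption, the octal examples can be checked the same way:

print(format(0o65, "b"))       # 110101
print(format(0b101100, "o"))   # 54
print(int("127", 8))           # 87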

[] Representing real numbers

Non-integers can be represented by using negative powers, which are set off from the other digits by means of a radix point (called a decimal point in the decimal system). For example, the binary number 11.01₂ thus means:

1 × 2¹   (1 × 2 = 2)      plus
1 × 2⁰   (1 × 1 = 1)      plus
0 × 2⁻¹  (0 × ½ = 0)      plus
1 × 2⁻²  (1 × ¼ = 0.25)

For a total of 3.25 decimal.

All dyadic rational numbers have a terminating binary numeral—the binary representation has a finite number of terms after the radix point. Other rational numbers have a binary representation, but instead of terminating, they recur, with a finite sequence of digits repeating indefinitely. For instance

1/3₁₀ = 1/11₂ = 0.0101010101...₂

12/17₁₀ = 1100/10001₂ = 0.10110100 10110100 10110100...₂

The phenomenon that the binary representation of any rational is either terminating or recurring also occurs in other radix-based numeral systems. See, for instance, the explanation in decimal. Another similarity is the existence of alternative representations for any terminating representation, relying on the fact that 0.111111... is the sum of the geometric series 2⁻¹ + 2⁻² + 2⁻³ + ... which is 1.

Binary numerals which neither terminate nor recur represent irrational numbers. For instance,

0.10100100010000100000100.... does have a pattern, but it is not a fixed-length recurring pattern, so the number is irrational

1.0110101000001001111001100110011111110... is the binary representation of √2, the square root of 2, another irrational number. It has no discernible pattern, although a proof that √2 is irrational requires more than this. See irrational number.

History of programming languages

This article discusses the major developments in the history of programming languages. For a detailed timeline of events, see the timeline of programming languages.

[] Prehistory

The first programming languages predate the modern computer. From the first, the languages were codes. Herman Hollerith realized that he could encode information on punch cards when he observed that railroad train conductors would encode the appearance of the ticket holders on the train tickets using the position of punched holes on the tickets. Hollerith then proceeded to encode the 1890 census data on punch cards which he made the same size as the boxes for holding US currency. (The dollar bill was later downsized.)


The first computer codes were specialized for the applications. In the first decades of the twentieth century, numerical calculations were based on decimal numbers. Eventually it was realized that logic could be represented with numbers, as well as with words. For example, Alonzo Church was able to express the lambda calculus in a formulaic way. The Turing machine was an abstraction of the operation of a tape-marking machine, for example, in use at the telephone companies. However, unlike the lambda calculus, Turing's code does not serve well as a basis for higher-level languages — its principal use is in rigorous analyses of algorithmic complexity.

Like many "firsts" in history, the first modern programming language is hard to identify. From the start, the restrictions of the hardware defined the language. Punch cards allowed 80 columns, but some of the columns had to be used for a sorting number on each card. Fortran included some keywords which were the same as English words, such as "IF", "GOTO" (go to) and "CONTINUE". The use of a magnetic drum for memory meant that computer programs also had to be interleaved with the rotations of the drum. Thus the programs were more hardware dependent than today.

To some people the answer depends on how much power and human-readability is required before the status of "programming language" is granted. Jacquard looms and Charles Babbage's Difference Engine both had simple, extremely limited languages for describing the actions that these machines should perform. One can even regard the punch holes on a player piano scroll as a limited domain-specific programming language, albeit not designed for human consumption.

[] The 1940s

In the 1940s the first recognizably modern, electrically powered computers were created. The limited speed and memory capacity forced programmers to write hand tuned assembly language programs. It was soon discovered that programming in assembly language required a great deal of intellectual effort and was error-prone.

In 1948, Konrad Zuse [1] published a paper about his programming language Plankalkül. However, it was not implemented in his time and his original contributions were isolated from other developments.

Some important languages that were developed in this time period include:

1943 - Plankalkül (Konrad Zuse)
1943 - ENIAC coding system
1949 - C-10

[] The 1950s and 1960s

In the 1950s the first three modern programming languages whose descendants are still in widespread use today were designed:


FORTRAN , the "FORmula TRANslator, invented by John W. Backus et al.; LISP , the "LISt Processor", invented by John McCarthy et al.; COBOL , the COmmon Business Oriented Language, created by the Short Range

Committee, heavily influenced by Grace Hopper.

Another milestone in the late 1950s was the publication, by a committee of American and European computer scientists, of "a new language for algorithms"; the Algol 60 Report (the "ALGOrithmic Language"). This report consolidated many ideas circulating at the time and featured two key innovations:

The use of Backus-Naur Form (BNF) for describing the language's syntax. Nearly all subsequent programming languages have used a variant of BNF to describe the context-free portion of their syntax.

The introduction of lexical scoping for names in arbitrarily nested scopes.

Algol 60 was particularly influential in the design of later languages, some of which soon became more popular. The Burroughs large systems were designed to be programmed in an extended subset of Algol.

Some important languages that were developed in this time period include:

1951 - Regional Assembly Language
1952 - Autocode
1954 - FORTRAN
1958 - LISP
1958 - ALGOL 58
1959 - COBOL
1962 - APL
1962 - Simula
1964 - BASIC
1964 - PL/I

[] 1967-1978: establishing fundamental paradigms

The period from the late 1960s to the late 1970s brought a major flowering of programming languages. Most of the major language paradigms now in use were invented in this period:

Simula, invented in the late 1960s by Nygaard and Dahl as a superset of Algol 60, was the first language designed to support object-oriented programming.

Smalltalk (mid-1970s) provided a complete ground-up design of an object-oriented language.

C, an early systems programming language, was developed by Dennis Ritchie and Ken Thompson at Bell Labs between 1969 and 1973.

Prolog, designed in 1972 by Colmerauer, Roussel, and Kowalski, was the first logic programming language.

ML built a polymorphic type system (invented by Robin Milner in 1978) on top of Lisp, pioneering statically typed functional programming languages.

Each of these languages spawned an entire family of descendants, and most modern languages count at least one of them in their ancestry.

The 1960s and 1970s also saw considerable debate over the merits of "structured programming", which essentially meant programming without the use of GOTO. This debate was closely related to language design: some languages did not include GOTO, which forced structured programming on the programmer. Although the debate raged hotly at the time, nearly all programmers now agree that, even in languages that provide GOTO, it is bad style to use it except in rare circumstances. As a result, later generations of language designers have found the structured programming debate tedious and even bewildering.

Some important languages that were developed in this time period include:

1970 - Pascal
1970 - Forth
1972 - C
1972 - Smalltalk
1972 - Prolog
1973 - ML
1978 - SQL

[] The 1980s: consolidation, modules, performance

The 1980s were years of relative consolidation. C++ combined object-oriented and systems programming. The United States government standardized Ada, a systems programming language intended for use by defense contractors. In Japan and elsewhere, vast sums were spent investigating so-called "fifth generation" languages that incorporated logic programming constructs. The functional languages community moved to standardize ML and Lisp. Rather than inventing new paradigms, all of these movements elaborated upon the ideas invented in the previous decade.

However, one important new trend in language design was an increased focus on programming for large-scale systems through the use of modules, or large-scale organizational units of code. Modula, Ada, and ML all developed notable module systems in the 1980s. Module systems were often wedded to generic programming constructs, generics being, in essence, parameterized modules (see also parametric polymorphism).

Although major new paradigms for programming languages did not appear, many researchers expanded on the ideas of prior languages and adapted them to new contexts. For example, the languages of the Argus and Emerald systems adapted object-oriented programming to distributed systems.


The 1980s also brought advances in programming language implementation. The RISC movement in computer architecture postulated that hardware should be designed for compilers rather than for human assembly programmers. Aided by processor speed improvements that enabled increasingly aggressive compilation techniques, the RISC movement sparked greater interest in compilation technology for high-level languages.

Language technology continued along these lines well into the 1990s.

Some important languages that were developed in this time period include:

1983 - Ada
1983 - C++
1985 - Eiffel
1987 - Perl
1989 - FL (Backus)

[] The 1990s: the Internet age

The rapid growth of the Internet in the mid-1990s was the next major historic event in programming languages. By opening up a radically new platform for computer systems, the Internet created an opportunity for new languages to be adopted. In particular, the Java programming language rose to popularity because of its early integration with the Netscape Navigator web browser, and various scripting languages achieved widespread use in developing customized applications for web servers. Neither of these developments represented much fundamental novelty in language design; for example, the design of Java was a more conservative version of ideas explored many years earlier in the Smalltalk community, but the widespread adoption of languages that supported features like garbage collection and strong static typing was a major change in programming practice.

Some important languages that were developed in this time period include:

1990 - Haskell
1990 - Python
1991 - Java
1993 - Ruby
1995 - PHP
2000 - C#

[] Current trends

Programming language evolution continues, in both industry and research. Some current directions:


Mechanisms for adding security and reliability verification to the language: extended static checking, information flow control, static thread safety.

Alternative mechanisms for modularity: mixins, delegates, aspects.
Component-oriented software development.
Metaprogramming, reflection or access to the abstract syntax tree.
Increased emphasis on distribution and mobility.
Integration with databases, including XML and relational databases.
Open Source as a developmental philosophy for languages, including the GNU compiler collection and recent languages such as Python, Ruby, and Squeak.
Support for Unicode so that source code (program text) is not restricted to those characters contained in the ASCII character set; allowing, for example, use of non-Latin-based scripts or extended punctuation.

Algorithm

In mathematics, computing, linguistics, and related disciplines, an algorithm is a finite list of well-defined instructions for accomplishing some task that, given an initial state, will terminate in a defined end-state.

The concept of an algorithm originated as a means of recording procedures for solving mathematical problems such as finding the common divisor of two numbers or multiplying two numbers. A partial formalization of the concept began with attempts to solve the Entscheidungsproblem (the "decision problem") that David Hilbert posed in 1928. Subsequent formalizations were framed as attempts to define "effective calculability" (cf Kleene 1943:274) or "effective method" (cf Rosser 1939:225); those formalizations included the Gödel-Herbrand-Kleene recursive functions of 1930, 1934 and 1935, Alonzo Church's lambda calculus of 1936, Emil Post's "Formulation I" of 1936, and Alan Turing's Turing machines of 1936-7 and 1939.

[] Etymology

Al-Khwārizmī, Persian astronomer and mathematician, wrote a treatise in Arabic in 825 AD, On Calculation with Hindu Numerals. (See algorism). It was translated into Latin in the 12th century as Algoritmi de numero Indorum,[1] which title was likely intended to mean "Algoritmi on the numbers of the Indians", where "Algoritmi" was the translator's rendition of the author's name; but people misunderstanding the title treated Algoritmi as a Latin plural and this led to the word "algorithm" (Latin algorismus) coming to mean "calculation method". The intrusive "th" is most likely due to a false cognate with the Greek αριθμος (arithmos) meaning "number".


Flowcharts are often used to graphically represent algorithms.

[] Why algorithms are necessary: an informal definition

No generally accepted formal definition of "algorithm" exists yet. We can, however, derive clues to the issues involved and an informal meaning of the word from the following quotation from Boolos and Jeffrey (1974, 1999):

"No human being can write fast enough, or long enough, or small enough to list all members of an enumerably infinite set by writing out their names, one after another, in some notation. But humans can do something equally useful, in the case of certain enumerably infinite sets: They can give explicit instructions for determining the nth member of the set, for arbitrary finite n. Such instructions are to be given quite explicitly, in a form in which they could be followed by a computing machine, or by a human who is capable of carrying out only very elementary operations on symbols" (boldface added, p. 19).

The words "enumerably infinite" mean "countable using integers perhaps extending to infinity". Thus Boolos and Jeffrey are saying that an algorithm implies instructions for a process that "creates" output integers from an arbitrary "input" integer or integers that, in theory, can be chosen from 0 to infinity. Thus we might expect an algorithm to be an algebraic equation such as y = m + n — two arbitrary "input variables" m and n that produce an output y. Unfortunately — as we see in Algorithm characterizations — the word algorithm implies much more than this, something on the order of (for our addition example):

Precise instructions (in language understood by "the computer") for a "fast, efficient, good" process that specifies the "moves" of "the computer" (machine or human, equipped with the necessary internally-contained information and capabilities) to find, decode, and then munch arbitrary input integers/symbols m and n, symbols + and = ... and (reliably, correctly, "effectively") produce, in a "reasonable" time, output-integer y at a specified place and in a specified format.

The concept of algorithm is also used to define the notion of decidability (logic). That notion is central for explaining how formal systems come into being starting from a small set of axioms and rules. In logic, the time that an algorithm requires to complete cannot be measured, as it is not apparently related to our customary physical dimensions. From such uncertainties, which characterize ongoing work, stems the unavailability of a definition of algorithm that suits both concrete (in some sense) and abstract usage of the term.

For a detailed presentation of the various points of view around the definition of "algorithm" see Algorithm characterizations. For examples of simple addition algorithms specified in the detailed manner described in Algorithm characterizations, see Algorithm examples.


[] Formalization of algorithms

Algorithms are essential to the way computers process information, because a computer program is essentially an algorithm that tells the computer what specific steps to perform (in what specific order) in order to carry out a specified task, such as calculating employees’ paychecks or printing students’ report cards. Thus, an algorithm can be considered to be any sequence of operations that can be performed by a Turing-complete system. Authors who assert this thesis include Savage (1987) and Gurevich (2000):

"...Turing's informal argument in favor of his thesis justifies a stronger thesis: every algorithm can be simulated by a Turing machine" (Gurevich 2000:1) ...according to Savage [1987], "an algorithm is a computational process defined by a Turing machine."(Gurevich 2000:3)

Typically, when an algorithm is associated with processing information, data are read from an input source or device, written to an output sink or device, and/or stored for further processing. Stored data are regarded as part of the internal state of the entity performing the algorithm. In practice, the state is stored in a data structure, but an algorithm requires the internal data only for specific operation sets called abstract data types.

For any such computational process, the algorithm must be rigorously defined: specified in the way it applies in all possible circumstances that could arise. That is, any conditional steps must be systematically dealt with, case-by-case; the criteria for each case must be clear (and computable).

Because an algorithm is a precise list of precise steps, the order of computation will almost always be critical to the functioning of the algorithm. Instructions are usually assumed to be listed explicitly, and are described as starting 'from the top' and going 'down to the bottom', an idea that is described more formally by flow of control.

So far, this discussion of the formalization of an algorithm has assumed the premises of imperative programming. This is the most common conception, and it attempts to describe a task in discrete, 'mechanical' means. Unique to this conception of formalized algorithms is the assignment operation, setting the value of a variable. It derives from the intuition of 'memory' as a scratchpad. There is an example below of such an assignment.

For some alternate conceptions of what constitutes an algorithm see functional programming and logic programming .

[] Termination

Some writers restrict the definition of algorithm to procedures that eventually finish. In such a category Kleene places the "decision procedure or decision method or algorithm for the question" (Kleene 1952:136). Others, including Kleene, include procedures that could run forever without stopping; such a procedure has been called a "computational method" (Knuth 1997:5) or "calculation procedure or algorithm" (Kleene 1952:137); however, Kleene notes that such a method must eventually exhibit "some object" (Kleene 1952:137).

Minsky makes the pertinent observation that if an algorithm hasn't terminated then we cannot answer the question "Will it terminate with the correct answer?":

"But if the length of the process is not known in advance, then 'trying' it may not be decisive, because if the process does go on forever — then at no time will we ever be sure of the answer" (Minsky 1967:105)

Thus the answer is: undecidable. We can never know, nor can we do an analysis beforehand to find out. The analysis of algorithms for their likelihood of termination is called Termination analysis. See Halting problem for more about this knotty issue.

In the case of a non-halting computation method (calculation procedure), success can no longer be defined in terms of halting with a meaningful output. Instead, terms of success that allow for unbounded output sequences must be defined. For example, an algorithm that verifies whether there are more zeros than ones in an infinite random binary sequence must run forever to be effective. If it is implemented correctly, however, the algorithm's output will be useful: for as long as it examines the sequence, the algorithm will give a positive response while the number of examined zeros outnumbers the ones, and a negative response otherwise. Success for this algorithm could then be defined as eventually outputting only positive responses if there are actually more zeros than ones in the sequence, and in any other case outputting any mixture of positive and negative responses.

See the examples of (im-)"proper" subtraction at partial function for more about what can happen when an algorithm fails for certain of its input numbers — e.g. (i) non-termination, (ii) production of "junk" (output in the wrong format to be considered a number) or no number(s) at all (halt ends the computation with no output), (iii) wrong number(s), or (iv) a combination of these. Kleene proposed that the production of "junk" or failure to produce a number is solved by having the algorithm detect these instances and produce e.g. an error message (he suggested "0"), or preferably, force the algorithm into an endless loop (Kleene 1952:322). Davis does this to his subtraction algorithm — he fixes his algorithm in a second example so that it is proper subtraction (Davis 1958:12-15). Along with the logical outcomes "true" and "false" Kleene also proposes the use of a third logical symbol "u" — undecided (Kleene 1952:326) — thus an algorithm will always produce something when confronted with a "proposition". The problem of wrong answers must be solved with an independent "proof" of the algorithm e.g. using induction:

"We normally require auxiliary evidence for this (that the algorithm correctly defines a mu recursive function), e.g. in the form of an inductive proof that, for each argument value, the computation terminates with a unique value" (Minsky 1967:186)


[] Expressing algorithms

Algorithms can be expressed in many kinds of notation, including natural languages, pseudocode, flowcharts, and programming languages. Natural language expressions of algorithms tend to be verbose and ambiguous, and are rarely used for complex or technical algorithms. Pseudocode and flowcharts are structured ways to express algorithms that avoid many of the ambiguities common in natural language statements, while remaining independent of a particular implementation language. Programming languages are primarily intended for expressing algorithms in a form that can be executed by a computer, but are often used as a way to define or document algorithms.

There is a wide variety of representations possible and one can express a given Turing machine program as a sequence of machine tables (see more at finite state machine and state transition table), as flowcharts (see more at state diagram), or as a form of rudimentary machine code or assembly code called "sets of quadruples" (see more at Turing machine).

Sometimes it is helpful in the description of an algorithm to supplement small "flow charts" (state diagrams) with natural-language and/or arithmetic expressions written inside "block diagrams" to summarize what the "flow charts" are accomplishing.

Representations of algorithms are generally classed into three accepted levels of Turing machine description (Sipser 2006:157):

1 High-level description:

"...prose to describe an algorithm, ignoring the implementation details. At this level we do not need to mention how the machine manages its tape or head"

2 Implementation description:

"...prose used to define the way the Turing machine uses its head and the way that it stores data on its tape. At this level we do not give details of states or transition function"

3 Formal description:

Most detailed, "lowest level", gives the Turing machine's "state table". For an example of the simple algorithm "Add m+n" described in all three levels see Algorithm examples.

[] Implementation

Most algorithms are intended to be implemented as computer programs. However, algorithms are also implemented by other means, such as in a biological neural network (for example, the human brain implementing arithmetic or an insect looking for food), in an electrical circuit, or in a mechanical device.


[] Algorithm analysis

In practice, it is important to know how much of a particular resource (such as time or storage) is required for a given algorithm. Methods have been developed for the analysis of algorithms to obtain such quantitative answers; for example, a simple algorithm that finds the largest number in an unsorted list has a time requirement of O(n), using the big O notation with n as the length of the list. At all times the algorithm only needs to remember two values: the largest number found so far, and its current position in the input list. Therefore it is said to have a space requirement of O(1).[2] (Note that the size of the inputs is not counted as space used by the algorithm.)
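A sketch of such a largest-number search (the function name is an illustrative choice): it scans the list once, remembering only the best value seen so far, hence O(n) time and O(1) extra space.

def find_largest(numbers):
    """Return the largest value in a non-empty list with a single left-to-right scan."""
    largest = numbers[0]            # the largest number found so far
    for current in numbers[1:]:     # the current position advances through the input
        if current > largest:
            largest = current
    return largest

print(find_largest([31, 41, 59, 26, 53, 58]))   # 59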

Different algorithms may complete the same task with a different set of instructions in less or more time, space, or effort than others. For example, given two different recipes for making potato salad, one may have "peel the potato" before "boil the potato" while the other presents the steps in the reverse order, yet they both call for these steps to be repeated for all potatoes and end when the potato salad is ready to be eaten.

The analysis and study of algorithms is a discipline of computer science, and is often practiced abstractly without the use of a specific programming language or implementation. In this sense, algorithm analysis resembles other mathematical disciplines in that it focuses on the underlying properties of the algorithm and not on the specifics of any particular implementation. Usually pseudocode is used for analysis as it is the simplest and most general representation.

[] Classes

There are various ways to classify algorithms, each with its own merits.

[] Classification by implementation

One way to classify algorithms is by implementation means.

Recursion or iteration: A recursive algorithm is one that invokes (makes reference to) itself repeatedly until a certain condition is met, which is a method common to functional programming. Iterative algorithms use repetitive constructs like loops, and sometimes additional data structures like stacks, to solve the given problems. Some problems are naturally suited for one implementation or the other. For example, the Towers of Hanoi is well understood in its recursive implementation. Every recursive version has an equivalent (but possibly more or less complex) iterative version, and vice versa; a small sketch of both forms follows.
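The same task written both ways; factorial is chosen here purely for brevity (the Towers of Hanoi would serve equally well).

def factorial_recursive(n):
    """Recursive form: the function invokes itself until the base case n == 0."""
    return 1 if n == 0 else n * factorial_recursive(n - 1)

def factorial_iterative(n):
    """Equivalent iterative form: a loop and an accumulator replace the call stack."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

print(factorial_recursive(5), factorial_iterative(5))   # 120 120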

Logical: An algorithm may be viewed as controlled logical deduction. This notion may be expressed as:

Algorithm = logic + control.[3]


The logic component expresses the axioms that may be used in the computation and the control component determines the way in which deduction is applied to the axioms. This is the basis for the logic programming paradigm. In pure logic programming languages the control component is fixed and algorithms are specified by supplying only the logic component. The appeal of this approach is the elegant semantics: a change in the axioms has a well defined change in the algorithm.

Serial or parallel or distributed: Algorithms are usually discussed with the assumption that computers execute one instruction of an algorithm at a time. Those computers are sometimes called serial computers. An algorithm designed for such an environment is called a serial algorithm, as opposed to parallel algorithms or distributed algorithms. Parallel algorithms take advantage of computer architectures where several processors can work on a problem at the same time, whereas distributed algorithms utilise multiple machines connected with a network. Parallel or distributed algorithms divide the problem into more symmetrical or asymmetrical subproblems and collect the results back together. The resource consumption in such algorithms is not only processor cycles on each processor but also the communication overhead between the processors. Sorting algorithms can be parallelized efficiently, but their communication overhead is expensive. Iterative algorithms are generally parallelizable. Some problems have no parallel algorithms, and are called inherently serial problems.

Deterministic or non-deterministic: Deterministic algorithms solve the problem with an exact decision at every step of the algorithm, whereas non-deterministic algorithms solve problems via guessing, although typical guesses are made more accurate through the use of heuristics.

Exact or approximate: While many algorithms reach an exact solution, approximation algorithms seek an approximation that is close to the true solution. Approximation may use either a deterministic or a random strategy. Such algorithms have practical value for many hard problems.

[] Classification by design paradigm

Another way of classifying algorithms is by their design methodology or paradigm. There are a number of paradigms, each different from the others. Furthermore, each of these categories includes many different types of algorithms. Some commonly found paradigms include:

Divide and conquer. A divide and conquer algorithm repeatedly reduces an instance of a problem to one or more smaller instances of the same problem (usually recursively), until the instances are small enough to solve easily. One such example of divide and conquer is merge sorting: the data are divided into segments, each segment is sorted, and the sorted segments are merged in the conquer phase to sort the entire data set. A simpler variant of divide and conquer is called a decrease and conquer algorithm, which solves an identical subproblem and uses the solution of this subproblem to solve the bigger problem. Divide and conquer divides the problem into multiple subproblems, so the conquer stage is more complex than in decrease and conquer algorithms. An example of a decrease and conquer algorithm is the binary search algorithm (a sketch is given below).
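A sketch of binary search as a decrease and conquer algorithm (it assumes a sorted list and returns the index of the target, or -1 if absent):

def binary_search(sorted_items, target):
    """Repeatedly halve the search interval, keeping only the half
    that can still contain the target (decrease and conquer)."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1          # discard the left half
        else:
            high = mid - 1         # discard the right half
    return -1

print(binary_search([2, 3, 5, 7, 11, 13], 11))   # 4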

Dynamic programming . When a problem shows optimal substructure, meaning the optimal solution to a problem can be constructed from optimal solutions to subproblems, and overlapping subproblems, meaning the same subproblems are used to solve many different problem instances, a quicker approach called dynamic programming avoids recomputing solutions that have already been computed. For example, the shortest path to a goal from a vertex in a weighted graph can be found by using the shortest path to the goal from all adjacent vertices. Dynamic programming and memoization go together. The main difference between dynamic programming and divide and conquer is that subproblems are more or less independent in divide and conquer, whereas subproblems overlap in dynamic programming. The difference between dynamic programming and straightforward recursion is in caching or memoization of recursive calls. When subproblems are independent and there is no repetition, memoization does not help; hence dynamic programming is not a solution for all complex problems. By using memoization or maintaining a table of subproblems already solved, dynamic programming reduces the exponential nature of many problems to polynomial complexity.
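A minimal sketch of memoization, using the Fibonacci numbers as a stand-in for overlapping subproblems (the shortest-path example in the text relies on the same caching idea):

from functools import lru_cache

@lru_cache(maxsize=None)            # the table of subproblems already solved
def fib(n):
    """Naively exponential recursion becomes linear once results are memoized."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(40))   # 102334155, computed from only 41 distinct subproblems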

The greedy method. A greedy algorithm is similar to a dynamic programming algorithm, but the difference is that solutions to the subproblems do not have to be known at each stage; instead a "greedy" choice can be made of what looks best for the moment. The greedy method extends the solution with the best possible decision (not all feasible decisions) at an algorithmic stage, based on the current local optimum and the best decision (not all possible decisions) made in the previous stage. It is not exhaustive, and does not give an accurate answer to many problems. But when it works, it will be the fastest method. The most popular greedy algorithm is finding the minimum spanning tree as given by Kruskal's algorithm.

Linear programming. When solving a problem using linear programming, specific inequalities involving the inputs are found and then an attempt is made to maximize (or minimize) some linear function of the inputs. Many problems (such as the maximum flow for directed graphs) can be stated in a linear programming way, and then be solved by a 'generic' algorithm such as the simplex algorithm. A more complex variant of linear programming is called integer programming, where the solution space is restricted to the integers.

Reduction . This technique involves solving a difficult problem by transforming it into a better known problem for which we have (hopefully) asymptotically optimal algorithms. The goal is to find a reducing algorithm whose complexity is not dominated by the resulting reduced algorithm's. For example, one selection algorithm for finding the median in an unsorted list involves first sorting the list (the expensive portion) and then pulling out the middle element in the sorted list (the cheap portion). This technique is also known as transform and conquer.


Search and enumeration. Many problems (such as playing chess) can be modeled as problems on graphs. A graph exploration algorithm specifies rules for moving around a graph and is useful for such problems. This category also includes search algorithms, branch and bound enumeration and backtracking.

The probabilistic and heuristic paradigm. Algorithms belonging to this class fit the definition of an algorithm more loosely.

1. Probabilistic algorithms are those that make some choices randomly (or pseudo-randomly); for some problems, it can in fact be proven that the fastest solutions must involve some randomness.

2. Genetic algorithms attempt to find solutions to problems by mimicking biological evolutionary processes, with a cycle of random mutations yielding successive generations of "solutions". Thus, they emulate reproduction and "survival of the fittest". In genetic programming, this approach is extended to algorithms, by regarding the algorithm itself as a "solution" to a problem.

3. Heuristic algorithms, whose general purpose is not to find an optimal solution, but an approximate solution where the time or resources needed for a perfect solution are not practical. Examples would be local search, tabu search, or simulated annealing algorithms, a class of heuristic probabilistic algorithms that vary the solution of a problem by a random amount. The name "simulated annealing" alludes to the metallurgic term for the heating and cooling of metal to achieve freedom from defects. The purpose of the random variance is to find close to globally optimal solutions rather than simply locally optimal ones, the idea being that the random element will be decreased as the algorithm settles down to a solution.

[] Classification by field of study

See also: List of algorithms

Every field of science has its own problems and needs efficient algorithms. Related problems in one field are often studied together. Some example classes are search algorithms, sorting algorithms, merge algorithms, numerical algorithms, graph algorithms, string algorithms, computational geometric algorithms, combinatorial algorithms, machine learning, cryptography, data compression algorithms and parsing techniques.

Fields tend to overlap with each other, and algorithm advances in one field may improve those of other, sometimes completely unrelated, fields. For example, dynamic programming was originally invented for optimization of resource consumption in industry, but is now used in solving a broad range of problems in many fields.

[] Classification by complexity

See also: Complexity class


Algorithms can be classified by the amount of time they need to complete compared to their input size. There is a wide variety: some algorithms complete in linear time relative to input size, some do so in an exponential amount of time or even worse, and some never halt. Additionally, some problems may have multiple algorithms of differing complexity, while other problems might have no algorithms or no known efficient algorithms. There are also mappings from some problems to other problems. Owing to this, it was found to be more suitable to classify the problems themselves instead of the algorithms into equivalence classes based on the complexity of the best possible algorithms for them.

[] Legal issues

See also: Software patents for a general overview of the patentability of software, including computer-implemented algorithms.

Algorithms, by themselves, are not usually patentable. In the United States, a claim consisting solely of simple manipulations of abstract concepts, numbers, or signals does not constitute a "process" (USPTO 2006), and hence algorithms are not patentable (as in Gottschalk v. Benson). However, practical applications of algorithms are sometimes patentable. For example, in Diamond v. Diehr, the application of a simple feedback algorithm to aid in the curing of synthetic rubber was deemed patentable. The patenting of software is highly controversial, and there are highly criticized patents involving algorithms, especially data compression algorithms, such as Unisys' LZW patent.

Additionally, some cryptographic algorithms have export restrictions (see export of cryptography).


Flowchart

A flowchart (also spelled flow-chart and flow chart) is a schematic representation of an algorithm or a process. A flowchart is one of the seven basic tools of quality control, which also include the histogram, Pareto chart, check sheet, control chart, cause-and-effect diagram, and scatter diagram (see Quality Management Glossary). They are commonly used in business/economic presentations to help the audience visualize the content better, or to find flaws in the process. Alternatively, one can use Nassi-Shneiderman diagrams.

A flowchart is described as "cross-functional" when the page is divided into different "lanes" describing the control of different organizational units. A symbol appearing in a particular "lane" is within the control of that organizational unit. This technique allows the analyst to locate the responsibility for performing an action or making a decision, and to show the relationship between different organizational units sharing responsibility for a single process.

Computer architecture

A typical vision of a computer architecture as a series of abstraction layers: hardware, firmware, assembler, kernel, operating system and applications (see also Tanenbaum 79).

In computer engineering, computer architecture is the conceptual design and fundamental operational structure of a computer system. It is a blueprint and functional description of requirements (especially speeds and interconnections) and design implementations for the various parts of a computer — focusing largely on the way by which the central processing unit (CPU) performs internally and accesses addresses in memory.

It may also be defined as the science and art of selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals.

Computer architecture comprises at least three main subcategories: [1]

Instruction set architecture , or ISA, is the abstract image of a computing system that is seen by a machine language (or assembly language) programmer, including the instruction set, memory address modes, processor registers, and address and data formats.

Microarchitecture, also known as computer organization, is a lower-level, more concrete description of the system that involves how the constituent parts of the system are interconnected and how they interoperate in order to implement the ISA [2]. The size of a computer's cache, for instance, is an organizational issue that generally has nothing to do with the ISA.

System Design which includes all of the other hardware components within a computing system such as:

1. system interconnects such as computer buses and switches
2. memory controllers and hierarchies
3. CPU off-load mechanisms such as direct memory access
4. issues like multi-processing.

Once both the ISA and the microarchitecture have been specified, the actual device needs to be designed into hardware. This design process is often called implementation. Implementation is usually not considered architectural definition, but rather hardware design engineering.

Implementation can be further broken down into three pieces:

Logic Implementation/Design - where the blocks that were defined in the microarchitecture are implemented as logic equations.

Circuit Implementation/Design - where speed critical blocks or logic equations or logic gates are implemented at the transistor level.

Physical Implementation/Design - where the circuits are drawn out, the different circuit components are placed in a chip floor-plan or on a board and the wires connecting them are routed.

For CPUs, the entire implementation process is often called CPU design.

More specific usages of the term include more general wider-scale hardware architectures, such as cluster computing and Non-Uniform Memory Access (NUMA) architectures.

[] Design goals

The exact form of a computer system depends on the constraints and goals for which it was optimized. Computer architectures usually trade off standards, cost, memory capacity, latency and throughput. Sometimes other considerations, such as features, size, weight, reliability, expandability and power consumption are factors as well.

The most common scheme carefully chooses the bottleneck that most reduces the computer's speed. Ideally, the cost is allocated proportionally to assure that the data rate is nearly the same for all parts of the computer, with the most costly part being the slowest. This is how skillful commercial integrators optimize personal computers.

[] Cost

Generally cost is held constant, determined by either system or commercial requirements.

[] Performance

Computer performance is often described in terms of clock speed (usually in MHz or GHz), which refers to the cycles per second of the main clock of the CPU. However, this metric is somewhat misleading, as a machine with a higher clock rate may not necessarily have higher performance. As a result, manufacturers have moved away from clock speed as the sole measure of performance. The amount of cache a processor contains also affects performance: if the clock speed is the car, the cache is the traffic lights along the route, because no matter how fast the car can go, it still has to stop whenever the data it needs is not at hand. A processor with both a higher clock rate and a larger cache will generally be faster.

Modern CPUs can execute multiple instructions per clock cycle, which dramatically speeds up a program. Other factors influence speed, such as the mix of functional units, bus speeds, available memory, and the type and order of instructions in the programs being run.

There are two main types of speed, latency and throughput. Latency is the time between the start of a process and its completion. Throughput is the amount of work done per unit time. Interrupt latency is the guaranteed maximum response time of the system to an electronic event (e.g. when the disk drive finishes moving some data). Performance is affected by a very wide range of design choices — for example, adding cache usually makes latency worse (slower) but makes throughput better. Computers that control machinery usually need low interrupt latencies. These computers operate in a real-time environment and fail if an operation is not completed in a specified amount of time. For example, computer-controlled anti-lock brakes must begin braking almost immediately after they have been instructed to brake.
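A rough sketch of the distinction (the workload is an arbitrary placeholder, not from the text): latency is the time for one operation, throughput is how many operations complete per unit of time.

import time

def operation():
    sum(range(10_000))                             # stand-in for one unit of work

start = time.perf_counter()
operation()
latency = time.perf_counter() - start              # seconds for a single operation

count = 1000
start = time.perf_counter()
for _ in range(count):
    operation()
throughput = count / (time.perf_counter() - start) # operations per second

print(f"latency ~{latency:.6f} s, throughput ~{throughput:.0f} ops/s")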

The performance of a computer can be measured using other metrics, depending upon its application domain. A system may be CPU bound (as in numerical calculation), I/O bound (as in a webserving application) or memory bound (as in video editing). Power consumption has become important in servers and portable devices like laptops.

Benchmarking tries to take all these factors into account by measuring the time a computer takes to run through a series of test programs. Although benchmarking shows strengths, it may not help one to choose a computer. Often the measured machines split on different measures. For example, one system might handle scientific applications quickly, while another might play popular video games more smoothly. Furthermore, designers have been known to add special features to their products, whether in hardware or software, which permit a specific benchmark to execute quickly but which do not offer similar advantages to other, more general tasks.

[] Power consumption

Power consumption is another design criterion that factors into the design of modern computers. Power efficiency can often be traded for performance or cost benefits. With the increasing power density of modern circuits as the number of transistors per chip scales (Moore's Law), power efficiency has increased in importance. Recent processor designs such as the Intel Core 2 put more emphasis on increasing power efficiency. Also, in the world of embedded computing, power efficiency has long been and remains a primary design goal next to performance.


[] Historical perspective

Early usage in computer context

The term “architecture” in computer literature can be traced to the work of Lyle R. Johnson and Frederick P. Brooks, Jr., members in 1959 of the Machine Organization department in IBM’s main research center. Johnson had occasion to write a proprietary research communication about Stretch, an IBM-developed supercomputer for Los Alamos Scientific Laboratory; in attempting to characterize his chosen level of detail for discussing the luxuriously embellished computer, he noted that his description of formats, instruction types, hardware parameters, and speed enhancements aimed at the level of “system architecture” – a term that seemed more useful than “machine organization.” Subsequently Brooks, one of the Stretch designers, started Chapter 2 of a book (Planning a Computer System: Project Stretch, ed. W. Buchholz, 1962) by writing, “Computer architecture, like other architecture, is the art of determining the needs of the user of a structure and then designing to meet those needs as effectively as possible within economic and technological constraints.” Brooks went on to play a major role in the development of the IBM System/360 line of computers, where “architecture” gained currency as a noun with the definition “what the user needs to know.” Later the computer world would employ the term in many less-explicit ways.

The first mention of the term architecture in the refereed computer literature is in a 1964 article describing the IBM System/360. [3] The article defines architecture as the set of “attributes of a system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flow and controls, the logical design, and the physical implementation.” In the definition, the programmer perspective of the computer’s functional behavior is key. The conceptual structure part of an architecture description makes the functional behavior comprehensible, and extrapolatable to a range of use cases. Only later on did ‘internals’ such as “the way by which the CPU performs internally and accesses addresses in memory,” mentioned above, slip into the definition of computer architecture.

TYPICAL VERSION OF COMPUTER ARCHITECTURE (abstraction layers, highest to lowest):

OS and applications
Kernel
Assembler
Firmware
Hardware


Operating system

An operating system (OS) is the software that manages the sharing of the resources of a computer. An operating system processes raw system data and user input, and responds by allocating and managing tasks and internal system resources as a service to users and programs of the system. At the foundation of all system software, an operating system performs basic tasks such as controlling and allocating memory, prioritizing system requests, controlling input and output devices, facilitating networking and managing file systems. Most operating systems come with an application that provides a user interface for managing the operating system, such as a command line interpreter or graphical user interface. The operating system forms a platform for other system software and for application software. Windows, Mac OS X, and Linux are three of the most popular operating systems for personal computers.

[] Services

Main article: Kernel (computer science)

[] Process management

Every program running on a computer, be it a service or an application, is a process. As long as a von Neumann architecture is used to build computers, only one process per CPU can be run at a time. Older microcomputer OSes such as MS-DOS did not attempt to bypass this limit, with the exception of interrupt processing, and only one process could be run under them (although DOS itself featured TSR as a very partial and not too easy to use solution). Mainframe operating systems have had multitasking capabilities since the early 1960s. Modern operating systems enable concurrent execution of many processes at once via multitasking even with one CPU. Process management is an operating system's way of dealing with running multiple processes. Since most computers contain one processor with one core, multitasking is done by simply switching processes quickly. Depending on the operating system, as more processes run, either each time slice will become smaller or there will be a longer delay before each process is given a chance to run. Process management involves computing and distributing CPU time as well as other resources. Most operating systems allow a process to be assigned a priority which affects its allocation of CPU time. Interactive operating systems also employ some level of feedback in which the task with which the user is working receives higher priority. Interrupt driven processes will normally run at a very high priority. In many systems there is a background process, such as the System Idle Process in Windows, which will run when no other process is waiting for the CPU.
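A toy sketch of the time-slicing idea (process names and burst times are invented; a real scheduler is far more involved): each ready process runs for a fixed slice in turn until its remaining work is done.

from collections import deque

def round_robin(processes, time_slice=2):
    ready = deque(processes.items())               # (name, remaining_time) pairs
    clock = 0
    while ready:
        name, remaining = ready.popleft()
        run = min(time_slice, remaining)
        clock += run                               # the process runs for its slice
        remaining -= run
        print(f"t={clock:>2}: ran {name} for {run}, {remaining} left")
        if remaining > 0:
            ready.append((name, remaining))        # not finished: back of the queue

round_robin({"editor": 5, "compiler": 3, "player": 4})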

[] Memory management


Current computer architectures arrange the computer's memory in a hierarchical manner, starting from the fastest registers, CPU cache, random access memory and disk storage. An operating system's memory manager coordinates the use of these various types of memory by tracking which one is available, which is to be allocated or deallocated and how to move data between them. This activity, usually referred to as virtual memory management, increases the amount of memory available for each process by making the disk storage seem like main memory. There is a speed penalty associated with using disks or other slower storage as memory – if running processes require significantly more RAM than is available, the system may start thrashing. This can happen either because one process requires a large amount of RAM or because two or more processes compete for a larger amount of memory than is available. This then leads to constant transfer of each process's data to slower storage.

Another important part of memory management is managing virtual addresses. If multiple processes are in memory at once, they must be prevented from interfering with each other's memory (unless there is an explicit request to utilise shared memory). This is achieved by having separate address spaces. Each process sees the whole virtual address space, typically from address 0 up to the maximum size of virtual memory, as uniquely assigned to it. The operating system maintains a page table that maps virtual addresses to physical addresses. These memory allocations are tracked so that when a process terminates, all memory used by that process can be made available for other processes.
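A simplified sketch of the translation step (the page size and table contents are assumed toy values): the virtual page number indexes the page table, and the offset within the page is kept unchanged.

PAGE_SIZE = 4096                        # 4 KiB pages, a common choice

page_table = {0: 7, 1: 3, 2: 11}        # virtual page -> physical frame (toy values)

def translate(virtual_address):
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        raise LookupError("page fault")            # the OS would fetch the page from disk
    return page_table[page] * PAGE_SIZE + offset

print(hex(translate(0x1234)))           # virtual page 1, offset 0x234 -> 0x3234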

The operating system can also write inactive memory pages to secondary storage. This process is called "paging" or "swapping" – the terminology varies between operating systems.

It is also typical for operating systems to employ otherwise unused physical memory as a page cache; requests for data from a slower device can be retained in memory to improve performance. The operating system can also pre-load the in-memory cache with data that may be requested by the user in the near future; SuperFetch is an example of this.

[] Disk and file systems

All operating systems include support for a variety of file systems.

Modern file systems comprise a hierarchy of directories. While the idea is conceptually similar across all general-purpose file systems, some differences in implementation exist. Two noticeable examples of this are the character used to separate directories, and case sensitivity.

Unix demarcates its path components with a slash (/), a convention followed by operating systems that emulated it or at least its concept of hierarchical directories, such as Linux, Amiga OS and Mac OS X. MS-DOS also emulated this feature, but had already also adopted the CP/M convention of using slashes for additional options to commands, so instead used the backslash (\) as its component separator. Microsoft Windows continues with this convention; Japanese editions of Windows use ¥, and Korean editions use ₩ [citation needed]. Versions of Mac OS prior to OS X use a colon (:) for a path separator. RISC OS uses a period (.).
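A small illustration of the separator difference (not from the text), using Python's pathlib, which models both conventions explicitly:

from pathlib import PurePosixPath, PureWindowsPath

print(PurePosixPath("usr", "local", "bin"))        # usr/local/bin
print(PureWindowsPath("C:\\", "Users", "guest"))   # C:\Users\guest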

Unix and Unix-like operating systems allow for any character in file names other than the slash (including line feed (LF) and other control characters). Unix file names are case sensitive, which allows multiple files to be created with names that differ only in case. By contrast, Microsoft Windows file names are not case sensitive by default. Windows also has a larger set of punctuation characters that are not allowed in file names.

File systems may provide journaling, which provides safe recovery in the event of a system crash. A journaled file system writes information twice: first to the journal, which is a log of file system operations, then to its proper place in the ordinary file system. In the event of a crash, the system can recover to a consistent state by replaying a portion of the journal. In contrast, non-journaled file systems typically need to be examined in their entirety by a utility such as fsck or chkdsk. Soft updates is an alternative to journalling that avoids the redundant writes by carefully ordering the update operations. Log-structured file systems and ZFS also differ from traditional journaled file systems in that they avoid inconsistencies by always writing new copies of the data, eschewing in-place updates.
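A toy sketch of the journaling idea (the file name and record format are invented; real file systems are far more careful): every update is first appended to a journal and only then applied, so that after a crash the journal can be replayed to reach a consistent state.

import json

def journaled_write(journal_path, data, key, value):
    with open(journal_path, "a") as journal:
        journal.write(json.dumps({"key": key, "value": value}) + "\n")
        journal.flush()                            # first: record the intent durably
    data[key] = value                              # second: apply to the real structure

def replay(journal_path, data):
    with open(journal_path) as journal:
        for line in journal:                       # after a crash, re-apply every entry
            record = json.loads(line)
            data[record["key"]] = record["value"]
    return data

store = {}
journaled_write("journal.log", store, "owner", "alice")
print(replay("journal.log", {}))                   # {'owner': 'alice'}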

Many Linux distributions support some or all of ext2, ext3, ReiserFS, Reiser4, GFS, GFS2, OCFS, OCFS2, and NILFS. Linux also has full support for XFS and JFS, along with the FAT file systems, and NTFS.

Microsoft Windows includes support for FAT12, FAT16, FAT32, and NTFS. The NTFS file system is the most efficient and reliable of the four Windows file systems, and as of Windows Vista, is the only file system which the operating system can be installed on. Windows Embedded CE 6.0 introduced ExFAT, a file system suitable for flash drives.

Mac OS X supports HFS+ as its primary file system, and it supports several other file systems as well, including FAT16, FAT32, NTFS and ZFS.

Common to all these (and other) operating systems is support for file systems typically found on removable media. FAT12 is the file system most commonly found on floppy discs. ISO 9660 and Universal Disk Format are two common formats that target Compact Discs and DVDs, respectively. Mount Rainier is a newer extension to UDF, supported by Linux 2.6 kernels and Windows Vista, that facilitates rewriting to DVDs in the same fashion as has been possible with floppy disks.

[] Networking

Most current operating systems are capable of using the TCP/IP networking protocols. This means that one system can appear on the network of another and share resources such as files, printers, and scanners using either wired or wireless connections.


Many operating systems also support one or more vendor-specific legacy networking protocols as well, for example, SNA on IBM systems, DECnet on systems from Digital Equipment Corporation, and Microsoft-specific protocols on Windows. Specific protocols for specific tasks may also be supported such as NFS for file access.

[] Security

Many operating systems include some level of security. Security is based on the two ideas that:

The operating system provides access to a number of resources, directly or indirectly, such as files on a local disk, privileged system calls, personal information about users, and the services offered by the programs running on the system;

The operating system is capable of distinguishing between some requesters of these resources who are authorized (allowed) to access the resource, and others who are not authorized (forbidden). While some systems may simply distinguish between "privileged" and "non-privileged", systems commonly have a form of requester identity, such as a user name. Requesters, in turn, divide into two categories:

o Internal security: an already running program. On some systems, a program, once it is running, has no limitations, but commonly the program has an identity which it keeps and which is used to check all of its requests for resources.

o External security: a new request from outside the computer, such as a login at a connected console or some kind of network connection. To establish identity there may be a process of authentication. Often a username must be quoted, and each username may have a password. Other methods of authentication, such as magnetic cards or biometric data, might be used instead. In some cases, especially connections from the network, resources may be accessed with no authentication at all.

In addition to the allow/disallow model of security, a system with a high level of security will also offer auditing options. These would allow tracking of requests for access to resources (such as, "who has been reading this file?").

Security of operating systems has long been a concern because of highly sensitive data held on computers, both of a commercial and military nature. The United States Government Department of Defense (DoD) created the Trusted Computer System Evaluation Criteria (TCSEC) which is a standard that sets basic requirements for assessing the effectiveness of security. This became of vital importance to operating system makers, because the TCSEC was used to evaluate, classify and select computer systems being considered for the processing, storage and retrieval of sensitive or classified information.


[] Internal security

Internal security can be thought of as protecting the computer's resources from the programs concurrently running on the system. Most operating systems set programs running natively on the computer's processor, so the problem arises of how to stop these programs doing the same task and having the same privileges as the operating system (which is after all just a program too). Processors used for general purpose operating systems generally have a hardware concept of privilege. Generally less privileged programs are automatically blocked from using certain hardware instructions, such as those to read or write from external devices like disks. Instead, they have to ask the privileged program (operating system kernel) to read or write. The operating system therefore gets the chance to check the program's identity and allow or refuse the request.

An alternative strategy, and the only sandbox strategy available in systems that do not meet the Popek and Goldberg virtualization requirements, is for the operating system not to run user programs as native code, but instead to either emulate a processor or provide a host for a p-code based system such as Java.

Internal security is especially relevant for multi-user systems; it allows each user of the system to have private files that the other users cannot tamper with or read. Internal security is also vital if auditing is to be of any use, since a program can potentially bypass the operating system, inclusive of bypassing auditing.

[] External security

Typically an operating system offers (or hosts) various services to other network computers and users. These services are usually provided through ports, or numbered access points, beyond the operating system's network address. Services include offerings such as file sharing, print services, email, web sites, and file transfer protocols (FTP), most of which can have compromised security.

At the front line of security are hardware devices known as firewalls or intrusion detection/prevention systems. At the operating system level, there are a number of software firewalls available, as well as intrusion detection/prevention systems. Most modern operating systems include a software firewall, which is enabled by default. A software firewall can be configured to allow or deny network traffic to or from a service or application running on the operating system. Therefore, one can install and be running an insecure service, such as Telnet or FTP, and not have to be threatened by a security breach because the firewall would deny all traffic trying to connect to the service on that port.
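A minimal sketch of the allow/deny behaviour (the rule table is invented for the example): traffic to a port is permitted only if a rule explicitly allows it, and everything else is dropped by default.

RULES = {22: "allow", 23: "deny", 80: "allow"}     # port -> policy (toy values)

def filter_packet(port):
    return RULES.get(port, "deny") == "allow"      # default deny for unknown ports

for port in (80, 23, 8080):
    print(port, "allowed" if filter_packet(port) else "blocked")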

[] Graphical user interfaces

Today, most modern operating systems contain Graphical User Interfaces (GUIs, pronounced goo-eez). A few older operating systems tightly integrated the GUI into the kernel; in the original implementations of Microsoft Windows and Mac OS, for example, the graphical subsystem was actually part of the operating system. More modern operating systems are modular, separating the graphics subsystem from the kernel (as is now done in Linux and Mac OS X) so that the graphics subsystem is not part of the OS at all.

Many operating systems allow the user to install or create any user interface they desire. The X Window System in conjunction with GNOME or KDE is a commonly found setup on most Unix and Unix derivative (BSD, Linux, Minix) systems.

Graphical user interfaces evolve over time. For example, Windows has modified its user interface almost every time a new major version of Windows is released, and the Mac OS GUI changed dramatically with the introduction of Mac OS X in 2001.

[] Device drivers

A device driver is a specific type of computer software developed to allow interaction with hardware devices. Typically this constitutes an interface for communicating with the device, through the specific computer bus or communications subsystem that the hardware is connected to, providing commands to and/or receiving data from the device, and, on the other end, the requisite interfaces to the operating system and software applications. It is a specialized, hardware-dependent and operating-system-specific program that enables another program, typically the operating system or an application running under the operating system kernel, to interact transparently with a hardware device, and it usually provides the interrupt handling needed for asynchronous, time-dependent hardware interfacing.

The key design goal of device drivers is abstraction. Every model of hardware (even within the same class of device) is different. Newer models also are released by manufacturers that provide more reliable or better performance, and these newer models are often controlled differently. Computers and their operating systems cannot be expected to know how to control every device, both now and in the future. To solve this problem, OSes essentially dictate how every type of device should be controlled. The function of the device driver is then to translate these OS mandated function calls into device specific calls. In theory a new device, which is controlled in a new manner, should function correctly if a suitable driver is available. This new driver will ensure that the device appears to operate as usual from the operating system's point of view.
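A schematic sketch of that translation layer (class and device names are invented): the OS defines one abstract interface per device class, and each driver maps those calls onto its own hardware-specific operations.

from abc import ABC, abstractmethod

class BlockDevice(ABC):                            # the interface the OS dictates
    @abstractmethod
    def read_block(self, number: int) -> bytes: ...

class RamDiskDriver(BlockDevice):                  # one hardware-specific translation
    def __init__(self, blocks):
        self._blocks = blocks
    def read_block(self, number: int) -> bytes:
        return self._blocks[number]                # "device-specific" access

def os_read(device: BlockDevice, number: int) -> bytes:
    return device.read_block(number)               # the OS only sees the interface

print(os_read(RamDiskDriver([b"boot", b"data"]), 1))   # b'data'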

[] History

Main article: History of operating systems

The first computers did not have operating systems. By the early 1960s, commercial computer vendors were supplying quite extensive tools for streamlining the development, scheduling, and execution of jobs on batch processing systems. Examples were produced by UNIVAC and Control Data Corporation, amongst others.

[] Mainframes

Through the 1960s, several major concepts were developed, driving the development of operating systems. The development of the IBM System/360 produced a family of mainframe computers available in widely differing capacities and price points, for which a single operating system OS/360 was planned (rather than developing ad-hoc programs for every individual model). This concept of a single OS spanning an entire product line was crucial for the success of System/360 and, in fact, IBM's current mainframe operating systems are distant descendants of this original system; applications written for the OS/360 can still be run on modern machines. OS/360 also contained another important advance: the development of the hard disk permanent storage device (which IBM called DASD). Another key development was the concept of time-sharing: the idea of sharing the resources of expensive computers amongst multiple computer users interacting in real time with the system. Time sharing allowed all of the users to have the illusion of having exclusive access to the machine; the Multics timesharing system was the most famous of a number of new operating systems developed to take advantage of the concept.

[] Midrange systems

Multics, particularly, was an inspiration to a number of operating systems developed in the 1970s, notably Unix by Dennis Ritchie and Ken Thompson. Another commercially-popular minicomputer operating system was VMS.

[] Microcomputer era

The first microcomputers did not have the capacity or need for the elaborate operating systems that had been developed for mainframes and minis; minimalistic operating systems were developed, often loaded from ROM and known as Monitors. One notable early disk-based operating system was CP/M, which was supported on many early microcomputers and was largely cloned in creating MS-DOS, which became wildly popular as the operating system chosen for the IBM PC (IBM's version of it was called IBM-DOS or PC-DOS), its successors making Microsoft one of the world's most profitable companies. The major alternative throughout the 1980s in the microcomputer market was Mac OS, tied intimately to the Apple Macintosh computer.

By the 1990s, the microcomputer had evolved to the point where, as well as extensive GUI facilities, the robustness and flexibility of operating systems of larger computers became increasingly desirable. Microsoft's response to this change was the development of Windows NT, which served as the basis for Microsoft's desktop operating system line starting in 2001. Apple rebuilt their operating system on top of a Unix core as Mac OS X, also released in 2001. Hobbyist-developed reimplementations of Unix, assembled with the tools from the GNU Project, also became popular; versions based on the Linux kernel are by far the most popular, with the BSD derived UNIXes holding a small portion of the server market.

The growing complexity of embedded devices has led to increasing use of embedded operating systems.

[] Today

Modern operating systems usually feature a graphical user interface (GUI) which uses a pointing device such as a mouse or stylus for input, in addition to the keyboard. Older models, and operating systems not designed for direct human interaction (such as web servers), generally use a command line interface (CLI), typically with only the keyboard for input. Both models are centered on a "shell" which accepts and processes commands from the user (e.g. clicking on a button, or a typed command at a prompt).

The choice of OS may be dependent on the hardware architecture, specifically the CPU, with only Linux and BSD running on almost any CPU. Windows NT 3.1, which is no longer supported, was ported to the DEC Alpha and MIPS Magnum. Since the mid-1990s, the most commonly used operating systems have been the Microsoft Windows family, Linux, and other Unix-like operating systems, most notably Mac OS X. Mainframe computers and embedded systems use a variety of different operating systems, many with no direct connection to Windows or Unix. QNX and VxWorks are two common embedded operating systems, the latter being used in network infrastructure hardware equipment.

[] Personal computers

IBM PC compatible - Microsoft Windows, Unix variants, and Linux variants.
Apple Macintosh - Mac OS X (a Unix variant), Windows (on x86 Macintosh machines only), Linux and BSD.

[] Mainframe computers

The earliest operating systems were developed for mainframe computer architectures in the 1960s. The enormous investment in software for these systems caused most of the original computer manufacturers to continue to develop hardware and operating systems that are compatible with those early operating systems. Those early systems pioneered many of the features of modern operating systems. Mainframe operating systems that are still supported include:

Burroughs MCP -- B5000, 1961, to Unisys Clearpath/MCP, present.
IBM OS/360 -- IBM System/360, 1964, to IBM zSeries, present.
UNIVAC EXEC 8 -- UNIVAC 1108, 1964, to Unisys Clearpath IX, present.


Modern mainframes typically also run Linux or Unix variants. A "Datacenter" variant of Windows Server 2003 is also available for some mainframe systems.

[] Embedded systems

Embedded systems use a variety of dedicated operating systems. In some cases, the "operating system" software is directly linked to the application to produce a monolithic special-purpose program. In the simplest embedded systems, there is no distinction between the OS and the application. Embedded systems that have certain time requirements are known as Real-time operating systems.

[] Unix-like operating systems

A customized KDE desktop running under Linux.

The Unix-like family is a diverse group of operating systems, with several major sub-categories including System V, BSD, and Linux. The name "UNIX" is a trademark of The Open Group which licenses it for use with any operating system that has been shown to conform to their definitions. "Unix-like" is commonly used to refer to the large set of operating systems which resemble the original Unix.

Unix systems run on a wide variety of machine architectures. They are used heavily as server systems in business, as well as workstations in academic and engineering environments. Free software Unix variants, such as Linux and BSD, are popular in these areas. Unix and Unix-like systems have not reached significant market share in the consumer and corporate desktop market, although there is some growth in this area, notably by the Ubuntu Linux distribution. Linux on the desktop is also popular in the developer and hobbyist operating system development communities. (see below)

Market share statistics for freely available operating systems are usually inaccurate, since most free operating systems are not purchased, making usage under-represented. On the other hand, market share statistics based on total downloads of free operating systems are often inflated, as there is no economic disincentive to acquiring multiple copies, so users can download several systems, test them, and decide which they like best.

Some Unix variants like HP's HP-UX and IBM's AIX are designed to run only on that vendor's hardware. Others, such as Solaris, can run on multiple types of hardware, including x86 servers and PCs. Apple's Mac OS X, a hybrid kernel-based BSD variant derived from NeXTSTEP, Mach, and FreeBSD, has replaced Apple's earlier (non-Unix) Mac OS.


[] Open source

Over the past several years, the trend in the Unix and Unix-like space has been to open source operating systems. Many areas previously dominated by UNIX have seen significant inroads by Linux; Solaris source code is now the basis of the OpenSolaris project.

The team at Bell Labs that designed and developed Unix went on to develop Plan 9 and Inferno, which were designed for modern distributed environments. They had graphics built in, unlike their Unix counterparts, which added graphics to the design later. Plan 9 did not become popular because, unlike many Unix distributions, it was not originally free. It has since been released under the free software/open source Lucent Public License and has an expanding community of developers. Inferno was sold to Vita Nuova and has been released under a GPL/MIT license.

[] Microsoft Windows

A Vista desktop launched for the first time.

The Microsoft Windows family of operating systems originated as a graphical layer on top of the older MS-DOS environment for the IBM PC. Modern versions are based on the newer Windows NT core that first took shape in OS/2 and borrowed from VMS. Windows runs on 32-bit and 64-bit Intel and AMD processors, although earlier versions also ran on the DEC Alpha, MIPS, Fairchild (later Intergraph) Clipper and PowerPC architectures (some work was done to port it to the SPARC architecture).

As of July 2007, Microsoft Windows held a large share of the worldwide desktop market, although some predict this will dwindle due to Microsoft's restrictive licensing, (CD-Key) registration, and customer practices, which have caused increased interest in open source operating systems. Windows is also used on low-end and mid-range servers, supporting applications such as web servers and database servers. In recent years, Microsoft has spent significant marketing and research & development money to demonstrate that Windows is capable of running any enterprise application, which has resulted in consistent price/performance records (see the TPC) and significant acceptance in the enterprise market.

The most widely used version of the Microsoft Windows family is Microsoft Windows XP, released on October 25, 2001. The latest release of Windows XP is Windows XP Service Pack 2, released on August 6, 2004.

In November 2006, after more than five years of development work, Microsoft released Windows Vista, a major new version of Microsoft Windows which contains a large number of new features and architectural changes. Chief amongst these are a new user interface and visual style called Windows Aero, a number of new security features such as User Account Control, and new multimedia applications such as Windows DVD Maker.

[] Mac OS X

Apple's Upcoming Mac OS X v10.5 ("Leopard")

Mac OS X is a line of proprietary, graphical operating systems developed, marketed, and sold by Apple Inc., the latest of which is pre-loaded on all currently shipping Macintosh computers. Mac OS X is the successor to the original Mac OS, which had been Apple's primary operating system since 1984. Unlike its predecessor, Mac OS X is a UNIX operating system built on technology that had been developed at NeXT through the second half of the 1980s and up until Apple purchased the company in early 1997.

The operating system was first released in 1999 as Mac OS X Server 1.0, with a desktop-oriented version (Mac OS X v10.0) following in March 2001. Since then, four more distinct "end-user" and "server" editions of Mac OS X have been released, the most recent being Mac OS X v10.4, which was first made available in April 2005. Releases of Mac OS X are named after big cats; Mac OS X v10.4 is usually referred to by Apple and users as "Tiger". In October 2007, Apple will release Mac OS X 10.5, nicknamed "Leopard".

The server edition, Mac OS X Server, is architecturally identical to its desktop counterpart but usually runs on Apple's line of Macintosh server hardware. Mac OS X Server includes workgroup management and administration software tools that provide simplified access to key network services, including a mail transfer agent, a Samba server, an LDAP server, a domain name server, and others.

[] Hobby operating system development

Operating system development, or OSDev for short, has a large cult following as a hobby. Some operating systems, such as Linux, have grown out of hobby operating system projects. The design and implementation of an operating system requires skill and determination, and the term can cover anything from a basic "Hello World" boot loader to a fully featured kernel.

[] Other

Mainframe operating systems, such as IBM's z/OS, and embedded operating systems such as VxWorks, eCos, and Palm OS, are usually unrelated to Unix and Windows; exceptions are Windows CE, Windows NT Embedded 4.0 and Windows XP Embedded, which are descendants of Windows, and the several *BSDs and Linux distributions tailored for embedded systems. OpenVMS, from Hewlett-Packard (formerly DEC), is still under active development.


Older operating systems which are still used in niche markets include OS/2 from IBM; Mac OS, the non-Unix precursor to Apple's Mac OS X; BeOS; XTS-300.

Popular prior to the Dot COM era, operating systems such as AmigaOS and RISC OS continue to be developed as minority platforms for enthusiast communities and specialist applications.

Research and development of new operating systems continues. GNU Hurd is designed to be backwards compatible with Unix, but with enhanced functionality and a microkernel architecture. Singularity is a project at Microsoft Research to develop an operating system with better memory protection based on the .Net managed code model.

Unix

Unix (officially trademarked as UNIX®) is a computer operating system originally developed in 1969 by a group of AT&T employees at Bell Labs including Ken Thompson, Dennis Ritchie and Douglas McIlroy. Today's Unix systems are split into various branches, developed over time by AT&T as well as various commercial vendors and non-profit organizations.

As of 2007, the owner of the trademark UNIX® is The Open Group, an industry standards consortium. Only systems fully compliant with and certified to the Single UNIX Specification qualify as "UNIX®" (others are called "Unix system-like" or "Unix-like").

During the late 1970s and early 1980s, Unix's influence in academic circles led to large-scale adoption of Unix (particularly of the BSD variant, originating from the University of California, Berkeley) by commercial startups, the most notable of which is Sun Microsystems. Today, in addition to certified Unix systems, Unix-like operating systems such as Linux and BSD derivatives are commonly encountered.

Sometimes, "traditional Unix" may be used to describe a Unix or an operating system that has the characteristics of either Version 7 Unix or UNIX System V.

[] Overview

Unix operating systems are widely used in both servers and workstations. The Unix environment and the client-server program model were essential elements in the development of the Internet and the reshaping of computing as centered in networks rather than in individual computers.


Both Unix and the C programming language were developed by AT&T and distributed to government and academic institutions, causing both to be ported to a wider variety of machine families than any other operating system. As a result, Unix became synonymous with "open systems".

Unix was designed to be portable, multi-tasking and multi-user in a time-sharing configuration. Unix systems are characterized by various concepts: the use of plain text for storing data; a hierarchical file system; treating devices and certain types of inter-process communication (IPC) as files; and the use of a large number of small programs that can be strung together through a command line interpreter using pipes, as opposed to using a single monolithic program that includes all of the same functionality. These concepts are known as the Unix philosophy.
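As a small illustration of stringing small programs together with a pipe (assuming a Unix-like system with the ls and wc utilities installed), driven here from Python: the output of "ls" becomes the input of "wc -l", counting the entries in the current directory.

import subprocess

ls = subprocess.Popen(["ls"], stdout=subprocess.PIPE)
wc = subprocess.Popen(["wc", "-l"], stdin=ls.stdout, stdout=subprocess.PIPE)
ls.stdout.close()                                  # let ls receive SIGPIPE if wc exits first
print(wc.communicate()[0].decode().strip())        # number of directory entries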

Under Unix, the "operating system" consists of many of these utilities along with the master control program, the kernel. The kernel provides services to start and stop programs, handle the file system and other common "low level" tasks that most programs share, and, perhaps most importantly, schedules access to hardware to avoid conflicts if two programs try to access the same resource or device simultaneously. To mediate such access, the kernel was given special rights on the system, leading to the division between user-space and kernel-space.

The microkernel tried to reverse the growing size of kernels and return to a system in which most tasks were completed by smaller utilities. In an era when a "normal" computer consisted of a hard disk for storage and a data terminal for input and output (I/O), the Unix file model worked quite well as most I/O was "linear". However, modern systems include networking and other new devices. Describing a graphical user interface driven by mouse control in an "event driven" fashion didn't work well under the old model. Work on systems supporting these new devices in the 1980s led to facilities for non-blocking I/O, forms of inter-process communications other than just pipes, as well as moving functionality such as network protocols out of the kernel.

[] History

A partial list of simultaneously running processes on a Unix system.

In the 1960s, the Massachusetts Institute of Technology, AT&T Bell Labs, and General Electric worked on an experimental operating system called Multics (Multiplexed Information and Computing Service), which was designed to run on the GE-645 mainframe computer. The aim was the creation of a commercial product, although this was never a great success. Multics was an interactive operating system with many novel capabilities, including enhanced security. The project did develop production releases, but initially these releases performed poorly.


AT&T Bell Labs pulled out and deployed its resources elsewhere. One of the developers on the Bell Labs team, Ken Thompson, continued to develop for the GE-645 mainframe, and wrote a game for that computer called Space Travel.[1] However, he found that the game was too slow on the GE machine and was expensive, costing $75 per execution in scarce computing time.[2]

Thompson thus re-wrote the game in assembly language for Digital Equipment Corporation's PDP-7 with help from Dennis Ritchie. This experience, combined with his work on the Multics project, led Thompson to start a new operating system for the PDP-7. Thompson and Ritchie led a team of developers, including Rudd Canaday, at Bell Labs developing a file system as well as the new multi-tasking operating system itself. They included a command line interpreter and some small utility programs.[3]

Editing a shell script using the ed editor. The dollar-sign at the top of the screen is the prompt printed by the shell. 'ed' is typed to start the editor, which takes over from that point on the screen downwards.

[] 1970s

In 1970 the project was named Unics, and could - eventually - support two simultaneous users. Brian Kernighan invented this name as a contrast to Multics; the spelling was later changed to Unix.

Up until this point there had been no financial support from Bell Labs. When the Computer Science Research Group wanted to use Unix on a much larger machine than the PDP-7, Thompson and Ritchie managed to trade the promise of adding text processing capabilities to Unix for a PDP-11/20 machine. This led to some financial support from Bell. For the first time in 1970, the Unix operating system was officially named and ran on the PDP-11/20. It added a text formatting program called roff and a text editor. All three were written in PDP-11/20 assembly language. Bell Labs used this initial "text processing system", made up of Unix, roff, and the editor, for text processing of patent applications. Roff soon evolved into troff, the first electronic publishing program with a full typesetting capability. The UNIX Programmer's Manual was published on November 3, 1971.

In 1973, Unix was rewritten in the C programming language, contrary to the general notion at the time "that something as complex as an operating system, which must deal with time-critical events, had to be written exclusively in assembly language" [4]. The migration from assembly language to the higher-level language C resulted in much more portable software, requiring only a relatively small amount of machine-dependent code to be replaced when porting Unix to other computing platforms.

AT&T made Unix available to universities and commercial firms, as well as the United States government, under licenses. The licenses included all source code, including the machine-dependent parts of the kernel, which were written in PDP-11 assembly code. Copies of the annotated Unix kernel sources circulated widely in the late 1970s in the form of a much-copied book by John Lions of the University of New South Wales, the Lions' Commentary on UNIX 6th Edition, with Source Code, which led to considerable use of Unix as an educational example.

Versions of the Unix system were determined by editions of its user manuals, so that (for example) "Fifth Edition UNIX" and "UNIX Version 5" have both been used to designate the same thing. Development expanded, with Versions 4, 5, and 6 being released by 1975. These versions added the concept of pipes, leading to the development of a more modular code-base, increasing development speed still further. Version 5 and especially Version 6 led to a plethora of different Unix versions both inside and outside Bell Labs, including PWB/UNIX, IS/1 (the first commercial Unix), and the University of Wollongong's port to the Interdata 7/32 (the first non-PDP Unix).

In 1978, UNIX/32V, for the VAX system, was released. By this time, over 600 machines were running Unix in some form. Version 7 Unix, the last version of Research Unix to be released widely, was released in 1979. Versions 8, 9 and 10 were developed through the 1980s but were only released to a few universities, though they did generate papers describing the new work. This research led to the development of Plan 9 from Bell Labs, a new portable distributed system.

[] 1980s

A late-80s style Unix desktop running the X Window System graphical user interface. Shown are a number of client applications common to the MIT X Consortium's distribution, including Tom's Window Manager, an X Terminal, Xbiff, xload, and a graphical manual page browser.

AT&T now licensed UNIX System III, based largely on Version 7, for commercial use, the first version launching in 1982. This also included support for the VAX. AT&T continued to issue licenses for older Unix versions. To end the confusion between all its differing internal versions, AT&T combined them into UNIX System V Release 1. This introduced a few features such as the vi editor and curses from the Berkeley Software Distribution of Unix developed at the University of California, Berkeley. This also included support for the Western Electric 3B series of machines.

Since the newer commercial UNIX licensing terms were not as favorable for academic use as the older versions of Unix, the Berkeley researchers continued to develop BSD Unix as an alternative to UNIX System III and V, originally on the PDP-11 architecture (the 2.xBSD releases, ending with 2.11BSD) and later for the VAX-11 (the 4.x BSD releases). Many contributions to Unix first appeared on BSD systems, notably the C shell with job control (modelled on ITS). Perhaps the most important aspect of the BSD development effort was the addition of TCP/IP network code to the mainstream Unix kernel. The BSD effort produced several significant releases that contained network code: 4.1cBSD, 4.2BSD, 4.3BSD, 4.3BSD-Tahoe ("Tahoe" being the nickname of the Computer Consoles Inc. Power 6/32 architecture that was the first non-DEC release of the BSD kernel), Net/1, 4.3BSD-Reno (to match the "Tahoe" naming, and that the release was something of a gamble), Net/2, 4.4BSD, and 4.4BSD-lite. The network code found in these releases is the ancestor of much TCP/IP network code in use today, including code that was later released in AT&T System V UNIX and early versions of Microsoft Windows. The accompanying Berkeley Sockets API is a de facto standard for networking APIs and has been copied on many platforms.

Other companies began to offer commercial versions of the UNIX System for their own mini-computers and workstations. Most of these new Unix flavors were developed from the System V base under a license from AT&T; however, others were based on BSD instead. One of the leading developers of BSD, Bill Joy, went on to co-found Sun Microsystems in 1982 and create SunOS (now Solaris) for their workstation computers. In 1980, Microsoft announced its first Unix for 16-bit microcomputers called Xenix, which the Santa Cruz Operation (SCO) ported to the Intel 8086 processor in 1983, and eventually branched Xenix into SCO UNIX in 1989.

For a few years during this period (before PC compatible computers with MS-DOS became dominant), industry observers expected that UNIX, with its portability and rich capabilities, was likely to become the industry standard operating system for microcomputers.[5] In 1984 several companies established the X/Open consortium with the goal of creating an open system specification based on UNIX. Despite early progress, the standardization effort collapsed into the "Unix wars," with various companies forming rival standardization groups. The most successful Unix-related standard turned out to be the IEEE's POSIX specification, designed as a compromise API readily implemented on both BSD and System V platforms, published in 1988 and soon mandated by the United States government for many of its own systems.

AT&T added various features into UNIX System V, such as file locking, system administration, streams, new forms of IPC, the Remote File System and TLI. AT&T cooperated with Sun Microsystems and between 1987 and 1989 merged features from Xenix, BSD, SunOS, and System V into System V Release 4 (SVR4), independently of X/Open. This new release consolidated all the previous features into one package, and heralded the end of competing versions. It also increased licensing fees.

During this time a number of vendors including Digital Equipment, Sun, Addamax and others began building trusted versions of UNIX for high security applications, mostly designed for military and law enforcement applications.

The Common Desktop Environment or CDE, a graphical desktop for Unix co-developed in the 1990s by HP, IBM, and Sun as part of the COSE initiative.


[] 1990s

In 1990, the Open Software Foundation released OSF/1, their standard Unix implementation, based on Mach and BSD. The Foundation was started in 1988 and was funded by several Unix-related companies that wished to counteract the collaboration of AT&T and Sun on SVR4. Subsequently, AT&T and another group of licensees formed the group "UNIX International" in order to counteract OSF. This escalation of conflict between competing vendors gave rise again to the phrase "Unix wars".

In 1991, a group of BSD developers (Donn Seeley, Mike Karels, Bill Jolitz, and Trent Hein) left the University of California to found Berkeley Software Design, Inc (BSDI). BSDI produced a fully functional commercial version of BSD Unix for the inexpensive and ubiquitous Intel platform, which started a wave of interest in the use of inexpensive hardware for production computing. Shortly after it was founded, Bill Jolitz left BSDI to pursue distribution of 386BSD, the free software ancestor of FreeBSD, OpenBSD, and NetBSD.

By 1993 most commercial vendors had changed their variants of Unix to be based on System V with many BSD features added on top. The creation of the COSE initiative that year by the major players in Unix marked the end of the most notorious phase of the Unix wars, and was followed by the merger of UI and OSF in 1994. The new combined entity, which retained the OSF name, stopped work on OSF/1 that year. By that time the only vendor using it was Digital, which continued its own development, rebranding their product Digital UNIX in early 1995.

Shortly after UNIX System V Release 4 was produced, AT&T sold all its rights to UNIX® to Novell. (Dennis Ritchie likened this to the Biblical story of Esau selling his birthright for the proverbial "mess of pottage".[6]) Novell developed its own version, UnixWare, merging its NetWare with UNIX System V Release 4. Novell tried to use this to battle against Windows NT, but their core markets suffered considerably.

In 1993, Novell decided to transfer the UNIX® trademark and certification rights to the X/Open Consortium.[7] In 1996, X/Open merged with OSF, creating the Open Group. Various standards by the Open Group now define what is and what is not a "UNIX" operating system, notably the post-1998 Single UNIX Specification.

In 1995, the business of administering and supporting the existing UNIX licenses, plus the rights to further develop the System V code base, was sold by Novell to the Santa Cruz Operation.[1] Whether Novell also sold the copyrights is currently the subject of litigation (see below).

In 1997, Apple Computer sought out a new foundation for its Macintosh operating system and chose NEXTSTEP, an operating system developed by NeXT. The core operating system was renamed Darwin after Apple acquired it. It was based on the BSD family and the Mach kernel. The deployment of Darwin BSD Unix in Mac OS X makes it, according to a statement made by an Apple employee at a USENIX conference, the most widely used Unix-based system in the desktop computer market.

2000 to present

A modern Unix desktop environment (Solaris 10)

In 2000, SCO sold its entire UNIX business and assets to Caldera Systems, which later changed its name to The SCO Group. This new player then started legal action against various users and vendors of Linux. SCO has alleged that Linux contains copyrighted Unix code now owned by The SCO Group. Other allegations include trade-secret violations by IBM and contract violations by former Santa Cruz customers who have since converted to Linux. However, Novell disputed The SCO Group's claim to hold copyright on the UNIX source base. According to Novell, SCO (and hence The SCO Group) is effectively a franchise operator for Novell, which also retained the core copyrights, veto rights over future licensing activities of SCO, and 95% of the licensing revenue. The SCO Group disagreed with this, and the dispute resulted in the SCO v. Novell lawsuit.

In 2005, Sun Microsystems released the bulk of its Solaris system code (based on UNIX System V Release 4) into an open source project called OpenSolaris. New Sun OS technologies such as the ZFS file system are now first released as open source code via the OpenSolaris project; as of 2006 it has spawned several non-Sun distributions such as SchilliX, Belenix, Nexenta and MarTux.

The dot-com crash led to significant consolidation of Unix users as well. Of the many commercial flavors of Unix that were born in the 1980s, only Solaris, HP-UX, and AIX are still doing relatively well in the market, though SGI's IRIX persisted for quite some time. Of these, Solaris has the largest market share, and may be gaining popularity due to its feature set and the availability of an open-source version.[8]

Standards

Beginning in the late 1980s, an open operating system standardization effort now known as POSIX provided a common baseline for all operating systems; IEEE based POSIX around the common structure of the major competing variants of the Unix system, publishing the first POSIX standard in 1988. In the early 1990s a separate but very similar effort was started by an industry consortium, the Common Open Software Environment (COSE) initiative, which eventually became the Single UNIX Specification administered by The Open Group. Starting in 1998, The Open Group and the IEEE formed the Austin Group to provide a common definition of POSIX and the Single UNIX Specification.

In an effort towards compatibility, in 1999 several Unix system vendors agreed on SVR4's Executable and Linkable Format (ELF) as the standard for binary and object code files. The common format allows substantial binary compatibility among Unix systems operating on the same CPU architecture.
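
To make the idea of a common binary format concrete, the short C sketch below checks whether a given file begins with the four-byte ELF magic number (0x7F followed by the letters "ELF") that every ELF object and executable starts with. This is an illustrative example only; the file name is supplied by the user and nothing beyond the identification bytes is inspected.

    /* elfcheck.c: report whether a file starts with the ELF magic number.
       Hypothetical example; build with e.g. "cc elfcheck.c -o elfcheck". */
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s file\n", argv[0]);
            return 2;
        }
        FILE *f = fopen(argv[1], "rb");
        if (f == NULL) {
            perror(argv[1]);
            return 2;
        }
        unsigned char ident[4];
        size_t n = fread(ident, 1, sizeof ident, f);
        fclose(f);

        /* Every ELF object or executable begins with 0x7F 'E' 'L' 'F'. */
        if (n == sizeof ident && ident[0] == 0x7f &&
            ident[1] == 'E' && ident[2] == 'L' && ident[3] == 'F') {
            printf("%s: ELF file\n", argv[1]);
            return 0;
        }
        printf("%s: not an ELF file\n", argv[1]);
        return 1;
    }

Run against /bin/ls on a typical Linux or Solaris system, a sketch like this would normally report an ELF file.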

The Filesystem Hierarchy Standard was created to provide a reference directory layout for Unix-like operating systems, particularly Linux. This type of standard remains controversial, however, and even within the Linux community its adoption is far from universal.

Components

See also: list of Unix programs

The Unix system is composed of several components that are normally packaged together. By including — in addition to the kernel of an operating system — the development environment, libraries, documents, and the portable, modifiable source code for all of these components, Unix was a self-contained software system. This was one of the key reasons it emerged as an important teaching and learning tool and had such a broad influence.

Inclusion of these components did not make the system large — the original V7 UNIX distribution, consisting of copies of all of the compiled binaries plus all of the source code and documentation, occupied less than 10 MB and arrived on a single 9-track magnetic tape. The printed documentation, typeset from the online sources, was contained in two volumes.

The names and filesystem locations of the Unix components have changed substantially across the history of the system. Nonetheless, the V7 implementation is considered by many to have the canonical early structure:

Kernel — source code in /usr/sys, composed of several sub-components:

o conf — configuration and machine-dependent parts, including boot code

o dev — device drivers for control of hardware (and some pseudo-hardware)

o sys — operating system "kernel", handling memory management, process scheduling, system calls, etc.

o h — header files, defining key structures within the system and important system-specific invariables

Development Environment — Early versions of Unix contained a development environment sufficient to recreate the entire system from source code:

o cc — C language compiler (first appeared in V3 Unix)

o as — machine-language assembler for the machine

o ld — linker, for combining object files

o lib — object-code libraries (installed in /lib or /usr/lib). libc, the system library with C run-time support, was the primary library, but there have always been additional libraries for such things as mathematical functions (libm) or database access. V7 Unix introduced the first version of the modern "Standard I/O" library stdio as part of the system library. Later implementations increased the number of libraries significantly.

o make — build manager (introduced in PWB/UNIX), for effectively automating the build process

o include — header files for software development, defining standard interfaces and system invariants

o Other languages — V7 Unix contained a Fortran-77 compiler, a programmable arbitrary-precision calculator (bc, dc), and the awk "scripting" language; later versions and implementations contain many other language compilers and toolsets. Early BSD releases included Pascal tools, and many modern Unix systems also include the GNU Compiler Collection as well as or instead of a proprietary compiler system.

o Other tools — including an object-code archive manager (ar), symbol-table lister (nm), compiler-development tools (e.g. lex & yacc), and debugging tools.

Commands — Unix makes little distinction between commands (user-level programs) for system operation and maintenance (e.g. cron), commands of general utility (e.g. grep), and more general-purpose applications such as the text formatting and typesetting package. Nonetheless, some major categories are:

o sh — The "shell" programmable command-line interpreter, the primary user interface on Unix before window systems appeared, and even afterward (within a "command window").

o Utilities — the core tool kit of the Unix command set, including cp, ls, grep, find and many others. Subcategories include:

System utilities — administrative tools such as mkfs, fsck, and many others

User utilities — environment management tools such as passwd, kill, and others.

o Document formatting — Unix systems were used from the outset for document preparation and typesetting systems, and included many related programs such as nroff, troff, tbl, eqn, refer, and pic. Some modern Unix systems also include packages such as TeX and GhostScript.

o Graphics — The plot subsystem provided facilities for producing simple vector plots in a device-independent format, with device-specific interpreters to display such files. Modern Unix systems also generally include X11 as a standard windowing system and GUI, and many support OpenGL.

o Communications — Early Unix systems contained no inter-system communication, but did include the inter-user communication programs mail and write. V7 introduced the early inter-system communication system UUCP, and systems beginning with BSD release 4.1c included TCP/IP utilities.

The 'man' command can display a 'man page' for every command on the system, including itself.


Documentation — Unix was the first operating system to include all of its documentation online in machine-readable form. The documentation included:

o man — manual pages for each command, library component, system call, header file, etc.

o doc — longer documents detailing major subsystems, such as the C language and troff

Impact

The Unix system had significant impact on other operating systems.

It was written in a high-level language rather than assembly language (which had been thought necessary for systems implementation on early computers). Although this followed the lead of Multics and Burroughs, it was Unix that popularized the idea.

Unix had a drastically simplified file model compared to many contemporary operating systems, treating all kinds of files as simple byte arrays. The file system hierarchy contained machine services and devices (such as printers, terminals, or disk drives), providing a uniform interface, but at the expense of occasionally requiring additional mechanisms such as ioctl and mode flags to access features of the hardware that did not fit the simple "stream of bytes" model. The Plan 9 operating system pushed this model even further and eliminated the need for additional mechanisms.
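
A minimal C sketch of this split is shown below, assuming a POSIX system with a controlling terminal: the terminal device is read as an ordinary stream of bytes, while a property that does not fit the byte-stream model, the terminal's window size, is fetched through the ioctl escape hatch (TIOCGWINSZ is used here purely as an illustrative request).

    /* byte_stream.c: a device is read as a plain stream of bytes, while a
       feature outside that model (the window size) needs ioctl.
       Minimal sketch for a POSIX system; error handling is kept short. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/ioctl.h>

    int main(void)
    {
        char buf[128];
        /* Read from the terminal exactly as if it were a file of bytes. */
        ssize_t n = read(STDIN_FILENO, buf, sizeof buf - 1);
        if (n > 0)
            printf("read %d bytes from the terminal device\n", (int)n);

        /* The window size is not part of the byte stream, so ask via ioctl. */
        struct winsize ws;
        if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws) == 0)
            printf("terminal is %d columns by %d rows\n",
                   (int)ws.ws_col, (int)ws.ws_row);
        else
            perror("ioctl(TIOCGWINSZ)");
        return 0;
    }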

Unix also popularized the hierarchical file system with arbitrarily nested subdirectories, originally introduced by Multics. Other common operating systems of the era had ways to divide a storage device into multiple directories or sections, but they had a fixed number of levels, often only one level. Several major proprietary operating systems eventually added recursive subdirectory capabilities also patterned after Multics. DEC's RSX-11M's "group, user" hierarchy evolved into VMS directories, CP/M's volumes evolved into MS-DOS 2.0+ subdirectories, and HP's MPE group.account hierarchy and IBM's SSP and OS/400 library systems were folded into broader POSIX file systems.

Making the command interpreter an ordinary user-level program, with additional commands provided as separate programs, was another Multics innovation popularized by Unix. The Unix shell used the same language for interactive commands as for scripting (shell scripts — there was no separate job control language like IBM's JCL). Since the shell and OS commands were "just another program", the user could choose (or even write) his own shell. New commands could be added without changing the shell itself. Unix's innovative command-line syntax for creating chains of producer-consumer processes (pipelines) made a powerful programming paradigm (coroutines) widely available. Many later command-line interpreters have been inspired by the Unix shell.
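
Under the hood, a shell builds such a pipeline from a handful of system calls. The sketch below is illustrative only, with the fixed commands ls and grep standing in for whatever the user typed; it wires the two processes together roughly the way a Unix shell does for "ls | grep c".

    /* pipeline.c: how a shell might connect "ls | grep c".
       Illustrative sketch; a real shell parses the command line and checks
       every call for errors. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void)
    {
        int fd[2];                      /* fd[0] = read end, fd[1] = write end */
        if (pipe(fd) == -1) { perror("pipe"); return 1; }

        if (fork() == 0) {              /* producer: ls writes into the pipe */
            dup2(fd[1], STDOUT_FILENO);
            close(fd[0]); close(fd[1]);
            execlp("ls", "ls", (char *)NULL);
            _exit(127);
        }
        if (fork() == 0) {              /* consumer: grep reads from the pipe */
            dup2(fd[0], STDIN_FILENO);
            close(fd[0]); close(fd[1]);
            execlp("grep", "grep", "c", (char *)NULL);
            _exit(127);
        }

        close(fd[0]); close(fd[1]);     /* parent holds neither end open */
        while (wait(NULL) > 0)          /* reap both children */
            ;
        return 0;
    }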

A fundamental simplifying assumption of Unix was its focus on ASCII text for nearly all file formats. There were no "binary" editors in the original version of Unix — the entire system was configured using textual shell command scripts. The common denominator in the I/O system is the byte — unlike the "record-based" file systems of other computers. The focus on text for representing nearly everything made Unix pipes especially useful, and encouraged the development of simple, general tools that could be easily combined to perform more complicated ad hoc tasks. The focus on text and bytes made the system far more scalable and portable than other systems. Over time, text-based applications have also proven popular in areas such as printing languages (PostScript), and at the application layer of the Internet protocols, e.g. Telnet, FTP, SSH, SMTP, HTTP and SIP.

Unix popularized a syntax for regular expressions that found widespread use. The Unix programming interface became the basis for a widely implemented operating system interface standard (POSIX, see above).
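
That syntax survives in the POSIX regular-expression API available on modern Unix systems. The small C sketch below, with an arbitrary example pattern and test strings, compiles a pattern with regcomp and tests candidate strings with regexec.

    /* regex_demo.c: POSIX regular expressions from C.
       The pattern and sample strings are arbitrary examples. */
    #include <regex.h>
    #include <stdio.h>

    int main(void)
    {
        regex_t re;
        /* Extended syntax: one or more digits followed by "BSD". */
        if (regcomp(&re, "[0-9]+BSD", REG_EXTENDED | REG_NOSUB) != 0) {
            fprintf(stderr, "regcomp failed\n");
            return 1;
        }

        const char *samples[] = { "386BSD", "System V", "4.4BSD-Lite" };
        for (int i = 0; i < 3; i++)
            printf("%-12s %s\n", samples[i],
                   regexec(&re, samples[i], 0, NULL, 0) == 0 ? "matches"
                                                             : "no match");
        regfree(&re);
        return 0;
    }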

The C programming language soon spread beyond Unix, and is now ubiquitous in systems and applications programming.

Early Unix developers were important in bringing the theory of modularity and reusability into software engineering practice, spawning a "Software Tools" movement.

Unix provided the TCP/IP networking protocol on relatively inexpensive computers, which contributed to the Internet explosion of world-wide real-time connectivity, and which formed the basis for implementations on many other platforms. (This also exposed numerous security holes in the networking implementations.)

The Unix policy of extensive on-line documentation and (for many years) ready access to all system source code raised programmer expectations, contributing to the Open Source movement.

Over time, the leading developers of Unix (and programs that ran on it) evolved a set of cultural norms for developing software, norms which became as important and influential as the technology of Unix itself; this has been termed the Unix philosophy.

2038

Main article: Year 2038 problem

Unix stores system time values as the number of seconds since midnight on January 1, 1970 (the "Unix Epoch") in variables of type time_t, historically defined as a signed 32-bit integer. On January 19, 2038, the current time value will roll over from a zero followed by 31 ones (01111111111111111111111111111111) to a one followed by 31 zeros (10000000000000000000000000000000); because that toggles the sign bit, the reported time will jump back to the year 1901 or 1970, depending on the implementation. As many applications use OS library routines for date calculations, the impact could be felt well before 2038; for instance, 30-year mortgages may be calculated incorrectly beginning in the year 2008.


Since times before 1970 are rarely represented in Unix time, one possible solution that is compatible with existing binary formats would be to redefine time_t as an unsigned 32-bit integer. However, such a kludge merely postpones the problem to February 7, 2106, and could introduce bugs in software that computes differences between two time values.

Some Unix versions have already addressed this. For example, in Solaris on 64-bit systems, time_t is 64 bits long, meaning that the OS itself and 64-bit applications will correctly handle dates for some 292 billion years (several times greater than the age of the universe). Existing 32-bit applications using a 32-bit time_t continue to work on 64-bit Solaris systems but are still prone to the 2038 problem.
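
The arithmetic behind the rollover can be demonstrated directly by pushing a 32-bit signed counter one second past its maximum value. The sketch below is illustrative only: it uses an explicit int32_t rather than the platform's real time_t, so the wrap is visible even on systems whose time_t is already 64 bits wide.

    /* y2038.c: the 32-bit time_t rollover, made visible with an explicit
       int32_t so it can be shown even where time_t is 64 bits. */
    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    static void show(const char *label, time_t t)
    {
        struct tm *tm = gmtime(&t);
        if (tm != NULL)
            printf("%s %s", label, asctime(tm));
        else
            printf("%s (not representable here)\n", label);
    }

    int main(void)
    {
        int32_t last = INT32_MAX;                    /* 03:14:07 UTC, 19 Jan 2038 */
        int32_t next = (int32_t)((int64_t)last + 1); /* wraps to INT32_MIN on
                                                        two's-complement machines */

        show("last 32-bit second:", (time_t)last);
        show("one second later  :", (time_t)next);   /* lands back in December 1901 */
        printf("raw counter value : %d -> %d\n", (int)last, (int)next);
        return 0;
    }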

Free Unix-like operating systems

Linux is a modern Unix-like system

In 1983, Richard Stallman announced the GNU project, an ambitious effort to create a free software Unix-like system; "free" in that everyone who received a copy would be free to use, study, modify, and redistribute it. GNU's goal of a complete free system was effectively reached in 1992: GNU's own kernel project, GNU Hurd, had not produced a working kernel, but a compatible kernel called Linux was released as free software in 1992 under the GNU General Public License. The combination of the two is frequently referred to simply as "Linux", although the Free Software Foundation and some Linux distributions, such as Debian GNU/Linux, use the combined term GNU/Linux. Work on GNU Hurd continues, although very slowly.

In addition to their use in the Linux operating system, many GNU packages — such as the GNU Compiler Collection (and the rest of the GNU toolchain), the GNU C library and the GNU core utilities — have gone on to play central roles in other free Unix systems as well.

Linux distributions, comprising Linux and large collections of compatible software, have become popular both with hobbyists and in business. Popular distributions include Red Hat Enterprise Linux, SUSE Linux, Mandriva Linux, Fedora, Ubuntu, Debian GNU/Linux, Slackware Linux and Gentoo.

A free derivative of BSD Unix, 386BSD, was also released in 1992 and led to the NetBSD and FreeBSD projects. With the 1994 settlement of a lawsuit that UNIX Systems Laboratories brought against the University of California and Berkeley Software Design Inc. (USL v. BSDi), it was clarified that Berkeley had the right to distribute BSD Unix — for free, if it so desired. Since then, BSD Unix has been developed in several different directions, including the OpenBSD and DragonFly BSD variants.

Linux and the BSDs are now rapidly occupying the market traditionally held by proprietary Unix operating systems, as well as expanding into new markets such as the


consumer desktop and mobile and embedded devices. A measure of this success can be seen in Apple Computer's incorporation of BSD into its Macintosh operating system by way of NEXTSTEP. Due to the modularity of the Unix design, sharing bits and pieces is relatively common; consequently, most or all Unix and Unix-like systems include at least some BSD code, and modern BSDs also typically include some GNU utilities in their distributions, so Apple's combination of parts from NeXT and FreeBSD with Mach and some GNU utilities has precedent.

In 2005, Sun Microsystems released the bulk of the source code to the Solaris operating system, a System V variant, under the name OpenSolaris, making it the first actively developed commercial Unix system to be open sourced (several years earlier, Caldera had released many of the older Unix systems under an educational and later BSD license). As a result, a great deal of formerly proprietary AT&T/USL code is now freely available.

Branding

See also: list of Unix systems

In October 1993, Novell, the company that owned the rights to the Unix System V source at the time, transferred the trademarks of Unix to the X/Open Company (now The Open Group),[9] and in 1995 sold the related business operations to Santa Cruz Operation.[10] Whether Novell also sold the copyrights to the actual software is currently the subject of litigation in a federal lawsuit, SCO v. Novell. Unix vendor SCO Group Inc. accused Novell of slander of title.

The present owner of the trademark UNIX® is The Open Group, an industry standards consortium. Only systems fully compliant with and certified to the Single UNIX Specification qualify as "UNIX®" (others are called "Unix system-like" or "Unix-like"). The term UNIX is not an acronym, but follows the early convention of naming computer systems in capital letters, such as ENIAC and MISTIC.

By decree of The Open Group, the term "UNIX®" refers more to a class of operating systems than to a specific implementation of an operating system; those operating systems which meet The Open Group's Single UNIX Specification should be able to bear the UNIX® 98 or UNIX® 03 trademarks today, after the operating system's vendor pays a fee to The Open Group. Systems licensed to use the UNIX® trademark include AIX, HP-UX, IRIX, Solaris, Tru64, A/UX, Mac OS X 10.5 on Intel platforms[11], and a part of z/OS.

Sometimes a representation like "Un*x", "*NIX", or "*N?X" is used to indicate all operating systems similar to Unix. This comes from the use of the "*" and "?" characters as "wildcard" characters in many utilities. This notation is also used to describe other Unix-like systems, e.g. Linux, FreeBSD, etc., that have not met the requirements for UNIX® branding from the Open Group.


The Open Group requests that "UNIX®" always be used as an adjective followed by a generic term such as "system" to help avoid the creation of a genericized trademark.

The term "Unix" is also used, and in fact was the original capitalisation, but the name UNIX stuck because, in the words of Dennis Ritchie "when presenting the original Unix paper to the third Operating Systems Symposium of the American Association for Computing Machinery, we had just acquired a new typesetter and were intoxicated by being able to produce small caps" (quoted from the Jargon File, version 4.3.3, 20 September 2002). Additionally, it should be noted that many of the operating system's predecessors and contemporaries used all-uppercase lettering, because many computer terminals of the time could not produce lower-case letters, so many people wrote the name in upper case due to force of habit.

Several plural forms of Unix are used to refer to multiple brands of Unix and Unix-like systems. Most common is the conventional "Unixes", but the hacker culture which created Unix has a penchant for playful use of language, and "Unices" (treating Unix as a Latin noun of the third declension) is also popular. The Anglo-Saxon plural form "Unixen" is not common, although occasionally seen.

Trademark names can be registered by different entities in different countries, and trademark laws in some countries allow the same trademark name to be controlled by two different entities if each entity uses the trademark in easily distinguishable categories. The result is that Unix has been used as a brand name for various products including bookshelves, ink pens, bottled glue, diapers, hair driers and food containers.[2]

Microsoft Windows

Microsoft Windows is the name of several families of software operating systems by Microsoft. Microsoft first introduced an operating environment named Windows in November 1985 as an add-on to MS-DOS in response to the growing interest in graphical user interfaces (GUIs).[1] Microsoft Windows eventually came to dominate the world's personal computer market, overtaking OS/2 and Mac OS, which had been introduced previously. At the 2004 IDC Directions conference, IDC Vice President Avneesh Saxena stated that Windows had approximately 90% of the client operating system market.[2] The current client version of Windows is Windows Vista, in its various editions. The current server version is Windows Server 2003, in its various editions, with Windows Server 2008 already in beta.

Versions

See also: List of Microsoft Windows versions


The term Windows collectively describes any or all of several generations of Microsoft (MS) operating system (OS) products. These products are generally categorized as follows:

16-bit operating environments

The box art of Windows 1.0, the first version Microsoft released to the public.

The early versions of Windows were often thought of as just graphical user interfaces, mostly because they ran on top of MS-DOS and used it for file system services.[citation needed]

However, even the earliest 16-bit Windows versions already took on many typical operating system functions, notably having their own executable file format and providing their own device drivers (timer, graphics, printer, mouse, keyboard and sound) for applications. Unlike MS-DOS, Windows allowed users to execute multiple graphical applications at the same time, through cooperative multitasking. Finally, Windows implemented an elaborate, segment-based software virtual memory scheme which allowed it to run applications larger than available memory: code segments and resources were swapped in and thrown away when memory became scarce, and data segments were moved in memory when a given application had relinquished processor control, typically while waiting for user input.[citation needed] 16-bit Windows versions include Windows 1.0 (1985), Windows 2.0 (1987) and its close relative Windows/286.

Hybrid 16/32-bit operating environments

A classic Windows logo, used from the early 1990s to 1999.

Windows/386 introduced a 32-bit protected mode kernel and virtual machine monitor. For the duration of a Windows session, it created one or more virtual 8086 environments and provided device virtualization for the video card, keyboard, mouse, timer and interrupt controller inside each of them. The user-visible consequence was that it became possible to preemptively multitask multiple MS-DOS environments in separate windows (graphical applications required switching the window to full screen mode). Windows applications were still multitasked cooperatively inside one such real-mode environment.

Windows 3.0 (1990) and Windows 3.1 (1992) improved the design, mostly because of virtual memory and loadable virtual device drivers (VxDs) which allowed them to share arbitrary devices between multitasked DOS windows.[citation needed] Because of this, Windows applications could now run in 16-bit protected mode (when Windows was running in Standard or 386 Enhanced Mode), which gave them access to several megabytes of memory and removed the obligation to participate in the software virtual memory scheme. They still ran inside the same address space, where the segmented memory provided a degree of protection, and multi-tasked cooperatively. For Windows 3.0, Microsoft also rewrote critical operations from C into assembly, making this release faster and less memory-hungry than its predecessors.[citation needed]

Hybrid 16/32-bit operating systems

The Windows logo that was used from late 1999 to 2001.

With the introduction of 32-bit file access in Windows for Workgroups 3.11, Windows could finally stop relying on DOS for file management.[citation needed] Leveraging this, Windows 95 introduced Long File Names and reduced MS-DOS to the role of a boot loader. MS-DOS was now bundled with Windows; this notably made it (partially) aware of long file names when its utilities were run from within Windows. The most important novelty was the possibility of running 32-bit multi-threaded preemptively multitasked graphical programs. However, the necessity of keeping compatibility with 16-bit programs meant the GUI components were still 16-bit only and not fully reentrant, which resulted in reduced performance and stability.

There were three releases of Windows 95 (the first in 1995, then subsequent bug-fix versions in 1996 and 1997, only released to OEMs, which added extra features such as FAT32 support). Microsoft's next OS was Windows 98; there were two versions of this (the first in 1998 and the second, named "Windows 98 Second Edition", in 1999).[citation needed] In 2000, Microsoft released Windows Me (Me standing for Millennium Edition), which used the same core as Windows 98 but adopted the visual appearance of Windows 2000, as well as a new feature called System Restore, allowing the user to set the computer's settings back to an earlier date.[citation needed] It was not a very well-received implementation, and many user problems occurred.[citation needed] Windows Me was considered a stopgap until the day the two product lines could be seamlessly merged.[citation needed] Microsoft left little time for Windows Me to become popular before announcing their next version of Windows, which would be called Windows XP.

32-bit operating systems

The Windows logo that was used from 2001 to November 2006.

This family of Windows systems was fashioned and marketed for higher-reliability business use, and was unencumbered by any Microsoft DOS patrimony.[citation needed] The first release was Windows NT 3.1 (1993, numbered "3.1" to match the Windows version and to one-up OS/2 2.1, IBM's flagship OS, which had been co-developed by Microsoft and was Windows NT's main competitor at the time), which was followed by NT 3.5 (1994), NT 3.51 (1995), and NT 4.0 (1996); NT 4.0 was the first in this line to implement the Windows 95 user interface. Microsoft then moved to combine their consumer and business operating systems. Their first attempt, Windows 2000, failed to meet their goals,[citation needed] and was released as a business system. The home consumer edition of Windows 2000, codenamed "Windows Neptune," ceased development and Microsoft released Windows Me in its place. However, MS-DOS still existed: the last version of MS-DOS, version 8.0, was released embedded in Windows Me, so the retirement of Windows Me also marked the end of MS-DOS. Eventually "Neptune" was merged into their new project, Whistler, which later became Windows XP. Since then, a new business system, Windows Server 2003, has expanded the top end of the range, and the newly released Windows Vista will complete it. Windows CE, Microsoft's offering in the mobile and embedded markets, is also a true 32-bit operating system.

64-bit operating systems

The current Windows logo

Windows NT included support for several different platforms before the x86-based personal computer became dominant in the professional world. Versions of NT from 3.1 to 4.0 supported DEC Alpha and MIPS R4000, which were 64-bit processors, although the operating system treated them as 32-bit processors.

With the introduction of the Intel Itanium architecture, which is referred to as IA-64, Microsoft released new versions of Windows 2000 to support it. Itanium versions of Windows XP and Windows Server 2003 were released at the same time as their mainstream x86 (32-bit) counterparts. On April 25, 2005, Microsoft released Windows XP Professional x64 Edition and x64 versions of Windows Server 2003 to support the AMD64/Intel64 (or x64 in Microsoft terminology) architecture. Microsoft dropped support for the Itanium version of Windows XP in 2005. The modern 64-bit Windows family comprises Windows XP Professional x64 Edition for AMD64/Intel64 systems, and Windows Server 2003, in both Itanium and x64 editions. Windows Vista is the first end-user version of Windows that Microsoft has released simultaneously in 32-bit and x64 editions. Windows Vista does not support the Itanium architecture.

History

Main article: History of Microsoft Windows

Microsoft has taken two parallel routes in operating systems. One route has been aimed at the home user and the other at the professional IT user. The dual route has generally meant that home versions had greater multimedia support but less functionality in networking and security, while professional versions had inferior multimedia support but better networking and security.


The first independent version of Microsoft Windows, version 1.0, released in November 1985, lacked a degree of functionality and achieved little popularity; it was intended to compete with Apple's own operating system.[citation needed] Windows 1.0 did not provide a complete operating system; rather, it extended MS-DOS. Microsoft Windows version 2.0 was released in November 1987 and was slightly more popular than its predecessor. Windows 2.03 (released January 1988) changed the OS from tiled windows to overlapping windows. The result of this change led to Apple Computer filing a suit against Microsoft alleging infringement of Apple's copyrights.[citation needed]

A Windows for Workgroups 3.11 desktop.

Microsoft Windows version 3.0, released in 1990, was the first Microsoft Windows version to achieve broad commercial success, selling 2 million copies in the first six months.[citation needed] It featured improvements to the user interface and to multitasking capabilities. It received a facelift in Windows 3.1, made generally available on March 1, 1992. Windows 3.1 support ended on December 31, 2001.[4]

In July 1993, Microsoft released Windows NT, based on a new kernel. NT was considered to be the professional OS and was the first Windows version to utilize preemptive multitasking.[citation needed] Windows NT and the DOS/9x-based Windows line would later be fused together to create Windows XP.

In August 1995, Microsoft released Windows 95, which made further changes to the user interface, and also used preemptive multitasking. Mainstream support for Windows 95 ended on December 31, 2000 and extended support for Windows 95 ended on December 31, 2001.[5]

The next in line was Microsoft Windows 98 released in June 1998. It was substantially criticized for its slowness and for its unreliability compared with Windows 95, but many of its basic problems were later rectified with the release of Windows 98 Second Edition in 1999.[citation needed] Mainstream support for Windows 98 ended on June 30, 2002 and extended support for Windows 98 ended on July 11, 2006.[6]

As part of its "professional" line, Microsoft released Windows 2000 in February 2000. The consumer version following Windows 98 was Windows Me (Windows Millennium Edition). Released in September 2000, Windows Me attempted to implement a number of new technologies for Microsoft: most notably publicized was "Universal Plug and Play." However, the OS was heavily criticized for its lack of compatibility and stability and it was even rated by PC World as the fourth worst product of all time.[7]

In October 2001, Microsoft released Windows XP, a version built on the Windows NT kernel that also retained the consumer-oriented usability of Windows 95 and its successors. This new version was widely praised in computer magazines.[8] It shipped in two distinct editions, "Home" and "Professional", the former lacking many of the superior security and networking features of the Professional edition. Additionally, the first "Media Center" edition was released in 2002[9], with an emphasis on support for DVD and TV functionality including program recording and a remote control. Mainstream support for Windows XP will continue until April 14, 2009 and extended support will continue until April 8, 2014.[10]

In April 2003, Windows Server 2003 was introduced, replacing the Windows 2000 line of server products with a number of new features and a strong focus on security; this was followed in December 2005 by Windows Server 2003 R2.

On January 30, 2007 Microsoft released Windows Vista. It contains a number of new features, from a redesigned shell and user interface to significant technical changes, with a particular focus on security features. It is available in a number of different editions, more than any previous version of Windows. It has been subject to several criticisms.

Security

The Windows Security Center was introduced with Windows XP Service Pack 2.

Security has been a hot topic with Windows for many years, and even Microsoft itself has been the victim of security breaches. Consumer versions of Windows were originally designed for ease of use on a single-user PC without a network connection, and did not have security features built in from the outset. Windows NT and its successors were designed for security (including on a network) and for multi-user PCs, but were not initially designed with Internet security in mind to the same degree, since Internet use was less prevalent when Windows NT was first developed in the early 1990s. These design issues, combined with flawed code (such as buffer overflows) and the popularity of Windows, mean that it is a frequent target of worm and virus writers. In June 2005, Bruce Schneier's Counterpane Internet Security reported that it had seen over 1,000 new viruses and worms in the previous six months.[11]

Microsoft releases security patches through its Windows Update service approximately once a month (usually the second Tuesday of the month), although critical updates are made available at shorter intervals when necessary.[12] In Windows 2000 (SP3 and later), Windows XP and Windows Server 2003, updates can be automatically downloaded and installed if the user chooses to do so. As a result, Service Pack 2 for Windows XP, as well as Service Pack 1 for Windows Server 2003, were installed by users more quickly than they otherwise might have been.[13]

Windows Defender

Windows Defender


On 6 January 2005, Microsoft released a beta version of Microsoft AntiSpyware, based upon the previously released Giant AntiSpyware. On 14 February 2006, Microsoft AntiSpyware became Windows Defender with the release of beta 2. Windows Defender is a freeware program designed to protect against spyware and other unwanted software. Windows XP and Windows Server 2003 users who have genuine copies of Microsoft Windows can freely download the program from Microsoft's web site, and Windows Defender ships as part of Windows Vista.[14]

Third-party analysis

In an article based on a report by Symantec,[15] internetnews.com has described Microsoft Windows as having the "fewest number of patches and the shortest average patch development time of the five operating systems it monitored in the last six months of 2006."[16] However, although the overall number of vulnerabilities found in MS Windows was lower than in the other operating systems, the number of vulnerabilities of high severity found in Windows was significantly greater—Windows: 12, Red Hat + Fedora: 2, Apple OS X: 1, HP-UX: 2, Solaris: 1.

A study conducted by Kevin Mitnick and marketing communications firm Avantgarde in 2004 found that an unprotected and unpatched Windows XP system with Service Pack 1 lasted only 4 minutes on the Internet before it was compromised, and an unprotected and unpatched Windows Server 2003 system was compromised after being connected to the Internet for 8 hours.[17] This study does not apply to Windows XP systems running the Service Pack 2 update (released in late 2004), which vastly improved the security of Windows XP; the computer that was running Windows XP Service Pack 2 was not compromised. The AOL National Cyber Security Alliance Online Safety Study of October 2004 determined that 80% of Windows users were infected by at least one spyware/adware product.[18] Much documentation is available describing how to increase the security of Microsoft Windows products. Typical suggestions include deploying Microsoft Windows behind a hardware or software firewall, running anti-virus and anti-spyware software, and installing patches as they become available through Windows Update.[citation needed]

Windows lifecycle policy

Microsoft has stopped releasing updates and hotfixes for many old Windows operating systems, including all versions of Windows 9x and earlier versions of Windows NT. Support for Windows 98, Windows 98 Second Edition and Windows Me ended on July 11, 2006, and Extended Support for Windows NT 4.0 ended on December 31, 2004. Security updates were also discontinued for Windows XP 64-bit Edition after the release of the more recent Windows XP Professional x64 Edition.[citation needed] However, most of the updates that Microsoft has released in the past can still be downloaded from the Windows Update Catalog.[citation needed]


Windows 2000 is currently in its Extended Support period, which will not end until July 13, 2010. Only security updates will be provided during Extended Support, meaning that no new service packs will be released for Windows 2000.

Emulation software

Emulation allows the use of some Windows applications without using Microsoft Windows. These include:

Wine - (Wine Is Not an Emulator) an almost-complete free software/open-source software implementation of the Windows API, allowing one to run most Windows applications on x86-based platforms, including GNU/Linux. Wine is technically not an emulator; an emulator effectively 'pretends' to be a different CPU, while Wine makes use of Windows-style APIs to 'simulate' the Windows environment directly.

CrossOver - A Wine package with licensed fonts. Its developers are regular contributors to Wine, and focus on Wine running officially supported applications.

Cedega - TransGaming Technologies' proprietary fork of Wine, which is designed specifically for running games written for Microsoft Windows under GNU/Linux.

ReactOS - An open-source OS that intends to run the same software as Windows, at an early alpha stage.

Darwine - This project intends to port and develop Wine as well as other supporting tools that will allow Darwin and Mac OS X users to run Microsoft Windows Applications, and to provide Win32 API compatibility at application source code level.

DOS

DOS (from Disk Operating System) commonly refers to the family of closely related operating systems which dominated the IBM PC compatible market between 1981 and 1995 (or until about 2000, if Windows 9x systems are included): DR-DOS, FreeDOS, MS-DOS, Novell-DOS, OpenDOS, PC-DOS, PTS-DOS, ROM-DOS and several others. They are single-user, single-task systems. MS-DOS from Microsoft was the most widely used. These operating systems ran on IBM PC type hardware using the Intel x86 CPUs or their compatible cousins from other makers. MS-DOS, itself inspired by CP/M, is still common today and was the foundation for many of Microsoft's operating systems, from Windows 1.0 through Windows Me.


History

MS-DOS (and the IBM PC-DOS which was licensed from it), and its predecessor, 86-DOS, were inspired by CP/M (Control Program / (for) Microcomputers), which was the dominant disk operating system for 8-bit Intel 8080 and Zilog Z80 based microcomputers. 86-DOS was first developed at Seattle Computer Products by Tim Paterson as a variant of CP/M-80 from Digital Research, but was intended as an internal product for testing SCP's new 8086 CPU card for the S-100 bus; it did not run on the 8080 (or compatible) CPU needed for CP/M-80. Microsoft bought it from SCP, allegedly for $50,000, made changes, and licensed the result to IBM (sold as PC-DOS) for its new 'PC' using the 8088 CPU (internally the same as the 8086), and to many other hardware manufacturers. In the latter case it was sold as MS-DOS.

Digital Research produced a compatible product known as "DR-DOS", which was eventually taken over (after a buyout of Digital Research) by Novell. This became "OpenDOS" for a while after the relevant division of Novell was sold to Caldera International, now called SCO. Later, the embedded division of Caldera was "spun off" as Lineo (later renamed Embedix), which in turn sold DR-DOS to a start-up called Device Logics, who now seem to call themselves DRDOS, Inc.

Only IBM-PCs were distributed with PC-DOS, whereas PC compatible computers from nearly all other manufacturers were distributed with MS-DOS. For the early years of this operating system family, PC-DOS was almost identical to MS-DOS.

Early versions of Microsoft Windows were little more than a graphical shell for DOS, and later versions of Windows were tightly integrated with MS-DOS. It is also possible to run DOS programs under OS/2 and Linux using virtual-machine emulators. Because of the long existence and ubiquity of DOS in the world of the PC-compatible platform, DOS was often considered to be the native operating system of the PC compatible platform.

There are alternative versions of DOS, such as FreeDOS and OpenDOS. FreeDOS appeared in 1994 in response to Microsoft Windows 95, which, unlike Windows 3.11, was no longer merely a shell and dispensed with stand-alone MS-DOS.[1]

Timeline

Microsoft bought non-exclusive rights for marketing 86-DOS in October 1980. In July 1981, Microsoft bought exclusive rights for 86-DOS (by now up to version 1.14) and renamed the operating system MS-DOS.

The first IBM branded version, PC-DOS 1.0, was released in August, 1981. It supported up to 640 kB of RAM [2] and four 160 kB 5.25" single sided floppy disks.

In May 1982, PC-DOS 1.1 added support for 320 kB double-sided floppy disks.


PC-DOS 2.0 and MS-DOS 2.0, released in March 1983, were the first versions to support the PC/XT and fixed disk drives (commonly referred to as hard disk drives). Floppy disk capacity was increased to 180 kB (single sided) and 360 kB (double sided) by using nine sectors per track instead of eight.

At the same time, Microsoft announced its intention to create a GUI for DOS. Its first version, Windows 1.0, was announced in November 1983, but was unfinished and did not interest IBM. By November 1985, the first finished version, Microsoft Windows 1.01, was released.

MS-DOS 3.0, released in September 1984, first supported 1.2 MB floppy disks and 32 MB hard disks. MS-DOS 3.1, released in November of that year, introduced network support.

MS-DOS 3.2, released in April 1986, was the first retail release of MS-DOS. It added support for 720 kB 3.5" floppy disks. Previous versions had been sold only to computer manufacturers who pre-loaded them on their computers, because operating systems were considered part of a computer, not an independent product.

MS-DOS 3.3, released in April 1987, featured logical disks. A physical disk could be divided into several partitions, considered as independent disks by the operating system. Support was also added for 1.44 MB 3.5" floppy disks.

MS-DOS 4.0, released in July 1988, supported disks up to 2 GB (disk sizes were typically 40-60 MB in 1988), and added a full-screen shell called DOSSHELL. Other shells, like Norton Commander and PCShell, already existed in the market. In November 1988, Microsoft addressed many bugs in a service release, MS-DOS 4.01.

MS-DOS 5.0, released in April 1991, included the full-screen BASIC interpreter QBasic, which also provided a full-screen text editor (previously, MS-DOS had only a line-based text editor, edlin). A disk cache utility SmartDrive, undelete capabilities, and other improvements were also included. It had severe problems with some disk utilities, fixed later in MS-DOS 5.01, released later in the same year.

In March 1992, Microsoft released Windows 3.1, which became the first popular version of Microsoft Windows, with more than 1,000,000 copies of the graphical user interface sold.

In March 1993, MS-DOS 6.0 was released. Following competition from Digital Research, Microsoft added a disk compression utility called DoubleSpace. At the time, typical hard disk sizes were about 200-400 MB, and many users badly needed more disk space. MS-DOS 6.0 also featured the disk defragmenter DEFRAG, backup program MSBACKUP, memory optimization with MEMMAKER, and rudimentary virus protection via MSAV.

As with versions 4.0 and 5.0, MS-DOS 6.0 turned out to be buggy. Due to complaints about loss of data, Microsoft released an updated version, MS-DOS 6.2, with an improved DoubleSpace utility, a new disk check utility, SCANDISK (similar to fsck from Unix), and other improvements.

The next version of MS-DOS, 6.21 (released March 1994), appeared due to legal problems. Stac Electronics sued Microsoft and forced it to remove DoubleSpace from their operating system.

In May 1994, Microsoft released MS-DOS 6.22, with another disk compression package, DriveSpace, licensed from VertiSoft Systems.

MS-DOS 6.22 was the last stand-alone version of MS-DOS available to the general public. MS-DOS was removed from marketing by Microsoft on November 30, 2001. See the Microsoft Licensing Roadmap.

Microsoft also released versions 6.23 to 6.25 for banks and American military organizations. These versions introduced FAT32 support. Since then, MS-DOS has existed only as a part of Microsoft Windows versions based on Windows 95 (Windows 98, Windows Me). The original release of Microsoft Windows 95 incorporates MS-DOS version 7.0.

IBM released its last commercial version of a DOS, IBM PC-DOS 7.0, in early 1995. It incorporated many new utilities such as anti-virus software, comprehensive backup programs, PCMCIA support, and DOS Pen extensions. Also added were new features to enhance available memory and disk space.

Accessing hardware under DOS

The operating system offers a hardware abstraction layer that supports the development of character-based applications, but it does not cover access to most other hardware, such as graphics cards, printers, or mice. This required programmers to access the hardware directly, resulting in each application having its own set of device drivers for each hardware peripheral. Hardware manufacturers would release specifications to ensure device drivers for popular applications were available.

DOS and other PC operating systems

Early versions of Microsoft Windows were shell programs that ran in DOS. Windows 3.11 extended the shell by going into protected mode and added 32-bit support; these versions were 16-bit/32-bit hybrids. Microsoft Windows 95 further reduced DOS to the role of the bootloader. Windows 98 and Windows Me were the last Microsoft operating systems to run on DOS. The DOS-based branch was eventually abandoned in favor of Windows NT, the first true 32-bit system, which became the foundation for Windows XP and Windows Vista.

Windows NT, initially NT OS/2 3.0, was the result of a collaboration between Microsoft and IBM to develop a 32-bit operating system that had high hardware and software portability. Because of the success of Windows 3.0, Microsoft changed the application programming interface to the extended Windows API, which caused a split between the two companies and a branch in the operating system. IBM would continue to work on OS/2 and the OS/2 API, while Microsoft renamed its operating system Windows NT.

Reserved device names under DOS

There are reserved device names in DOS that cannot be used as filenames regardless of extension; these restrictions also affect several Windows versions, in some cases causing crashes and security vulnerabilities.

A partial list of these reserved names is: NUL:, COM1: or AUX:, COM2:, COM3:, COM4:, CON:, LPT1: or PRN:, LPT2:, LPT3:, and CLOCK$.

More recent versions of both MS-DOS and IBM-DOS allow reserved device names without the trailing colon; e.g., PRN refers to PRN:.

The NUL filename redirects to a null file, similar in function to the UNIX device /dev/null. It is best suited for being used in batch command files to discard unneeded output. If NUL is copied to a file that already exists, it will truncate the target file; otherwise, a zero byte file will be created. (Thus, copy NUL foo is functionally similar to the UNIX commands cat </dev/null >foo and cp /dev/null foo.) Naming a file as NUL, regardless of extension, could cause unpredictable behavior in most applications. Well-designed applications will generate an error stating that NUL is a DOS reserved filename; others generate the file but whatever the program saves is lost; finally, some applications may hang or leave the computer in an inconsistent state, requiring a reboot.

Drive naming scheme

Main article: Drive letter assignment

Under Microsoft's DOS operating system and its derivatives, drives are referred to by identifying letters. Standard practice is to reserve "A" and "B" for floppy drives. On systems with only one floppy drive, DOS permits the use of both letters for that drive, and DOS will ask to swap disks. This permits copying from floppy to floppy, or having a program run from one floppy while having its data on another. Hard drives were originally assigned the letters "C" and "D". DOS could only support one active partition per drive. As support for more hard drives became available, this developed into a multi-pass scheme: first the active primary partition on each drive is assigned a letter, then a second pass over the drives allocates letters to the logical drives in the extended partitions, and a third pass gives the remaining non-active primary partitions their letters (assuming in each case that they exist and contain a DOS-readable file system). Lastly, DOS allocates letters for CD-ROMs, RAM disks and other hardware. Letter assignments usually occur in the order the drivers are loaded, but a driver can instruct DOS to assign a different letter.


An example is network drives, for which the drivers typically assign letters nearer the end of the alphabet.
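
A rough sketch of that ordering is shown below, using a made-up volume list and hypothetical data structures; real DOS derives this information from partition tables and loaded device drivers, but the pass order mirrors the description above: active primary partitions first, then logical drives, then the remaining primary partitions, and finally devices such as CD-ROMs and RAM disks.

    /* drive_letters.c: the DOS drive-letter assignment order, sketched.
       The volume list is made up; real DOS reads partition tables and
       queries device drivers instead. */
    #include <stdio.h>

    enum kind { ACTIVE_PRIMARY, LOGICAL, OTHER_PRIMARY, DEVICE };

    struct volume { const char *label; enum kind kind; };

    int main(void)
    {
        struct volume vols[] = {
            { "disk 1 active primary",  ACTIVE_PRIMARY },
            { "disk 1 logical drive 1", LOGICAL },
            { "disk 1 logical drive 2", LOGICAL },
            { "disk 2 active primary",  ACTIVE_PRIMARY },
            { "disk 2 other primary",   OTHER_PRIMARY },
            { "CD-ROM",                 DEVICE },
            { "RAM disk",               DEVICE },
        };
        int nvols = sizeof vols / sizeof vols[0];
        char letter = 'C';              /* A: and B: are reserved for floppies */

        /* Four passes, mirroring the order described in the text. */
        for (int pass = ACTIVE_PRIMARY; pass <= DEVICE; pass++)
            for (int i = 0; i < nvols; i++)
                if (vols[i].kind == (enum kind)pass)
                    printf("%c: %s\n", letter++, vols[i].label);
        return 0;
    }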

Because DOS applications use these drive letters directly (unlike the /dev directory in Unix-like systems), they can be disrupted by adding new hardware that needs a drive letter. An example is the addition of a new hard drive with a primary partition to a system whose original hard drive contains logical drives in extended partitions. As primary partitions have higher priority than logical drives, the new drive changes the existing letter assignments. Moreover, adding a new hard drive with only logical drives in an extended partition would still disrupt the letters of RAM disks and CD-ROM drives. This problem persisted through the 9x versions of Windows until NT, which preserves the letters of existing drives until the user changes them.

DOS emulators

Under Linux it is possible to run copies of DOS and many of its clones under DOSEMU, a Linux-native virtual machine for running real mode programs. There are a number of other emulators for running DOS under various versions of UNIX, even on non-x86 platforms, such as DOSBox.

DOS emulators are gaining popularity among Windows XP users because Windows XP is incompatible with pure DOS. They are used to play 'abandoned' games made for DOS. One of the most famous emulators is DOSBox, designed for game-playing on modern operating systems. Another emulator, ExDOS, is designed for business use. VDMSound is also popular on Windows XP for its GUI and sound support.

Microsoft Word

Microsoft Word is Microsoft's flagship word processing software. It was first released in 1983 under the name Multi-Tool Word for Xenix systems.[1] Versions were later written for several other platforms including IBM PCs running DOS (1983), the Apple Macintosh (1984), SCO UNIX, OS/2 and Microsoft Windows (1989). It is a component of the Microsoft Office system; however, it is also sold as a standalone product and included in Microsoft Works Suite. Beginning with the 2003 version, the branding was revised to emphasize Word's identity as a component within the Office suite: Microsoft began calling it Microsoft Office Word instead of merely Microsoft Word. The latest release is Word 2007.

History

Word 1981 to 1989


Many concepts and ideas of Word were brought from Bravo, the original GUI word processor developed at Xerox PARC. Bravo's creator Charles Simonyi left PARC to work for Microsoft in 1981. Simonyi hired Richard Brodie, who had worked with him on Bravo, away from PARC that summer.[2][3] On February 1, 1983, development on what was originally named Multi-Tool Word began.

Having renamed it Microsoft Word, Microsoft released the program on October 25, 1983, for the IBM PC. Free demonstration copies of the application were bundled with the November 1983 issue of PC World, making it the first program to be distributed on-disk with a magazine.[1] However, it was not well received, and sales lagged behind those of rival products such as WordPerfect.[citation needed]

Word featured a concept of "What You See Is What You Get", or WYSIWYG, and was the first application with such features as the ability to display bold and italic text on an IBM PC.[1] Word made full use of the mouse, which was so unusual at the time that Microsoft offered a bundled Word-with-Mouse package. Although MS-DOS was a character-based system, Microsoft Word was the first word processor for the IBM PC that showed actual line breaks and typeface markups such as bold and italics directly on the screen while editing, although this was not a true WYSIWYG system because available displays did not have the resolution to show actual typefaces. Other DOS word processors, such as WordStar and WordPerfect, used simple text-only displays with markup codes on the screen or sometimes, at most, alternative colors.[4]

As with most DOS software, each program had its own, often complicated, set of commands and nomenclature for performing functions that had to be learned. For example, in Word for MS-DOS, a file would be saved with the sequence Escape-T-S: pressing Escape called up the menu box, T accessed the set of options for Transfer and S was for Save (the only similar interface belonged to Microsoft's own Multiplan spreadsheet). As most secretaries had learned how to use WordPerfect, companies were reluctant to switch to a rival product that offered few advantages. Features desired for production typing that Word handled poorly, such as indentation before typing (emulating the F4 feature in WordPerfect), the ability to block text before typing in order to copy it (instead of picking up the mouse or blocking after typing), and a reliable way to have macros and other functions repeat the same behavior time after time, were just some of Word's shortcomings for production typists.

Word for Macintosh, despite the major differences in look and feel from the DOS version, was ported by Ken Shapiro with only minor changes from the DOS source code,[citation needed] which had been written with high-resolution displays and laser printers in mind although none were yet available to the general public. Following the introduction of LisaWrite and MacWrite, Word for Macintosh attempted to add closer WYSIWYG features into its package. After Word for Mac was released in 1985, it gained wide acceptance. There was no Word 2.0 for Macintosh; this was the first attempt to synchronize version numbers across platforms.

The second release of Word for Macintosh, named Word 3.0, was shipped in 1987. It included numerous internal enhancements and new features but was plagued with bugs.


Within a few months Word 3.0 was superseded by Word 3.01, which was much more stable. All registered users of 3.0 were mailed free copies of 3.01, making this one of Microsoft's most expensive mistakes up to that time. Word 4.0 was released in 1989.

[] Word 1990 to 1995

Microsoft Word 5.1a (Macintosh)

The first version of Word for Windows was released in 1989 at a price of 500 US dollars. With the release of Windows 3.0 the following year, sales began to pick up (Word for Windows 1.0 was designed for use with Windows 3.0, and its performance was poorer with the versions of Windows available when it was first released). The failure of WordPerfect to produce a Windows version proved a fatal mistake. It was version 2.0 of Word, however, that firmly established Microsoft Word as the market leader.[citation needed]

After MacWrite, Word for Macintosh never had any serious rivals, although programs such as Nisus Writer provided features such as non-contiguous selection which were not added until Word 2002 in Office XP. In addition, many users complained that major updates reliably came more than two years apart, too long for most business users at that time.

Word 5.1 for the Macintosh, released in 1992, was a popular word processor due to its elegance, relative ease of use, and feature set. However, version 6.0 for the Macintosh, released in 1994, was widely derided, unlike the Windows version. It was the first version of Word based on a common codebase between the Windows and Mac versions; many accused it of being slow, clumsy and memory intensive. The equivalent Windows version was also numbered 6.0 to coordinate product naming across platforms, despite the fact that the previous version was Word for Windows 2.0.

When Microsoft became aware of the Year 2000 problem, it made the DOS port of Microsoft Word 5.5 available as a free download instead of requiring people to pay for the update. As of March 2007, it is still available for download from Microsoft's web site.[5]

Microsoft Word 6.0 (Windows 98)

Word 6.0 was the second attempt to develop a common codebase version of Word. The first, code-named Pyramid, had been an attempt to completely rewrite the existing Word product. It was abandoned when it was determined that it would take the development team too long to rewrite and then catch up with all the new capabilities that could have been added in the same time without a rewrite. Proponents of Pyramid claimed it would have been faster, smaller, and more stable than the product that was eventually released for the Macintosh, which was compiled using a beta version of Visual C++ 2.0 targeting the Macintosh; as a result, many optimizations had to be turned off (version 4.2.1 of Office was compiled using the final version), and the Windows API simulation library included with the compiler was sometimes used.[1] Pyramid would have been truly cross-platform, with machine-independent application code and a small mediation layer between the application and the operating system.

More recent versions of Word for Macintosh are no longer ported versions of Word for Windows although some code is often appropriated from the Windows version for the Macintosh version.[citation needed]

Later versions of Word have more capabilities than just word processing. The Drawing tool allows simple desktop publishing operations such as adding graphics to documents. Collaboration, document comparison, multilingual support, translation and many other capabilities have been added over the years.[citation needed]

[] Word 97

Word 97 icon

Word 97 had the same general operating performance as later versions such as Word 2000. This was the first version of Word to feature the "Office Assistant", an animated helper used in all Office programs.

[] Word 2007

Main article: Microsoft Office 2007

Word 2007 is the most recent version of Word. This release includes numerous changes, including a new XML-based file format, a redesigned interface, an integrated equation editor, bibliographic management, and support for structured documents. It also has contextual tabs, which expose functionality specific to the object currently in focus, and many other features such as Live Preview (which shows the effect of a formatting choice in the document before it is applied), the Mini Toolbar, Super-tooltips, the Quick Access Toolbar, and SmartArt.

[] File formats

Although the familiar ".doc" extension has been used in many different versions of Word, it actually encompasses four distinct file formats:

1. Word for DOS
2. Word for Windows 1 and 2; Word 4 and 5 for Mac
3. Word 6 and Word 95; Word 6 for Mac
4. Word 97, 2000, 2002, and 2003; Word 98, 2001, and X for Mac


The newer ".docx" extension signifies Office Open XML and is used by Word 2007.

[] Binary formats and handling

Word document formats (.DOC) as of the early 2000s were a de facto standard of document file formats due to their popularity. Though usually just referred to as "Word document format", this term refers primarily to the range of formats used by default in Word versions 2 through 2003. In addition to the default Word binary formats, there are actually a number of optional alternate file formats that Microsoft has used over the years. Rich Text Format (RTF) was an early effort to create a format for interchanging formatted text between applications. RTF remains an optional format for Word that retains most formatting and all content of the original document. Later, after HTML appeared, Word supported an HTML derivative as an additional full-fidelity roundtrip format similar to RTF, with the additional capability that the file could be viewed in a web browser. Word 2007 uses the new Microsoft Office Open XML format as its default format, but retains the older Word 97–2003 format as an option. It also supports (for output only) the PDF and XPS formats.

The document formats of the various versions change in subtle and not-so-subtle ways; formatting created in newer versions does not always survive when viewed in older versions of the program, nearly always because that capability does not exist in the previous version. WordArt also changed drastically in a recent version, causing problems with documents that use it when moving in either direction between versions. The DOC format's specifications are not available for public download but can be received by writing to Microsoft directly and signing an agreement.[6]

Microsoft Word 95-2003 implemented OLE (Object Linking and Embedding) to manage the structure of its file format, easily identifiable by the .doc extension. OLE behaves rather like a conventional hard drive filesystem and is made up of several key components. Each Word document is composed of so-called "big blocks", which are almost always (but do not have to be) 512-byte chunks; hence a Word document's file size will always be a multiple of 512. "Storages" are analogues of directories on a disk drive and point to other storages or to "streams", which are similar to files on a disk. The text in a Word document is always contained in the "WordDocument" stream. The first big block in a Word document, known as the "header" block, provides important information as to the location of the major data structures in the document. "Property storages" provide metadata about the storages and streams in a .doc file, such as where each begins, its name, and so forth. The "File information block" contains information about where the text in a Word document starts and ends, what version of Word created the document, and so forth. Word documents are thus far more complex than one might initially expect, partly out of necessity and perhaps in part to make it harder for third parties to design interoperable applications.
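To make the block structure above more concrete, here is a minimal sketch (not a full parser) that checks the OLE compound-file signature of a legacy .doc file and derives the "big block" size from the header's sector-shift field; the file name is a placeholder.

import struct

OLE_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # magic bytes of an OLE compound file

def big_block_size(path):
    """Read the OLE header of a binary .doc file and return its sector ("big block") size."""
    with open(path, "rb") as f:
        header = f.read(512)                     # the fixed header occupies the first 512 bytes
    if header[:8] != OLE_SIGNATURE:
        raise ValueError("not an OLE compound file, so not a binary .doc")
    # The sector shift is a little-endian 16-bit value at offset 30;
    # a value of 9 means 2**9 = 512-byte big blocks.
    sector_shift = struct.unpack_from("<H", header, 30)[0]
    return 2 ** sector_shift

print(big_block_size("report.doc"), "bytes per big block")  # "report.doc" is a placeholder path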

People who do not use MS Office sometimes find it difficult to use a Word document. Various solutions have been created. Since the format is a de facto standard, many word processors such as AbiWord or OpenOffice.org Writer need file import and export filters for Microsoft Word's document file format to compete. Furthermore, there is Apache Jakarta POI, an open-source Java library that aims to read and write Word's binary file format. Most of this interoperability is achieved through reverse engineering, since documentation of the Word 1.0-2003 file format, while available to partners, is not publicly released. The Word 2007 file format, however, is publicly documented.

For the last 10 years Microsoft has also made available freeware viewer programs for Windows that can read Word documents without a full version of the MS Word software.[2] Microsoft has also provided converters that enable different versions of Word to import and export to older Word versions and other formats, and converters that allow older Word versions to read documents created in newer Word formats.[7] Since the release of Office 2007, the whole Office product range is covered by the Office Converter Pack for Office 97–2003 and the Office Compatibility Pack for Office 2000–2007.[8]

[] Microsoft Office Open XML

The aforementioned Word format is a binary format. Microsoft has moved towards an XML-based file format for their office applications with Office 2007: Microsoft Office Open XML. This format does not conform fully to standard XML. It is, however, publicly documented as Ecma standard 376. Public documentation of the default file format is a first for Word, and makes it considerably easier, though not trivial, for competitors to interoperate. Efforts to establish it as an ISO standard are also underway. Another XML-based, public file format supported by Word 2003 is WordprocessingML.
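Because an Office Open XML document is a ZIP archive of XML parts, its structure can be inspected with nothing more than a standard ZIP library. The sketch below lists the parts of a .docx file and prints the start of the main document part; the file name is a placeholder.

import zipfile

# A .docx file is a ZIP container; the main body lives in word/document.xml.
with zipfile.ZipFile("letter.docx") as docx:      # "letter.docx" is a placeholder
    for name in docx.namelist():
        print(name)                               # e.g. [Content_Types].xml, word/document.xml, ...
    xml = docx.read("word/document.xml")
    print(xml[:200])                              # first bytes of the WordprocessingML markup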

It is possible to write plugins permitting Word to read and write formats it does not natively support.

[] Features and flaws

[] Normal.dot

Normal.dot is the master template from which all Word documents are created. It is one of the most important files in Microsoft Word. It determines the margin defaults as well as the layout of the text and the font defaults. Although normal.dot ships with certain defaults, the user can change normal.dot to establish new defaults. This will also change other documents that were created using the template, usually in unexpected ways.

[] Macros

Like other Microsoft Office documents, Word files can include advanced macros and even embedded programs. The language was originally WordBasic, but changed to Visual Basic for Applications as of Word 97. Recently .NET has become the preferred platform for Word programming.
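As an illustration of this programmability from outside the application, the sketch below drives Word through its COM automation interface using the third-party pywin32 package; it assumes Word is installed on Windows, and the output path is a placeholder.

import win32com.client   # from the pywin32 package

# Start Word, create a document, add some text, and save it.
word = win32com.client.Dispatch("Word.Application")
word.Visible = False
doc = word.Documents.Add()
doc.Content.Text = "Generated by a script via Word's automation interface."
doc.SaveAs(r"C:\temp\automation_demo.docx")   # placeholder path
doc.Close()
word.Quit()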


This extensive functionality can also be used to run and propagate viruses in documents. The tendency for people to exchange Word documents via email, USB key, and floppy makes this an especially attractive vector. A prominent example is the Melissa worm, but countless others have existed in the wild. Some anti-virus software can detect and clean common macro viruses, and firewalls may prevent worms from transmitting themselves to other systems.

The first virus known to affect Microsoft Word documents was called the Concept virus, a relatively harmless virus created to demonstrate the possibility of macro virus creation.[citation needed]

[] Layout issues

As of Word 2007 for Windows (and Word 2004 for Macintosh), the program has been unable to handle ligatures defined in TrueType fonts: those ligature glyphs with Unicode codepoints may be inserted manually, but are not recognized by Word for what they are, breaking spellchecking, while custom ligatures present in the font are not accessible at all. Other layout deficiencies of Word include the inability to set crop marks or thin spaces. Various third-party workaround utilities have been developed.[9] Similarly, combining diacritics are handled poorly: Word 2003 has "improved support", but many diacritics are still misplaced, even if a precomposed glyph is present in the font. Additionally, as of Word 2002, Word does automatic font substitution when it finds a character in a document that does not exist in the font specified. It is impossible to deactivate this, making it very difficult to spot when a glyph used is missing from the font in use.

In Word 2004 for Macintosh, complex scripts support was inferior even to Word 97, and Word does not support Apple Advanced Typography features like ligatures or glyph variants. [3]

[] Bullets and numbering

Users report that Word's bulleting and numbering system is highly problematic. Particularly troublesome is Word's system for restarting numbering.[10] However, the Bullets and Numbering system has been significantly overhauled for Office 2007, which should reduce the severity of these problems.

[] Creating Tables

Users can also create tables in MS Word. Depending on the version, formulas can also be computed within table cells.

[] Versions


Microsoft Word 5.5 for DOS

Versions for MS-DOS include:

1983 November — Word 1
1985 — Word 2
1986 — Word 3
1987 — Word 4, aka Microsoft Word 4.0 for the PC
1989 — Word 5
1991 — Word 5.1
1991 — Word 5.5
1993 — Word 6.0

Versions for the Macintosh (Mac OS and Mac OS X) include:

1985 January — Word 1 for the Macintosh
1987 — Word 3
1989 — Word 4
1991 — Word 5
1993 — Word 6
1998 — Word 98
2000 — Word 2001, the last version compatible with Mac OS 9
2001 — Word v.X, the first version for Mac OS X only
2004 — Word 2004, part of Office 2004 for Mac
2008 — Word 2008, part of Office 2008 for Mac

Microsoft Word 1.0 for Windows 3.x

Versions for Microsoft Windows include:

1989 November — Word for Windows 1.0 for Windows 2.x, code-named "Opus"
1990 March — Word for Windows 1.1 for Windows 3.0, code-named "Bill the Cat"
1990 June — Word for Windows 1.1a for Windows 3.1
1991 — Word for Windows 2.0, code-named "Spaceman Spiff"
1993 — Word for Windows 6.0, code-named "T3" (renumbered "6" to bring Windows version numbering in line with that of the DOS version, the Macintosh version and also WordPerfect, the main competing word processor at the time; also a 32-bit version for Windows NT only)
1995 — Word for Windows 95 (version 7.0) - included in Office 95
1997 — Word 97 (version 8.0) - included in Office 97
1999 — Word 2000 (version 9.0) - included in Office 2000
2001 — Word 2002 (version 10) - included in Office XP


Word 2003 icon

2003 — Word 2003 (officially "Microsoft Office Word 2003") - (ver. 11) included in Office 2003
2006 — Word 2007 (officially "Microsoft Office Word 2007") - (ver. 12) included in Office 2007; released to businesses on November 30, 2006, released worldwide to consumers on January 30, 2007

Versions for SCO UNIX include:

Microsoft Word for UNIX Systems Release 5.1

Versions for OS/2 include:

1992 Microsoft Word for OS/2 version 1.1B

Microsoft Excel

Microsoft Excel (full name Microsoft Office Excel) is a spreadsheet application written and distributed by Microsoft for Microsoft Windows and Mac OS. It features calculation and graphing tools which, along with aggressive marketing, have made Excel one of the most popular microcomputer applications to date. It is overwhelmingly the dominant spreadsheet application available for these platforms and has been so since version 5 in 1993 and its bundling as part of Microsoft Office.

[] History

Microsoft originally marketed a spreadsheet program called Multiplan in 1982, which was very popular on CP/M systems, but on MS-DOS systems it lost popularity to Lotus 1-2-3. This prompted development of a new spreadsheet called Excel, which started with the intention to, in the words of Doug Klunder, 'do everything 1-2-3 does and do it better'. The first version of Excel was released for the Mac in 1985, and the first Windows version (numbered 2.0 to line up with the Mac version and bundled with a run-time Windows environment) was released in November 1987. Lotus was slow to bring 1-2-3 to Windows, and by 1988 Excel had started to outsell 1-2-3, helping Microsoft achieve its position as the leading PC software developer. This accomplishment, dethroning the established leader of the software world, solidified Microsoft as a serious competitor and pointed to its future in developing graphical software. Microsoft pushed its advantage with regular new releases, every two years or so. The current version for the Windows platform is Excel 12, also called Microsoft Office Excel 2007. The current version for the Mac OS X platform is Microsoft Excel 2004.


Microsoft Excel 2.1 included a run-time version of Windows 2.1

Early in its life Excel became the target of a trademark lawsuit by another company already selling a software package named "Excel" in the finance industry. As the result of the dispute Microsoft was required to refer to the program as "Microsoft Excel" in all of its formal press releases and legal documents. However, over time this practice has been ignored, and Microsoft cleared up the issue permanently when they purchased the trademark to the other program. Microsoft also encouraged the use of the letters XL as shorthand for the program; while this is no longer common, the program's icon on Windows still consists of a stylized combination of the two letters, and the file extension of the default Excel format is .xls.

Excel 3.0 logo

Excel offers many user interface tweaks over the earliest electronic spreadsheets; however, the essence remains the same as in the original spreadsheet, VisiCalc: the cells are organized in rows and columns, and contain data or formulas with relative or absolute references to other cells.

Excel was the first spreadsheet that allowed the user to define the appearance of spreadsheets (fonts, character attributes and cell appearance). It also introduced intelligent cell recomputation, where only cells dependent on the cell being modified are updated (previous spreadsheet programs recomputed everything all the time or waited for a specific user command). Excel has extensive graphing capabilities.
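To illustrate the idea of intelligent recomputation, the toy sketch below (illustrative only, not Excel's actual algorithm) tracks which cells depend on which, and re-evaluates only the formulas downstream of a change.

# Toy dependency-driven recalculation: only cells affected by a change are re-evaluated.
cells = {"A1": 2, "A2": 3}                               # plain values
formulas = {"B1": lambda c: c["A1"] + c["A2"],           # B1 = A1 + A2
            "C1": lambda c: c["B1"] * 10}                # C1 = B1 * 10
depends_on = {"B1": {"A1", "A2"}, "C1": {"B1"}}

def recalc(changed):
    """Find every cell downstream of the changed cell and re-evaluate it."""
    dirty = {changed}
    while True:                                          # propagate the "dirty" set
        more = {f for f, deps in depends_on.items() if deps & dirty} - dirty
        if not more:
            break
        dirty |= more
    for _ in range(len(dirty)):                          # repeated passes reach a fixed point
        for f in dirty:
            if f in formulas:
                cells[f] = formulas[f](cells)

cells["A1"] = 7
recalc("A1")
print(cells["B1"], cells["C1"])                          # 10 100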

When first bundled into Microsoft Office in 1993, Microsoft Word and Microsoft PowerPoint had their GUIs redesigned for consistency with Excel, the killer app on the PC at the time.

Excel 97 logo

Since 1993, Excel has included Visual Basic for Applications (VBA), a programming language based on Visual Basic which adds the ability to automate tasks in Excel and to provide user-defined functions (UDF) for use in worksheets. VBA is a powerful addition to the application which, in later versions, includes a fully featured integrated development environment (IDE). Macro recording can produce VBA code replicating user actions, thus allowing simple automation of regular tasks. VBA allows the creation of forms and in-worksheet controls to communicate with the user. The language supports use (but not creation) of ActiveX (COM) DLLs; later versions add support for class modules allowing the use of basic object-oriented programming techniques.


The automation functionality provided by VBA has caused Excel to become a target for macro viruses. This was a serious problem in the corporate world until antivirus products began to detect these viruses. Microsoft belatedly took steps to prevent the misuse by adding the ability to disable macros completely, to enable macros when opening a workbook or to trust all macros signed using a trusted certificate.

Versions 5.0 to 9.0 of Excel contain various Easter eggs, although since version 10 Microsoft has taken measures to eliminate such undocumented features from their products.

[] Versions

'Excel 97' (8.0) being run on Windows XP

Microsoft Excel 2003 running under Windows XP Home Edition

Excel 2003 icon

Versions for Microsoft Windows include:

1987 Excel 2.0 for Windows
1990 Excel 3.0
1992 Excel 4.0
1993 Excel 5.0 (Office 4.2 & 4.3; also a 32-bit version for Windows NT only, on PowerPC, DEC Alpha, and MIPS)
1995 Excel for Windows 95 (version 7.0) - included in Office 95
1997 Excel 97 (version 8.0) - included in Office 97 (x86 and also a DEC Alpha version)
1999 Excel 2000 (version 9.0) - included in Office 2000
2001 Excel 2002 (version 10) - included in Office XP
2003 Excel 2003 (version 11) - included in Office 2003
2007 Excel 2007 (version 12) - included in Office 2007

Notice: There is no Excel 1.0 for Windows, in order to avoid confusion with the Apple versions.
Notice: There is no Excel 6.0, because the Windows 95 version was launched as version 7.0 alongside Word 7. All the Office 95 and Office 4.X products have OLE 2 capacity - moving data automatically between various programs - and numbering Excel as 7 showed that it was contemporary with Word 7.

Versions for the Apple Macintosh include:

1985 Excel 1.0
1988 Excel 1.5
1989 Excel 2.2
1990 Excel 3.0
1992 Excel 4.0
1993 Excel 5.0 (Office 4.X -- Motorola 68000 version and first PowerPC version)
1998 Excel 8.0 (Office '98)
2000 Excel 9.0 (Office 2001)
2001 Excel 10.0 (Office v. X)
2004 Excel 11.0 (part of Office 2004 for Mac)
2008 Excel 12.0 (part of Office 2008 for Mac)

Versions for OS/2 include:

1989 Excel 2.2
1991 Excel 3.0

[] File formats

Microsoft Excel up until the 2007 version used a proprietary binary file format called Binary Interchange File Format (BIFF) as its primary format.[1] Excel 2007 uses Office Open XML as its primary file format, an XML-based container similar in design to the XML-based format called "XML Spreadsheet" ("XMLSS"), first introduced in Excel 2002.[2] The latter format is not able to encode VBA macros.

Although supporting and encouraging the use of new XML-based formats as replacements, Excel 2007 remains backwards compatible with the traditional binary formats. In addition, most versions of Microsoft Excel are able to read CSV, DBF, SYLK, DIF, and other legacy formats.

[] Microsoft Excel 2007 Office Open XML formats

Main article: Office Open XML

Microsoft Excel 2007, along with the other products in the Microsoft Office 2007 suite, introduces a host of new file formats. These are part of the Office Open XML (OOXML) specification.

The new Excel 2007 formats are:

Excel Workbook (.xlsx) The default Excel 2007 workbook format. In reality a ZIP compressed archive with a directory structure of XML text documents. Functions as the primary replacement for the former binary .xls format, although it does not support Excel macros for security reasons.

Excel Macro-enabled Workbook (.xlsm) As Excel Workbook, but with macro support.

Excel Binary Workbook (.xlsb) As Excel Macro-enabled Workbook, but storing information in binary form rather than as XML documents, for opening and saving documents more quickly and efficiently. Intended especially for very large documents with tens of thousands of rows and/or several hundred columns.

Excel Macro-enabled Template (.xltm) A template document that forms a basis for actual workbooks, with macro support. The replacement for the old .xlt format.

Excel Add-in (.xlam) Excel add-in to add extra functionality and tools. Inherent macro support due to the file purpose.

[] Exporting and Migration of spreadsheets

APIs are also provided to open Excel spreadsheets in a variety of applications and environments other than Microsoft Excel. These include opening Excel documents on the web using either ActiveX controls or plugins like the Adobe Flash Player. Attempts have also been made to copy Excel spreadsheets to web applications using comma-separated values (CSV).
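As a small illustration of the comma-separated-values route, the sketch below uses Python's standard csv module to read a CSV export of a spreadsheet and total one column; the file name and column header are placeholders.

import csv

# Sum the "Amount" column of a CSV file exported from a spreadsheet.
# "sales.csv" is a placeholder with a header row such as: Date,Item,Amount
total = 0.0
with open("sales.csv", newline="") as f:
    for row in csv.DictReader(f):
        total += float(row["Amount"])

print("Total:", total)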

[] Criticism

Due to Excel's foundation on floating point calculations, the statistical accuracy of Excel has been criticized[3][4][5][6], as has the lack of certain statistical tools. Excel proponents have responded that some of these errors represent edge cases and that the relatively few users who would be affected by these know of them and have workarounds and alternatives.[citation needed]

Excel incorrectly assumes that 1900 is a leap year.[7][8] The bug originated in Lotus 1-2-3 and was implemented in Excel for the purpose of backward compatibility.[9] This legacy has since been carried over into the Office Open XML file format. Excel also supports a second date format based on a 1904 epoch.
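For reference, the Gregorian leap-year rule that the 1900 bug violates can be checked in a few lines (a small illustrative snippet, not Excel code):

def is_leap(year):
    # Gregorian rule: divisible by 4, except century years, unless divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

print(is_leap(1900))   # False: 1900 is not a leap year, despite Excel's date serials
print(is_leap(2000))   # True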

Microsoft Access.

Microsoft Office Access, previously known as Microsoft Access, is a relational database management system from Microsoft which combines the relational Microsoft Jet Database Engine with a graphical user interface. It is a member of the 2007 Microsoft Office system.

Access can use data stored in Access/Jet, Microsoft SQL Server, Oracle, or any ODBC-compliant data container. Skilled software developers and data architects use it to develop application software. Relatively unskilled programmers and non-programmer "power users" can use it to build simple applications. It supports some object-oriented techniques but falls short of being a fully object-oriented development tool.

Access was also the name of a communications program from Microsoft, meant to compete with ProComm and other programs. This Access proved a failure and was dropped.[1] Years later Microsoft reused the name for its database software.


[] History

Access 1.1 manual.

Access version 1.0 was released in November 1992.

Microsoft specified the minimum operating system for Version 2.0 as Microsoft Windows v3.0 with 4 MB of RAM. 6 MB RAM was recommended along with a minimum of 8 MB of available hard disk space (14 MB hard disk space recommended). The product was shipped on seven 1.44 MB diskettes. The manual shows a 1993 copyright date.

The software worked well with very large record sets, but testing showed that some circumstances caused data corruption. For example, file sizes over 700 MB were problematic. (Note that most hard disks were smaller than 700 MB at the time this was in wide use.) The Getting Started manual warns about a number of circumstances where obsolete device drivers or incorrect configurations can cause data loss.

Access 2.0, running under Windows 95

Access' initial codename was Cirrus. It was developed before Visual Basic, and its forms engine was called Ruby. Bill Gates saw the prototypes and decided that the Basic language component should be co-developed as a separate expandable application. This project was called Thunder. The two projects were developed separately, as the underlying forms engines were incompatible with each other; however, they were merged together again after VBA.

[] Uses

Access is used by small businesses, within departments of large corporations, and by hobby programmers to create ad hoc customized desktop systems for handling the creation and manipulation of data. Access can be used as a database for basic web-based applications hosted on Microsoft's Internet Information Services and utilizing Microsoft Active Server Pages (ASP). Most typical web applications, however, should use tools like ASP/Microsoft SQL Server or the LAMP stack.

Some professional application developers use Access for rapid application development, especially for the creation of prototypes and standalone applications that serve as tools for on-the-road salesmen. Access does not scale well if data access is via a network, so applications that are used by more than a handful of people tend to rely on Client-Server based solutions. However, an Access "front end" (the forms, reports, queries and VB code) can be used against a host of database backends, including JET (file-based database engine, used in Access by default), Microsoft SQL Server, Oracle, and any other ODBC-compliant product.

[] Features

One of the benefits of Access from a programmer's perspective is its relative compatibility with SQL (structured query language) — queries may be viewed and edited as SQL statements, and SQL statements can be used directly in Macros and VBA Modules to manipulate Access tables. In this case, "relatively compatible" means that SQL for Access contains many quirks, and as a result, it has been dubbed "Bill's SQL" by industry insiders. Users may mix and use both VBA and "Macros" for programming forms and logic, which offers object-oriented possibilities.
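As a sketch of running SQL against an Access database from outside the Access environment, the snippet below uses the third-party pyodbc package with the Access ODBC driver; the database path, table, and column names are assumptions for illustration.

import pyodbc   # third-party package; requires the Access ODBC driver on Windows

conn_str = (
    r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
    r"DBQ=C:\data\example.accdb;"                 # placeholder database path
)

with pyodbc.connect(conn_str) as conn:
    cur = conn.cursor()
    # Access SQL ("Bill's SQL") has quirks, but a simple SELECT looks standard.
    cur.execute("SELECT TOP 5 CustomerName FROM Customers ORDER BY CustomerName")
    for row in cur.fetchall():
        print(row.CustomerName)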

MSDE (Microsoft SQL Server Desktop Engine) 2000, a mini-version of MS SQL Server 2000, is included with the developer edition of Office XP and may be used with Access as an alternative to the Jet Database Engine.

Unlike a complete RDBMS, the Jet Engine lacks database triggers and stored procedures. Starting in MS Access 2000 (Jet 4.0), there is a syntax that allows creating queries with parameters, in a way that looks like creating stored procedures, but these procedures are limited to one statement per procedure.[1] Microsoft Access does allow forms to contain code that is triggered as changes are made to the underlying table (as long as the modifications are done only with that form), and it is common to use pass-through queries and other techniques in Access to run stored procedures in RDBMSs that support these.

In ADP files (supported in MS Access 2000 and later), the database-related features are entirely different, because this type of file connects to a MSDE or Microsoft SQL Server instead of using the Jet Engine. Thus, it supports the creation of nearly all objects in the underlying server (tables with constraints and triggers, views, stored procedures and UDFs). However, only forms, reports, macros and modules are stored in the ADP file (the other objects are stored in the back-end database).

[] Development


Access allows relatively quick development because all database tables, queries, forms, and reports are stored in the database. For query development, Access utilizes the Query Design Grid, a graphical user interface that allows users to create queries without knowledge of the SQL programming language. In the Query Design Grid, users can "show" the source tables of the query and select the fields they want returned by clicking and dragging them into the grid. Joins can be created by clicking and dragging fields in tables to fields in other tables. Access allows users to view and manipulate the SQL code if desired.

Access 97 icon

The programming language available in Access is, as in other products of the Microsoft Office suite, Microsoft Visual Basic for Applications. Two database access libraries of COM components are provided: the legacy Data Access Objects (DAO), which was superseded for a time (but is still accessible) by ActiveX Data Objects (ADO); however, DAO has been reintroduced in the latest version, MS Access 2007.

Many developers who use Access use the Leszynski naming convention, though this is not universal; it is a programming convention, not a DBMS-enforced rule.[2] It is also made redundant by the fact that Access categorises each object automatically and always shows the object type, by prefixing Table: or Query: before the object name when referencing a list of different database objects.

MS Access can be applied to small projects but scales poorly to larger projects involving multiple concurrent users because it is a desktop application, not a true client-server database. When a Microsoft Access database is shared by multiple concurrent users, processing speed suffers. The effect is dramatic when there are more than a few users or if the processing demands of any of the users are high. Access includes an Upsizing Wizard that allows users to upsize their database to Microsoft SQL Server if they want to move to a true client-server database. It is recommended to use Access Data Projects for most situations.

Since all database queries, forms, and reports are stored in the database, and in keeping with the ideals of the relational model, there is no possibility of making a physically structured hierarchy with them.

One recommended technique is to migrate to SQL Server and utilize Access Data Projects. This allows stored procedures, views, and constraints - which are greatly superior to anything found in Jet. Additionally this full client-server design significantly reduces corruption, maintenance and many performance problems.

Access 2003 icon


Access allows no relative paths when linking, so the development environment should have the same path as the production environment (though it is possible to write a "dynamic-linker" routine in VBA that can search out a certain back-end file by searching through the directory tree, if it can't find it in the current path). This technique also allows the developer to divide the application among different files, so some structure is possible.

[] Protection

If the database design needs to be secured to prevent changes, Access databases can be locked/protected (and the source code compiled) by converting the database to an .MDE file. All changes to the database structure (tables, forms, macros, etc.) need to be made to the original MDB and then reconverted to MDE.

Some tools are available for unlocking and 'decompiling', although certain elements including original VBA comments and formatting are normally irretrievable.

[] File extensions

Microsoft Access saves information under the following file extensions:

.mdb - Access Database (2003 and earlier)

.mde - Protected Access Database, with compiled macros (2003 and earlier)

.accdb - Access Database (2007)

.mam - Access Macro

.maq - Access Query

.mar - Access Report

.mat - Access Table

.maf - Access Form

.adp - Access Project

.adn - Access Blank Project Template

Microsoft PowerPoint.

Microsoft PowerPoint 2003 running under Windows XP Home Edition

Microsoft PowerPoint is a presentation program developed by Microsoft for its Microsoft Office system. Microsoft PowerPoint runs on the Microsoft Windows and Mac OS computer operating systems.


It is widely used by business people, educators, students, and trainers and is among the most prevalent forms of persuasion technology. Beginning with Microsoft Office 2003, Microsoft revised branding to emphasize PowerPoint's identity as a component within the Office suite: Microsoft began calling it Microsoft Office PowerPoint instead of merely Microsoft PowerPoint. The current version of Microsoft Office PowerPoint is Microsoft Office PowerPoint 2007. As a part of Microsoft Office, Microsoft Office PowerPoint has become the world's most widely used presentation program.

[] History

The about box for PowerPoint 1.0, with an empty document in the background.

The original Microsoft Office PowerPoint was developed by Bob Gaskins and software developer Dennis Austin as Presenter for Forethought, Inc; the product was later renamed PowerPoint.[1]

PowerPoint 1.0 was released in 1987 for the Apple Macintosh. It ran in black and white, generating text-and-graphics pages for overhead transparencies. A new full color version of PowerPoint shipped a year later after the first color Macintosh came to market.

Microsoft Corporation purchased Forethought and its PowerPoint software product for $14 million on July 31, 1987.[2] In 1990 the first Windows versions were produced. Since 1990, PowerPoint has been a standard part of the Microsoft Office suite of applications (except for the Basic Edition).

The 2002 version, part of the Microsoft Office XP Professional suite and also available as a stand-alone product, provided features such as comparing and merging changes in presentations, the ability to define animation paths for individual shapes, pyramid/radial/target and Venn diagrams, multiple slide masters, a "task pane" to view and select text and objects on the clipboard, password protection for presentations, automatic "photo album" generation, and the use of "smart tags" allowing people to quickly select the format of text copied into the presentation.

Microsoft Office PowerPoint 2003 did not differ much from the 2002/XP version. It enhanced collaboration between co-workers and featured "Package for CD", which makes it easy to burn presentations with multimedia content and the viewer on CD-ROM for distribution. It also improved support for graphics and multimedia.

The current version, Microsoft Office PowerPoint 2007, released in November 2006, brought major changes of the user interface and enhanced graphic capabilities. [3]


[] Operation

In PowerPoint, as in most other presentation software, text, graphics, movies, and other objects are positioned on individual pages or "slides". The "slide" analogy is a reference to the slide projector, a device which has become somewhat obsolete due to the use of PowerPoint and other presentation software. Slides can be printed, or (more often) displayed on-screen and navigated through at the command of the presenter. Slides can also form the basis of webcasts.

PowerPoint provides two types of movements. Entrance, emphasis, and exit of elements on a slide itself are controlled by what PowerPoint calls Custom Animations. Transitions, on the other hand are movements between slides. These can be animated in a variety of ways. The overall design of a presentation can be controlled with a master slide; and the overall structure, extending to the text on each slide, can be edited using a primitive outliner. Presentations can be saved and run in any of the file formats: the default .ppt (presentation), .pps (PowerPoint Show) or .pot (template). In PowerPoint 2007 the XML-based file formats .pptx, .ppsx and .potx have been introduced.
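As a sketch of producing such a file programmatically, the snippet below uses the third-party python-pptx package to write a one-slide presentation in the XML-based .pptx format; the text and output file name are placeholders.

from pptx import Presentation   # third-party package: python-pptx

prs = Presentation()
slide = prs.slides.add_slide(prs.slide_layouts[0])   # layout 0 is the title slide
slide.shapes.title.text = "Quarterly Review"         # placeholder title
slide.placeholders[1].text = "Generated with python-pptx"
prs.save("review.pptx")                              # placeholder output file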

[] Compatibility

As Microsoft Office files are often sent from one computer user to another, arguably the most important feature of any presentation software—such as Apple's Keynote, or OpenOffice.org Impress—has become the ability to open Microsoft Office PowerPoint files. However, because of PowerPoint's ability to embed content from other applications through OLE, some kinds of presentations become highly tied to the Windows platform, meaning that even PowerPoint on Mac OS X cannot always successfully open its own files originating in the Windows version. This has led to a movement towards open standards, such as PDF and OASIS OpenDocument.

[] Cultural effects

Supporters & critics generally agree[4][5][6] that the ease of use of presentation software can save a lot of time for people who otherwise would have used other types of visual aid—hand-drawn or mechanically typeset slides, blackboards or whiteboards, or overhead projections. Ease of use also encourages those who otherwise would not have used visual aids, or would not have given a presentation at all, to make presentations. As PowerPoint's style, animation, and multimedia abilities have become more sophisticated, and as PowerPoint has become generally easier to produce presentations with (even to the point of having an "AutoContent Wizard" suggesting a structure for a presentation—initially started as a joke by the Microsoft engineers but later included as a serious feature in the 1990s), the difference in needs and desires of presenters and audiences has become more noticeable.

[] Criticism


One major source of criticism of PowerPoint comes from Yale professor of statistics and graphic design Edward Tufte, who criticizes many emergent properties of the software:[7]

It is used to guide and reassure a presenter, rather than to enlighten the audience;
Unhelpfully simplistic tables and charts, resulting from the low resolution of computer displays;
The outliner causing ideas to be arranged in an unnecessarily deep hierarchy, itself subverted by the need to restate the hierarchy on each slide;
Enforcement of the audience's linear progression through that hierarchy (whereas with handouts, readers could browse and relate items at their leisure);
Poor typography and chart layout, from presenters who are poor designers and who use poorly designed templates and default settings;
Simplistic thinking, from ideas being squashed into bulleted lists, and stories with beginning, middle, and end being turned into a collection of disparate, loosely disguised points.

This may present a kind of image of objectivity and neutrality that people associate with science, technology, and "bullet points".

Tufte's criticism of the use of PowerPoint has extended to its use by NASA engineers in the events leading to the Columbia disaster. Tufte's analysis of a representative NASA PowerPoint slide is included in a full page sidebar entitled "Engineering by Viewgraphs" [8] in Volume 1 of the Columbia Accident Investigation Board's report.

[] Versions

Versions for the Mac OS include:

1987 PowerPoint 1.0 for Mac OS classic
1988 PowerPoint 2.0 for Mac OS classic
1992 PowerPoint 3.0 for Mac OS classic
1994 PowerPoint 4.0 for Mac OS classic
1998 PowerPoint 98 (8.0) for Mac OS classic (Office 1998 for Mac)
2000 PowerPoint 2001 (9.0) for Mac OS X (Office 2001 for Mac)
2002 PowerPoint v. X (10.0) for Mac OS X (Office:mac v. X)
2004 PowerPoint 2004 (11.0) for Mac OS X (Office:mac 2004)
2008 PowerPoint 2008 (12.0) for Mac OS X (Office:mac 2008)

Note: There is no PowerPoint 5.0, 6.0 or 7.0 for the Mac. There is no version 5.0 or 6.0 because the Windows 95 version was launched alongside Word 7. All of the Office 95 products have OLE 2 capacity - moving data automatically between various programs - and numbering PowerPoint as 7 shows that it was contemporary with Word 7. No version 7.0 was made for the Mac to coincide with either version 7.0 for Windows or PowerPoint 97.[9][10]

Microsoft PowerPoint 4.0 - 2007 Icons (Windows versions)


Versions for Microsoft Windows include:

1990 PowerPoint 2.0 for Windows 3.0
1992 PowerPoint 3.0 for Windows 3.1
1993 PowerPoint 4.0 (Office 4.x)
1995 PowerPoint for Windows 95 (version 7.0) — (Office 95)
1997 PowerPoint 97 — (Office '97)
1999 PowerPoint 2000 (version 9.0) — (Office 2000)
2001 PowerPoint 2002 (version 10) — (Office XP)
2003 PowerPoint 2003 (version 11) — (Office 2003)
2006–2007 PowerPoint 2007 (version 12) — (Office 2007)

Computer software

Computer software, consisting of programs, enables a computer to perform specific tasks, as opposed to its physical components (hardware) which can only do the tasks they are mechanically designed for. The term includes application software such as word processors which perform productive tasks for users, system software such as operating systems, which interface with hardware to run the necessary services for user-interfaces and applications, and middleware which controls and co-ordinates distributed systems.

[] Terminology

The term "software" is sometimes used in a broader context to describe any electronic media content which embodies expressions of ideas such as film, tapes, records, etc.[1]

A screenshot of computer software - AbiWord.

[] Relationship to computer hardware

Main article: Computer hardware

Computer software is so called in contrast to computer hardware, which encompasses the physical interconnections and devices required to store and execute (or run) the software. In computers, software is loaded into RAM and executed in the central processing unit. At the lowest level, software consists of a machine language specific to an individual processor. A machine language consists of groups of binary values signifying processor instructions (object code), which change the state of the computer from its preceding state. Software is an ordered sequence of instructions for changing the state of the computer hardware in a particular sequence. It is usually written in high-level programming languages that are easier and more efficient for humans to use (closer to natural language) than machine language. High-level languages are compiled or interpreted into machine language object code. Software may also be written in an assembly language, essentially a mnemonic representation of a machine language using a natural language alphabet. Assembly language must be assembled into object code via an assembler.
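As a small illustration of the gap between high-level source and the lower-level instructions it is translated into, the snippet below disassembles a one-line Python function; note that CPython compiles to bytecode for a virtual machine rather than to native machine code, so this is an analogy rather than a literal example of object code.

import dis

def add(a, b):
    return a + b

dis.dis(add)   # prints the lower-level instructions the high-level source compiles to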

The term "software" was first used in this sense by John W. Tukey in 1958.[2] In computer science and software engineering, computer software is all computer programs. The concept of reading different sequences of instructions into the memory of a device to control computations was invented by Charles Babbage as part of his difference engine. The theory that is the basis for most modern software was first proposed by Alan Turing in his 1935 essay Computable numbers with an application to the Entscheidungsproblem.[3]

[] Types

Practical computer systems divide software systems into three major classes: system software, programming software and application software, although the distinction is arbitrary, and often blurred.

System software helps run the computer hardware and computer system. It includes operating systems, device drivers, diagnostic tools, servers, windowing systems, utilities and more. The purpose of systems software is to insulate the applications programmer as much as possible from the details of the particular computer complex being used, especially memory and other hardware features, and such accessory devices as communications, printers, readers, displays, keyboards, etc.

System software is a generic term referring to any computer software which manages and controls the hardware so that application software can perform a task. It is an essential part of the computer system. An operating system is an obvious example, while an OpenGL or database library are less obvious examples. System software contrasts with application software, which are programs that help the end-user to perform specific, productive tasks, such as word processing or image manipulation.

If system software is stored on non-volatile storage such as integrated circuits, it is usually termed firmware.

Systems software – a set of programs that organise, utilise and control hardware in a computer system

Programming software usually provides tools to assist a programmer in writing computer programs and software using different programming languages in a more convenient way. The tools include text editors, compilers, interpreters, linkers, debuggers, and so on. An integrated development environment (IDE) merges those tools into a single software bundle, so a programmer may not need to type multiple commands for compiling, interpreting, debugging, and tracing, because the IDE usually provides an advanced graphical user interface (GUI).

A programming tool or software tool is a program or application that software developers use to create, debug, or maintain other programs and applications. The term usually refers to relatively simple programs that can be combined together to accomplish a task, much as one might use multiple hand tools to fix a physical object.

[] History

The history of software tools began with the first computers in the early 1950s that used linkers, loaders, and control programs. Tools became famous with Unix in the early 1970s with tools like grep, awk and make that were meant to be combined flexibly with pipes. The term "software tools" came from the book of the same name by Brian Kernighan and P. J. Plauger.

Tools were originally simple and lightweight. As some tools have been maintained, they have been integrated into more powerful integrated development environments (IDEs). These environments consolidate functionality into one place, sometimes increasing simplicity and productivity, other times sacrificing flexibility and extensibility. The workflow of IDEs is routinely contrasted with alternative approaches, such as the use of Unix shell tools with text editors like Vim and Emacs.

The distinction between tools and applications is murky. For example, developers use simple databases (such as a file containing list of important values) all the time as tools. However a full-blown database is usually thought of as an application in its own right.

For many years, computer-assisted software engineering (CASE) tools were sought after. Successful tools have proven elusive. In one sense, CASE tools emphasized design and architecture support, such as for UML. But the most successful of these tools are IDEs.

The ability to use a variety of tools productively is one hallmark of a skilled software engineer.

[] List of tools

Software tools come in many forms:

Revision control : Bazaar, Bitkeeper, Bonsai, ClearCase, CVS, Git, GNU arch, Mercurial, Monotone, PVCS, RCS, SCM, SCCS, SourceSafe, SVN, LibreSource Synchronizer

Interface generators: Swig


Build tools: Make, automake, Apache Ant, SCons, Rake
Compilation and linking tools: GNU toolchain, gcc, Microsoft Visual Studio, CodeWarrior, Xcode, ICC
Static code analysis: lint, Splint
Search: grep, find
Text editors: emacs, vi
Scripting languages: Awk, Perl, Python, REXX, Ruby, Shell, Tcl
Parser generators: Lex, Yacc, Parsec
Bug databases: gnats, Bugzilla, Trac, Atlassian Jira, LibreSource
Debuggers: gdb, GNU Binutils, valgrind
Memory leak/corruption detection: dmalloc, Electric Fence, duma, Insure++
Memory use: Aard
Code coverage: GCT, CCover
Source-code clone/duplication finding: CCFinderX
Refactoring browser
Code sharing sites: Freshmeat, Krugle, Sourceforge, ByteMyCode, UCodit
Source code generation tools
Documentation generators: Doxygen, help2man, POD, Javadoc, Pydoc/Epydoc

Debugging tools are used in the process of debugging code, and can also be used to create code that is more standards-compliant and portable than it would be otherwise.

Memory leak detection: In the C programming language, for instance, memory leaks are not easily detected; software tools called memory debuggers are often used to find them, enabling the programmer to locate these problems much more efficiently than by inspection alone.

[] IDEs

Integrated development environments (IDEs) combine the features of many tools into one complete package. They are usually simpler and make it easier to do simple tasks, such as searching for content only in files in a particular project.

IDEs are often used for development of enterprise-level applications.

Some examples of IDEs are:

Delphi
C++ Builder
Microsoft Visual Studio
Xcode
Eclipse
NetBeans
IntelliJ IDEA
WinDev


Application software allows end users to accomplish one or more specific (non-computer related) tasks. Typical applications include industrial automation, business software, educational software, medical software, databases, and computer games. Businesses are probably the biggest users of application software, but almost every field of human activity now uses some form of application software. It is used to automate all sorts of functions.

Application software is a subclass of computer software that employs the capabilities of a computer directly and thoroughly to a task that the user wishes to perform. This should be contrasted with system software, which is involved in integrating a computer's various capabilities but typically does not directly apply them in the performance of tasks that benefit the user. In this context the term application refers to both the application software and its implementation.

A simple, if imperfect analogy in the world of hardware would be the relationship of an electric light as an example of an application to an electric power generation plant as an example of a system. The power plant merely generates electricity, not itself of any real use until harnessed to an application like the electric light that performs a service that the user desires.

The exact delineation between the operating system and application software is not precise, however, and is occasionally subject to controversy. For example, one of the key questions in the United States v. Microsoft antitrust trial was whether Microsoft's Internet Explorer web browser was part of its Windows operating system or a separable piece of application software. As another example, the GNU/Linux naming controversy is, in part, due to disagreement about the relationship between the Linux kernel and the Linux operating system.

Typical examples of software applications are word processors, spreadsheets, and media players.

Multiple applications bundled together as a package are sometimes referred to as an application suite. Microsoft Office and OpenOffice.org, which bundle together a word processor, a spreadsheet, and several other discrete applications, are typical examples. The separate applications in a suite usually have a user interface that has some commonality making it easier for the user to learn and use each application. And often they may have some capability to interact with each other in ways beneficial to the user. For example, a spreadsheet might be able to be embedded in a word processor document even though it had been created in the separate spreadsheet application.

User-written software tailors systems to meet the user's specific needs. User-written software includes spreadsheet templates, word processor macros, scientific simulations, and graphics and animation scripts. Even email filters are a kind of user software. Users create this software themselves and often overlook how important it is.

In some types of embedded systems, the application software and the operating system software may be indistinguishable to the user, as in the case of software used to control a VCR, DVD player or Microwave Oven.

OpenOffice.org is a well-known example of application software

[] Application software classification

There are many subtypes of application software:

Enterprise software addresses the needs of organization processes and data flow, often in a large distributed ecosystem. (Examples include Financial, Customer Relationship Management, and Supply Chain Management). Note that Departmental Software is a sub-type of Enterprise Software with a focus on smaller organizations or groups within a large organization. (Examples include Travel Expense Management, and IT Helpdesk)

Enterprise infrastructure software provides common capabilities needed to create Enterprise Software systems. (Examples include Databases, Email servers, and Network and Security Management)

Information worker software addresses the needs of individuals to create and manage information, often for individual projects within a department, in contrast to enterprise management. Examples include time management, resource management, documentation tools, analytical, and collaborative. Word processors, spreadsheets, email and blog clients, personal information system, and individual media editors may aid in multiple information worker tasks.

Media and entertainment software addresses the needs of individuals and groups to consume digital entertainment and published digital content. (Examples include Media Players, Web Browsers, Help browsers, and Games)

Educational software is related to Media and Entertainment Software, but has distinct requirements for delivering evaluations (tests) and tracking progress through material. It is also related to collaboration software in that many Educational Software systems include collaborative capabilities.

Media development software addresses the needs of individuals who generate print and electronic media for others to consume, most often in a commercial or educational setting. This includes Graphic Art software, Desktop Publishing software, Multimedia Development software, HTML editors, Digital Animation editors, Digital Audio and Video composition, and many others.

Product engineering software is used in developing hardware and software products. This includes computer aided design (CAD), computer aided engineering (CAE), computer language editing and compiling tools, Integrated Development Environments, and Application Programmer Interfaces.

[] Program and library

A program may not be sufficiently complete for execution by a computer. In particular, it may require additional software from a software library in order to be complete. Such a library may include software components used by stand-alone programs, but which cannot work on their own. Thus, programs may include standard routines that are common to many programs, extracted from these libraries. Libraries may also include 'stand-alone' programs which are activated by some computer event and/or perform some function (e.g., of computer 'housekeeping') but do not return data to their calling program. Programs may be called by one to many other programs; programs may call zero to many other programs.
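A trivial sketch of the program/library relationship (all names are made up for illustration): a reusable routine that does nothing useful on its own, and a program part that calls it.

# The "library" part: a reusable routine many programs could share.
def shout(message):
    """Return the message upper-cased, standing in for a shared standard routine."""
    return message.upper() + "!"

# The "program" part: incomplete work until it calls into the library routine.
if __name__ == "__main__":
    print(shout("software libraries are reusable"))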

[] Three layers

Starting in the 1980s, application software has been sold in mass-produced packages through retailers.

See also: Software architecture

Users often see things differently than programmers. People who use modern general purpose computers (as opposed to embedded systems, analog computers, supercomputers, etc.) usually see three layers of software performing a variety of tasks: platform, application, and user software.

Platform software: The platform includes the firmware, device drivers, an operating system, and typically a graphical user interface which, in total, allow a user to interact with the computer and its peripherals (associated equipment). Platform software often comes bundled with the computer. On a PC you will usually have the ability to change the platform software.

Application software: Application software, or applications, are what most people think of when they think of software. Typical examples include office suites and video games. Application software is often purchased separately from computer hardware. Sometimes applications are bundled with the computer, but that does not change the fact that they run as independent applications. Applications are almost always independent programs from the operating system, though they are often tailored for specific platforms. Most users think of compilers, databases, and other "system software" as applications.

User-written software: User software tailors systems to meet the user's specific needs. User software includes spreadsheet templates, word processor macros, scientific simulations, and scripts for graphics and animations. Even email filters are a kind of user software. Users create this software themselves and often overlook how important it is. Depending on how competently the user-written software has been integrated into purchased application packages, many users may not be aware of the distinction between the purchased packages and what has been added by fellow co-workers.
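
As a hypothetical sketch of user-written software, the short Python script below plays the role of a user's email filter layered on top of a purchased mail application; the message format and keyword list are assumptions made purely for illustration.

def filter_messages(messages, blocked_keywords=("lottery", "prize")):
    """Split messages into (kept, discarded) based on subject keywords."""
    kept, discarded = [], []
    for message in messages:
        subject = message.get("subject", "").lower()
        if any(keyword in subject for keyword in blocked_keywords):
            discarded.append(message)   # looks like junk mail
        else:
            kept.append(message)        # keep in the inbox
    return kept, discarded


inbox = [{"subject": "Meeting agenda"}, {"subject": "You won a lottery prize!"}]
kept, discarded = filter_messages(inbox)
print(len(kept), "kept,", len(discarded), "discarded")  # 1 kept, 1 discarded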

Creation

Main article: Computer programming

Operation

Computer software has to be "loaded" into the computer's storage (such as a hard drive or memory (RAM)). Once the software is loaded, the computer is able to execute it. Computers operate by executing the computer program. This involves passing instructions from the application software, through the system software, to the hardware, which ultimately receives the instruction as machine code. Each instruction causes the computer to carry out an operation: moving data, carrying out a computation, or altering the control flow of instructions.

Data movement is typically from one place in memory to another. Sometimes it involves moving data between memory and registers which enable high-speed data access in the CPU. Moving data, especially large amounts of it, can be costly. So, this is sometimes avoided by using "pointers" to data instead. Computations include simple operations such as incrementing the value of a variable data element. More complex computations may involve many operations and data elements together.
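
A minimal Python sketch (added for illustration, not from the original text) of the point about avoiding costly data movement: variables can refer to the same block of data rather than copying it.

large_data = list(range(1_000_000))   # a large block of data in memory


def total(values):
    # 'values' refers to the same list object; the data is not duplicated here
    return sum(values)


copy_of_data = list(large_data)       # an explicit copy: costly for large data
print(total(large_data) == total(copy_of_data))  # True; same values either way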

Instructions may be performed sequentially, conditionally, or iteratively. Sequential instructions are those operations that are performed one after another. Conditional instructions are performed such that different sets of instructions execute depending on the value(s) of some data. In some languages this is known as an "if" statement. Iterative instructions are performed repetitively and may depend on some data value. This is sometimes called a "loop." Often, one instruction may "call" another set of instructions that are defined in some other program or module. When more than one computer processor is used, instructions may be executed simultaneously.
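
The kinds of instructions described above can be sketched in a few lines of Python (an illustration added here, not part of the original text): sequential statements, a conditional "if", an iterative loop, a simple computation that increments a variable, and a call to another routine.

def describe(n):
    """Another routine that the main program can 'call'."""
    return "even" if n % 2 == 0 else "odd"


total = 0                                # sequential: statements run one after another
for value in range(1, 6):                # iterative: repeats for each value 1..5
    if value > 3:                        # conditional: depends on the data's value
        total += value                   # computation: increment a variable
    print(value, "is", describe(value))  # call another set of instructions

print("sum of values greater than 3:", total)  # prints 9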

A simple example of the way software operates is what happens when a user selects an entry such as "Copy" from a menu. In this case, a conditional instruction is executed to copy text from data in a 'document' area residing in memory, perhaps to an intermediate storage area known as a 'clipboard' data area. If a different menu entry such as "Paste" is chosen, the software may execute the instructions to copy the text from the clipboard data area to a specific location in the same or another document in memory.
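
A hedged Python sketch of the Copy and Paste example, with the 'document' and 'clipboard' data areas modeled as ordinary strings; real applications and operating systems manage these areas very differently.

document = "The quick brown fox"
clipboard = ""                                        # intermediate 'clipboard' data area

# "Copy": the selected characters are copied into the clipboard area.
selection_start, selection_end = 4, 9
clipboard = document[selection_start:selection_end]   # "quick"

# "Paste": the clipboard contents are inserted at a chosen position in the document.
insert_at = len(document)
document = document[:insert_at] + " " + clipboard
print(document)                                       # The quick brown fox quick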

Depending on the application, even the example above could become complicated. The field of software engineering endeavors to manage the complexity of how software operates. This is especially true for software that operates in the context of a large or powerful computer system.

Currently, almost the only limitation on the use of computer software in applications is the ingenuity of the designer/programmer. Consequently, large areas of activity (such as playing grand master level chess) formerly assumed to be incapable of software simulation are now routinely programmed. The only area that has so far proved reasonably secure from software simulation is the realm of human art, especially pleasing music and literature.[citation needed]

Kinds of software by operation: the computer program as an executable, as source code or a script, and as configuration.
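
A small, self-contained Python sketch (the file name settings.json and its contents are assumptions for illustration) contrasting two of these kinds: the script itself is source code executed by an interpreter, while the configuration is data that changes the script's behavior without changing its code.

import json

# Write the configuration first so the sketch is self-contained.
with open("settings.json", "w") as config_file:
    json.dump({"greeting": "Hello"}, config_file)

# The script reads the configuration at run time and behaves accordingly.
with open("settings.json") as config_file:
    settings = json.load(config_file)

print(settings.get("greeting", "Hi"), "from a configurable script")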

Quality and reliability

Software reliability considers the errors, faults, and failures related to the creation and operation of software.

See Software auditing, Software quality, Software testing, and Software reliability.

License

A software license gives the user the right to use the software in the licensed environment. Some software comes with the license when purchased off the shelf, or with an OEM license when bundled with hardware. Other software comes with a free software license, granting the recipient the rights to modify and redistribute the software. Software can also be in the form of freeware or shareware. See also License Management.

Patents

The issue of software patents is controversial. Some believe that they hinder software development, while others argue that software patents provide an important incentive to spur software innovation. See software patent debate.

Ethics and rights for software users

Because software is a relatively new part of society, the idea of what rights software users should have is not well developed. Some, such as the free software community, believe that software users should be free to modify and redistribute the software they use. They argue that these rights are necessary so that each individual can control their own computer, and so that everyone can cooperate, if they choose, to work together as a community and control the direction in which software progresses. Others believe that software authors should have the power to say what rights the user will get.

The former philosophy is somewhat derived from the "hacker ethic" that was common in the 1960s and 1970s.