Living with spreadsheets
Dean Buckner
Financial Services Authority
JULY 2011
AGENDA
• Recap on last year’s talk
– Why we won’t get rid of spreadsheets
• But how can we live with them?
Why we won’t get rid of spreadsheets
• The tower of Babel
• Early views on machine translation (and why they failed)
• The computer Babel
The tower of Babel
• “And the whole earth was of one language, and of one speech.
• “And they said, Go to, let us build us a city and a tower, whose top may reach unto heaven; and let us make us a name, lest we be scattered abroad upon the face of the whole earth.
• “And the Lord said, Behold, the people is one, and they have all one language; and this they begin to do; and now nothing will be restrained from them, which they have imagined to do.
• “Go to, let us go down, and there confound their language, that they may not understand one another's speech.
• “Therefore is the name of it called Babel; because the Lord did there confound the language of all the earth.”
Machine translation
• Proposals for mechanical translators of languages pre-date the invention of the digital computer. The first recognisable application was a dictionary look-up system developed at Birkbeck College, London in 1948.
Code breaking
• Warren Weaver had been involved in code-breaking during the Second World War.
• A simple idea: given that humans of all nations are much the same (in spite of speaking a variety of languages), a document in one language could be viewed as having been written in code.
• Once this code was broken, it would be possible to output the document in another language.
• From this point of view, Chinese was English in code.
• “… one naturally wonders if the problem of translation could conceivably be treated as a problem in cryptography. When I look at an article in Russian, I say: "This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode."”
• http://www.mt-archive.info/Weaver-1949.pdf
It failed
• US funding of machine translation research had cost the US public $20 million by the mid-1960s. The Automatic Language Processing Advisory Committee (ALPAC) produced a report on the results of the funding and concluded that "there had been no machine translation of general scientific text, and none is in immediate prospect".
It failed again?
• There was renewed interest in the 1980s with the emergence of the ‘artificial intelligence’ idea.
• (At least if Google Translate is anything to go by)
– Seinen Lebensabend verbrachte in bad kleinen, in der Nähe seiner Geburtsstadt Wismar.
– His life was spent in small bathroom, near his hometown of Wismar.
– (Bad Kleinen is a town near Wismar, not a ‘small bathroom’)
Why it is difficult
• The teacher sent the boy to the headmaster because
– he wanted to see him
– he had been throwing stones
– he was fed up with his bad behaviour
The computerised Babel
• In the beginning was the mainframe
– Keep the ‘meaning’ of every symbol in just one place, and have everything else inside the system point to it directly (a ‘pointer’ is simply a mechanical means of moving from one address to another)
– Force users either to check their translation by means of a ‘compiler’ (this is for users called ‘programmers’)
– or have them enter information by means of a menu system that forces acceptable choice (for common or garden users).
– This worked reasonably well until the 1990s
The tower crumbles
• The 1980s and 1990s saw increasing specialisation of systems
– General ledger systems
– Payment systems
– Loan systems
– Claims systems etc
• They couldn’t talk to each other
The modern Babel
• A modern bank or insurance company contains dozens, perhaps hundreds of disparate systems.
• There is no ‘compiler’ to allow communication between them
• Spreadsheets are the solution to this communication problem
Deceptively difficult problems
• Deceptively difficult problem: a problem whose solution seems easy
– particularly by the application of ‘technology’
• But isn’t
• As we saw, communication between systems is incredibly difficult – not like ‘code-breaking’ at all
• But it seems easy
– I say: "This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode."
Apparently easy solutions (1)
• The Internet
– The Internet became embedded in popular consciousness in the 1990s and 2000s
– The problem of sending data from one place to another seemed to be solved
– But it didn’t solve the communication problem
– The Chinese send a letter to English speakers, who receive it OK. But no one understands it.
Apparently easy solutions (2)
• Data warehouses
– An apparently simple solution
– Send all the data from disparate source systems into one place (the ‘warehouse’)
– Then you have it all in one place
• But the problem remains – you have all the different languages in one room
– And no one understands each other
– Even worse, when the translation was done on spreadsheets, at least the users understood what was going on
– Now nobody does
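A minimal sketch of why putting data in one place does not translate it. The two source systems, their field names, and their units are hypothetical, invented only for illustration: both expose a field called `balance`, but one records pounds and the other pence.

```python
# Hypothetical records from two source systems (names and units are invented).
ledger_rows = [{"account": "A1", "balance": 1250}]    # recorded in pounds
loans_rows = [{"account": "A1", "balance": 125000}]   # recorded in pence

# "Warehousing": copy everything into one place, unchanged.
warehouse = ledger_rows + loans_rows

# A naive total mixes units, because nothing says what 'balance' means.
naive_total = sum(row["balance"] for row in warehouse)

# Translation needs an explicit statement of meaning for each source.
PENCE_PER_UNIT = {"ledger": 100, "loans": 1}
tagged = [("ledger", r) for r in ledger_rows] + [("loans", r) for r in loans_rows]
translated_total_pence = sum(PENCE_PER_UNIT[src] * r["balance"] for src, r in tagged)

print(naive_total)             # meaningless mixture of pounds and pence
print(translated_total_pence)  # total expressed consistently in pence
```

The warehouse step alone changes nothing about the data. The only line that does any translating is the `PENCE_PER_UNIT` table, and someone who understands both systems has to write it.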
Large spreadsheet systems
• Spreadsheet systems are becoming huge
– We saw a 600-spreadsheet system last year. That seemed big.
– Then we saw a 1,000-spreadsheet system. That was even bigger.
– Then we found a 9,000-spreadsheet system. That was awesome.
• What do we do?
Dangers of large systems
• Large spreadsheet systems are like mainframes
– But they don’t have a central compiler
– The embedded risks are huge
Examples
• Hard-coded references passing unchecked through many spreadsheets
– Date, source, and type of data are completely opaque
– Nature of transformations completely unclear
– Location of transformations unknown
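One way to start making such references visible is to list them. The sketch below assumes the cell formulas have already been extracted as text and that links to other workbooks use Excel's `[workbook]sheet!cell` form. The function name, the regular expression, and the sample formulas are illustrative assumptions, not a complete parser for spreadsheet formulas.

```python
import re

# Matches external links such as '[Rates2010.xlsx]Input'!$B$4 or [Claims.xls]Q4!C10.
# Illustrative pattern only; it does not cover every form a link can take.
EXTERNAL_REF = re.compile(r"\[([^\]]+)\]([^'!]+)'?!(\$?[A-Z]{1,3}\$?[0-9]+)")

def external_references(formula):
    """Return (workbook, sheet, cell) for each external link found in a formula."""
    return EXTERNAL_REF.findall(formula)

# Hypothetical formulas keyed by the cell that contains them.
formulas = {
    "Summary!D7": "='[Rates2010.xlsx]Input'!$B$4*1.05+[Claims.xls]Q4!C10",
    "Summary!D8": "=SUM(A1:A10)",
}

# Build an inventory: which cells depend on which other workbooks.
inventory = {cell: external_references(f) for cell, f in formulas.items()}
for cell, refs in inventory.items():
    print(cell, refs)
```

A listing like this shows where the data comes from. It says nothing about the date or type of the data, or about what the `*1.05` in the first formula is meant to do.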
Examples
• Senior management sees only immediate source sheets
– Under a dozen seems manageable
– But they don’t see the hundreds or thousands of sheets that are feeding the dozen.
– Tip of the iceberg
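The size of the iceberg can be estimated once the links between sheets are known. The sketch below assumes a mapping from each sheet to the sheets it reads from. The sheet names and the `all_upstream` helper are hypothetical, and a real system would first have to build this mapping from the workbooks themselves.

```python
from collections import deque

def all_upstream(feeds, visible):
    """Return every sheet that directly or indirectly feeds the visible sheets."""
    seen = set()
    queue = deque(visible)
    while queue:
        sheet = queue.popleft()
        for source in feeds.get(sheet, ()):
            # Record each feeder once; this also guards against circular links.
            if source not in seen and source not in visible:
                seen.add(source)
                queue.append(source)
    return seen

# Hypothetical links: each sheet maps to the sheets it reads from.
feeds = {
    "board_report": ["summary_a", "summary_b"],
    "summary_a": ["raw_1", "raw_2"],
    "summary_b": ["raw_2", "raw_3"],
    "raw_3": ["raw_4"],
}

visible = ["board_report"]            # what senior management sees
hidden = all_upstream(feeds, visible)  # everything underneath
print(len(visible), "visible,", len(hidden), "feeding it")
```

In this toy case one visible sheet rests on six others. The same breadth-first walk applies whether the hidden layer holds six sheets or nine thousand.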
Solving the problem
• [this page deliberately left blank]
Questions & Comments