21
LING 408/508: Programming for Linguists Lecture 1 August 24 th

LING 408/508: Programming for Linguists Lecture 1 August 24 th

Embed Size (px)

Citation preview

Page 1: LING 408/508: Programming for Linguists Lecture 1 August 24 th

LING 408/508: Programming for Linguists

Lecture 1August 24th

Page 2: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Administrivia

1. Syllabus2. Introduction3. Quickie Homework 1 – due Wednesday night by midnight in my e-mailbox

• I will assume everyone has a laptop…

Page 3: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Syllabus• Details

– Instructor: Sandiway FongDepts. Linguistics and Computer Science

– Email: [email protected] (homeworks go here)– Office: Douglass 311– Hours: by appt. or walk-in, after class– Meet: Shantz 338, Mondays/Wednesdays 3:15-4:30pm– No class on:

• Monday September 7th (Labor Day)• Wednesday November 11th (Veterans Day)• Week after September 11th (out of town), plus Monday 21st • Monday October 12th

– My academic webpage: • dingo.sbs.arizona.edu/~sandiway/ or just Google me

Page 4: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Syllabus

• dingo.sbs.arizona.edu/~sandiway/– Lecture slides will be available online (.pptx and pdf formats) just

before class • look for updates/corrections afterwards

– Lectures will be recorded using the panopto system (video, laptop screen, synchronized slides, keyword search)

Page 5: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Syllabus

• Pre-requisites:– none!

• Course Objectives:– Introduction to programming

• data types, different programming styles, thinking algorithmically …

– and fundamental computer concepts• computer organization: underlying hardware, and operating systems

(processes, shell, filesystem etc.)

– Operating System: • Ubuntu (Linux)

– Programming Languages: • selected examples: Bash shell, Python, Javascript, Perl, Tcl/Tk,

HTML/CSS, MySQL, cgi-bin etc.

Page 6: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Syllabus

• Expected learning outcomes:– familiarity with the underlying technology,

terminology and programming concepts– acquire the ability to think algorithmically– acquire the ability to write short programs– build a graphical user interface– build a web application (with a relational database)– be equipped to take classes in the Human Language

Technology (HLT) program and related classes

Page 7: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Syllabus• Grading

– 408: homeworks (80%), term programming project (20%)– 508: homeworks (65%), term programming project (35%)– Note: requirement – you must submit all homeworks

• Homework submissions:– email only to me– See homework 1 for the required format …– homeworks will be introduced in class– due date:

• almost one week (typically)• example: homework presented in class on Monday (resp. Wednesday), due Sunday

(resp. Tuesday) night by 11:59pm in my mailbox

– all homeworks will be reviewed in class

Page 8: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Syllabus• Homeworks

– you may discuss questions with other students– however, you must program/write it up yourself (in your own

words/code)– cite (web) references and your classmates (in the case of discussion)– Student Code of Academic Integrity: plagiarism etc.

• http://deanofstudents.arizona.edu/codeofacademicintegrity

• Revisions to the syllabus– “the information contained in the course syllabus, other than the

grade and absence policies, may be subject to change with reasonable advance notice, as deemed appropriate by the instructor.”

Page 9: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Syllabus

• Absences– tell me ahead of time so we can make special

arrangements, e.g. homeworks– I expect you to attend lectures (though attendance

will not be taken)

Page 10: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Introduction

• Computers– Memory• Programs and data

– CPU• Interprets machine instructions

– I/O• keyboard, mouse, touchpad, screen, touch sensitive

screen, printer, usb port, etc.• bluetooth, ethernet, wifi, cellular …

Page 11: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Introduction

• Memory– CPU registers– L1/L2 cache– RAM– SSD/hard drive– blu ray/dvd/cd drive

fast

slow

invisible to programmers

open fileread/write

Page 12: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Introduction

• Memory Representation– binary: zeros and ones (1 bit)– organized into bytes (8 bits)

• memory is byte-addressable

– word (32 bits)• e.g. integer• (64 bits: floating point number)

– big-endian/little-endian• most significant byte first or least

significant byte• communication …

addressableMemory

(RAM)

0

FFFFFFFF

your Intel and ARM CPUs

arraya[23]

Page 13: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Introduction

• A typical notebook computer– Example: a 2013 Macbook Air– CPU: Core i5-4250U

• 1.3 billion transistors• built-in GPU• TDP: 15W (1.3 GHz)• Dual core (Turbo: 2.6 GHz)• Hyper-Threaded (4 logical CPUs, 2 physical)• 64 bit • 64 KB (32 KB Instruction + 32 KB Data) L1 cache • 256 KB L2 cache per core• 12MB L3 cache shared• 16GB max RAM

Increased address space

and 64-bit registers

Page 14: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Introduction

anandtech.comA 4 core machine: 8 virtual

Page 15: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Introduction• Machine Language

– A CPU understands only one language: machine language• all other languages must be translated into machine language

– Primitive instructions include:• MOV• PUSH• POP• ADD / SUB• INC / DEC• IMUL / IDIV• AND / OR / XOR / NOT• NEG• SHL / SHR• JMP• CMP• JE / JNE / JZ / JG / JGE / JL / JLE• CALL / RET

Assembly Language: (this notation)by definition, nothing built on it is more powerful

http://www.cs.virginia.edu/~evans/cs216/guides/x86.html

Page 16: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Introduction

• Not all the machine instructions are conceptually necessary– many provided for speed/efficiency

• Theoretical Computer Science– All mechanical computation can be carried out using a TURING

MACHINE– Finite state table + (infinite) tape– Tape instructions:

• at the tape head: Erase, Write, Move (Left/Right/NoMove)

– Finite state table:• Current state x Tape symbol --> new state x New Tape symbol x Move

Page 17: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Introduction• Storage:

– based on digital logic– binary (base 2) – everything is a power of 2– Byte: 8 bits

• 01011011 • = 26+24+23+21+20 • = 64 + 16 + 8 + 2 + 1 • = 91 (in decimal)

– Hexadecimal (base 16)• 0-9,A,B,C,D,E,F (need 4 bits)• 5B (= 1 byte)• = 5*161 + 11 • = 80 + 11• = 91

27 26 25 24 23 22 21 2027 26 25 24 23 22 21 20

0 1 0 1 1 0 1 1

23 22 21 20 23 22 21 20

5 B

23 22 21 20 23 22 21 20

0 1 0 1

23 22 21 20 23 22 21 20

0 1 0 1 1 0 1 1

161 160161 160

5 B

Page 18: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Introduction: data types

• Integers– In one byte (= 8 bits), what’s the largest and smallest

number, we can represent?– 00000000 = 0– 01111111 = 127– 10000000 = -128– 11111111 = -1

0 … 127 -128 -127 -1

00000000 11111111

2’s complement representation

Page 19: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Introduction: data types• Integers

– In one byte, what’s the largest and smallest number, we can represent?

– Answer: -128 .. 0 .. 127 using the 2’s complement representation– Why? super-convenient for arithmetic operations– “to convert a positive integer X to its negative counterpart, flip all the

bits, and add 1”– Example: – 00001010 = 23 + 21 = 10 (decimal)– 11110101 + 1 = 11110110 = -10 (decimal)– 11110110 flip + 1 = 00001001 + 1 = 00001010Addition:

-10 + 10= 11110110 + 00001010 = 0 (ignore overflow)

Page 20: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Introduction: data types

• Typically 32 bits (4 bytes) are used to store an integer– range: -2,147,483,648 (2(31-1) -1) to 2,147,483,647 (2(32-1) -1)

• what if you want to store even larger numbers?– Binary Coded Decimal (BCD)– code each decimal digit separately, use a string

(sequence) of decimal digits …

231 230 229 228 227 226 225 224 … … 27 26 25 24 23 22 21 20

byte 3 byte 2 byte 1 byte 0 C:int

Page 21: LING 408/508: Programming for Linguists Lecture 1 August 24 th

Introduction: data types

• what if you want to store even larger numbers?– Binary Coded Decimal (BCD)– 1 byte can code two digits (0-9 requires 4 bits)– 1 nibble (4 bits) codes the sign (+/-), e.g. hex C/D

23 22 21 20

0 0 0 0

23 22 21 20

0 0 0 1

23 22 21 20

1 0 0 1

0

1

9

2 0 1 4

2 bytes (= 4 nibbles)

+ 2 0 1 4

2.5 bytes (= 5 nibbles)

23 22 21 20

1 1 0 0 C23 22 21 20

1 1 0 1 Dcredit (+) debit (-)