Information Management

DIG 3563

Spring 2012

Lecture 15: Metadata Removal

J. Michael Moshell

University of Central Florida

Original image* by Moshell et al.

Excerpt from DA Text

Metadata Management

(Including removal)

Introductory movie: Why we need to do this:

http://www.youtube.com/watch?v=-sVUMBuJ0YY&feature=player_embedded

nathansphere.com

Metadata Management

(Including removal)

So why not just download a freebie software removal tool?

FREEBIES ARE DANGEROUS.

Many carry Trojans into your computer.

Metadata Management

Where to get trustworthy software?

I use www.cnet.com

They virus-check their stuff quite carefully

(But you STILL need your own anti-virus software)

I decide to try out "File Buddy" on the Mac.

It has a **** rating ...

I work with it for 20 minutes ... it's not doing much.

I go to the URL to look for documentation

www.skytag.com: 404

They're GONE. Nobody to answer questions ... => orphan software.

MORAL: Check for 'liveness' first, on free software.

Metadata Management

Microsoft Products

Every version is different (@#$%!!)

MS Word Mac 2011: Word/Preferences/Security:

Metadata Management

Microsoft Word: File/Properties

Fill in existing tags; or create new tags

Metadata Management

Microsoft Products

Every version is different (@#$%!!)

MS PowerPoint Mac 2011: PowerPoint/Preferences/Advanced:

Metadata Management

In File/Properties, I find:

Metadata Management

Microsoft Operating Systems

Windows XP's meta-information access tools could damage the metadata in JPG and other files.

Windows Vista and 7 are SUPPOSED to be safe ... but...

Some Metadata Formats

EXIF is the digital camera metadata tag standard

- Based on TIFF metadata

- Typical data:

Some Metadata Formats

IPTC is the International Press Telecommunications Council

- Stores who/what/when/where type data

- Typical data:

Some Metadata Formats

IPTC is the International Press Telecommunications Council

- What's an IPTC code? http://www.iptc.org/cms/site/index.html?channel=CH0088

Examples:

subj:01000000 – arts, culture and entertainment

subj:01005000 – cinema

subj:01005001 – film festival
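The three example codes suggest a hierarchy: 2 digits for the broad subject, 3 for the subject matter, 3 for the detail. Here is a small Python sketch of that reading. The function names (split_subject_code, parent_code) are made up for illustration, and the 2+3+3 split is an assumption taken from the examples above, so check the IPTC documentation before relying on it.

```python
def split_subject_code(qcode):
    """Split a code such as 'subj:01005001' into its scheme and three levels."""
    scheme, _, digits = qcode.partition(":")
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError("expected 8 digits after the colon: " + qcode)
    return {
        "scheme": scheme,         # e.g. 'subj'
        "subject": digits[:2],    # e.g. '01' = arts, culture and entertainment
        "matter": digits[2:5],    # e.g. '005' = cinema
        "detail": digits[5:],     # e.g. '001' = film festival
    }


def parent_code(qcode):
    """Return the next broader code, or None for a top-level subject."""
    parts = split_subject_code(qcode)
    if parts["detail"] != "000":
        return "{scheme}:{subject}{matter}000".format(**parts)
    if parts["matter"] != "000":
        return "{scheme}:{subject}000000".format(**parts)
    return None


festival_parent = parent_code("subj:01005001")   # the 'cinema' code
cinema_parent = parent_code("subj:01005000")     # the 'arts ...' code
top_parent = parent_code("subj:01000000")        # nothing broader
print(festival_parent, cinema_parent, top_parent)
```

Walking upward this way gives film festival, then cinema, then arts, culture and entertainment.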

Some Metadata Formats

IPTC is the International Press Telecommunications Council

- Stores who/what/when/where type data

- More data:

Some Metadata Formats

IPTC is the International Press Telecommunications Council

Some cool XML behind the scenes in the IPTC Codes:

<concept id="medtop01000000">

<conceptId created="2009-10-22T02:00:00+00:00" qcode="medtop:01000000"/>

<type qcode="cpnat:abstract"/>

<name xml:lang="en-GB">arts, culture and entertainment</name>

<definition xml:lang="en-GB">Matters pertaining to the advancement and

refinement of the human mind, of interests, skills, tastes and emotions

</definition>

<related rel="skos:exactMatch" qcode="subj:01000000"/>

</concept>

XML features in use: attribute-value pairs

XML features in use: self-terminating element with two attributes

XML features in use: namespace
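To see these XML features from a program's point of view, here is a minimal Python sketch that parses the concept record with the standard library's xml.etree.ElementTree. The variable names are my own, and the record is pasted in as a string rather than fetched from IPTC. The xml: prefix is the one namespace prefix that never needs declaring; the qcode prefixes (medtop:, cpnat:, skos:) are just text inside attribute values here.

```python
import xml.etree.ElementTree as ET

# The concept record from the slide, reproduced as a string.
CONCEPT_XML = """
<concept id="medtop01000000">
  <conceptId created="2009-10-22T02:00:00+00:00" qcode="medtop:01000000"/>
  <type qcode="cpnat:abstract"/>
  <name xml:lang="en-GB">arts, culture and entertainment</name>
  <definition xml:lang="en-GB">Matters pertaining to the advancement and
  refinement of the human mind, of interests, skills, tastes and emotions
  </definition>
  <related rel="skos:exactMatch" qcode="subj:01000000"/>
</concept>
"""

# ElementTree spells a namespaced attribute as {namespace-URI}localname.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

concept = ET.fromstring(CONCEPT_XML.strip())

# Attribute-value pair on the root element.
concept_id = concept.get("id")

# Self-terminating element with two attributes: no text, no children.
related = concept.find("related")
related_attrs = dict(related.attrib)
related_is_empty = related.text is None and len(related) == 0

# Namespaced attribute (xml:lang) on an element that does carry text.
name = concept.find("name")
name_lang = name.get(XML_NS + "lang")
name_text = name.text

print(concept_id, related_attrs, name_lang, name_text)
```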

Metadata Management

A handy Adobe CSx Tool: ADOBE BRIDGE

(I'd been wondering what it was for!)

File organizer

Metadata editor (kinda...)

(Demonstrate it – but unfortunately I can't, on this computer)

IPTC restricted field: Date requires a format like 11/13/2011

Bridge's Limitations

However: Bridge can only edit the IPTC and some other metadata, and there is much more out there that needs editing!

From Bridge, we observe

A medical metadata standard: DICOM

Digital Imaging and Communications in Medicine

Metadata Management

The 800 pound idealistic volunteer:

Phil Harvey, who created ... (drum roll) ...

"The Pan Galactic Gargle Blaster of EXIF tools" (Open Photography Forums)

EXIFtool:

http://owl.phy.queensu.ca/~phil/exiftool/

Phil's rant on metadata:

http://owl.phy.queensu.ca/~phil/exiftool/commentary.html

EXIFtool

EXIFtool is a Command Line Tool

Windows Console: Inherited DOS commands

(A granddaughter of Unix's command set)

Mac Console ("Terminal"):

Uses UNIX/Linux Commands

So you need to know a minimal set of these commands!

Unix Commands

Practice these on sulley via ssh

On a Mac: open Terminal, then

>ssh [email protected]

"ssh" means "secure shell". Replace 'youraccount'

with your Sulley login, which is usually your NID.

You will be asked for your password.

NOTE: This only works (a) on campus, or (b) via VPN

Unix Commands

I will demonstrate via Mac's Terminal.

Windows folk: Find out the same for your system

- but you must know the UNIX stuff, for exams

>ls : "list", shows the contents of the current directory

>ls -la : "list, long form, all files", shows details (including hidden files)

>cd mywork : moves viewpoint into the 'mywork' dir.

If there is only one name that starts with 'my', you can

also type cd my*

(It's a very convenient short-cut.)

mywork>cd .. : moves "rootward" (up one level) in the hierarchy.

>cp filea fileb : copies filea into fileb

>mv filea fileb : renames filea to fileb

>mv filea mywork/. : moves filea into the mywork dir.

Unix Commands

>pwd

/Users/michaelmoshell/Documents/UCF/au11UCF/DIG3563

-- A good way to ask "where am I" in the file hierarchy. (pwd = "print working directory")

How to delete a file:

>rm fileb :deletes fileb, usually without warning.
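If you ever script these file operations instead of typing them, the same steps look roughly like this in Python. This is only a sketch in a throwaway temporary directory; the name filec is mine (the slide renames onto fileb), chosen so that each step stays visible.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)                      # stands in for the current directory
    (work / "mywork").mkdir()
    filea = work / "filea"
    filea.write_text("hello\n")

    shutil.copy(filea, work / "fileb")    # cp filea fileb
    filea.rename(work / "filec")          # mv filea filec   (rename)
    shutil.move(str(work / "filec"), str(work / "mywork"))  # mv filec mywork/.
    (work / "fileb").unlink()             # rm fileb  (no warning here either)

    # What is left, relative to the starting directory.
    remaining = sorted(p.relative_to(work).as_posix() for p in work.rglob("*"))

print(remaining)
```

Like rm, unlink() does not ask for confirmation.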

Unix Commands

Changing the Protection of Files in Unix:

>ls -la map1.html

-rw-r--rw-@ 1 michaelmoshell admin 1314 Nov 6 11:00 map1.html

110 100 110 <privileges for "me, my group, strangers"

6 4 6 <octal notation (3 bits) for each privilege group.

>chmod 666 map1.html

>ls -la map1.html

-rw-rw-rw-@ 1 michaelmoshell admin 1314 Nov 6 11:00 map1.html
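The letters-to-bits-to-octal step can be checked mechanically. A minimal Python sketch follows; the helper name mode_to_octal is made up for this example. It takes the nine permission letters (without the leading file-type dash or the trailing @), turns each group of three into bits, and reads each group as one octal digit.

```python
def mode_to_octal(perm):
    """Convert nine permission characters such as 'rw-r--rw-' to octal digits."""
    if len(perm) != 9:
        raise ValueError("expected exactly 9 permission characters")
    digits = []
    for start in range(0, 9, 3):
        triple = perm[start:start + 3]            # me, my group, strangers
        bits = "".join("0" if ch == "-" else "1" for ch in triple)
        digits.append(str(int(bits, 2)))          # e.g. '110' -> 6
    return "".join(digits)


before = mode_to_octal("rw-r--rw-")   # the listing before chmod
after = mode_to_octal("rw-rw-rw-")    # the listing after chmod 666
print(before, after)
```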

Unix Commands

Important note: When someone accesses your

website (.html, .php, etc.) they are strangers

and your script's privileges are set to that value.

EXERCISE: With the privileges set as shown below, would a PHP script be allowed to overwrite (i.e. to replace) the file map1.html?

-rw-r--r--@ 1 michaelmoshell admin 1314 Nov 6 11:00 map1.html

Answer: NO, because r-- means read, but not write and not execute.

EXERCISE: What chmod command would change the privileges of map1.html so that it could be overwritten?

Answer: chmod 666 map1.html

(or anything with 3 digits where the last one has the write bit set, e.g. 6)

why? rw-rw-rw- corresponds to 110 110 110
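The exercise boils down to one bit test: is the write bit set in the "strangers" group? Here is a hedged Python sketch using the standard stat module. The function name others_can_write is my own, and it only models the permission bits, not everything a real web server checks.

```python
import stat


def others_can_write(mode):
    """True if the 'strangers' (other) group has the write bit set."""
    return bool(mode & stat.S_IWOTH)


# stat.filemode() renders the bits the way ls -la prints them.
listing_before = stat.filemode(stat.S_IFREG | 0o644)
listing_after = stat.filemode(stat.S_IFREG | 0o666)

can_write_before = others_can_write(0o644)   # -rw-r--r--
can_write_after = others_can_write(0o666)    # -rw-rw-rw-

# Every last digit whose middle (write) bit is 1 would do.
writable_last_digits = [d for d in range(8) if others_can_write(d)]

print(listing_before, can_write_before)
print(listing_after, can_write_after)
print(writable_last_digits)
```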

EXIFtool

EXIF is the digital camera metadata tag standard

EXIFtool works with metadata in MANY file formats.

See Wikipedia article on EXIF http://en.wikipedia.org/wiki/Exchangeable_image_file_format

and then Wikipedia article on EXIFtool

http://en.wikipedia.org/wiki/ExifTool

and then read the first article in

http://www.sno.phy.queensu.ca/~phil/exiftool/faq.html

EXIFtool

(why do I trust this download?)

- download directly from a reputable university site

- well regarded by several external reviews

- site is ** extremely ** alive/active

** but it could still be risky ... if you have a PC,

make SURE your anti-virus is up to date!

Experimenting with EXIFtool

Key concepts:

tag name: has no spaces, e.g. ImageWidth

tag ID: a machine representation; e.g. 0x0100

description: a brief English (French etc.) explanation of a tag

Very useful table:

http://www.sno.phy.queensu.ca/~phil/exiftool/TagNamesindex.html
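To keep the three concepts apart, here is a tiny Python sketch that stores a few tags as ID, name and description. The table is a small hand-typed sample for illustration, not something read from EXIFtool; consult the tag-name tables linked above for the authoritative list.

```python
# tag ID -> (tag name, description); a hand-typed sample, not the full table
SAMPLE_TAGS = {
    0x0100: ("ImageWidth", "Image Width"),
    0x0101: ("ImageHeight", "Image Height"),
    0x010F: ("Make", "Make"),
    0x0110: ("Model", "Camera Model Name"),
}


def describe(tag_id):
    """Format one tag as '0xNNNN TagName (Description)'."""
    name, description = SAMPLE_TAGS[tag_id]
    return "0x{:04x} {} ({})".format(tag_id, name, description)


first = describe(0x0100)
names_have_no_spaces = all(" " not in name for name, _ in SAMPLE_TAGS.values())

for tag_id in sorted(SAMPLE_TAGS):
    print(describe(tag_id))
```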

EXIFtool

First example of use:

>cd

>cd Documents/UCF/au12UCF/teaching/DIG3563/EXIFexper

>exiftool exampleWithGPS.jpg

and much more....

First example of use, continued:

>exiftool exampleWithGPS.jpg

These are descriptions

including GPS data

To see tag names, use this command:

>exiftool -s exampleWithGPS.jpg

Descr Tag Value

including GPS data

Now, how can we find out how to REMOVE data?

Most Unix tools are self-explaining.

>exiftool

... but it runs to MANY MANY PAGES!

How to get out? It's an arcane Unix command:

ZZ (a plain q also works in most pagers)

How did I know that?

Experience!

sikh-history.com

arcane Unix command:

<< Any time you hear a word from me, if you don't know what it means, LOOK IT UP.

<< All such words are fair game in the brutal sport of examination-writing.

Arcane (adjective): Understood by few; mysterious or secret.

Here's a couple more arcane Unix commands:

ctrl-C (interrupt) and ctrl-D (end of input): These can (sometimes) STOP a process that is taking too long (or seems stuck).

I put the entire exiftool document on the course website as a .doc file. (EXIF Documentation)

Searching therein, I found this:

The special "All" tag may be used in this syntax only if a VALUE is NOT given. This causes all meta information to be deleted (or all information in a group if "-GROUP:All=" is used). Note that not all groups are deletable. Use the -listd option for a complete list of deletable groups.

So, I try >exiftool -listd

Try this:

>exiftool -GPS:ALL= gpsvictim.jpg

Now the directory contains

gpsvictim.jpg

gpsvictim.jpg_original (the untouched backup)

So I have a looksee:

>exiftool gpsvictim.jpg

AND all the GPS metadata is gone!
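If you have many images to clean, the same commands can be driven from a script. The following is a rough Python sketch under stated assumptions: exiftool is installed and on your PATH, and the helper names (list_tags_args, strip_gps_args, run_exiftool) are made up for this example. The only exiftool options used are the ones shown in the slides, -s and -GPS:all=.

```python
import shutil
import subprocess


def list_tags_args(image_path):
    """Arguments for listing tag names and values (the -s form)."""
    return ["exiftool", "-s", image_path]


def strip_gps_args(image_path):
    """Arguments for deleting every tag in the GPS group."""
    return ["exiftool", "-GPS:all=", image_path]


def run_exiftool(args):
    """Run the command if exiftool is on the PATH; return None otherwise."""
    if shutil.which(args[0]) is None:
        return None
    return subprocess.run(args, capture_output=True, text=True, check=False)


if __name__ == "__main__":
    # Assumes a file named gpsvictim.jpg in the current directory.
    result = run_exiftool(strip_gps_args("gpsvictim.jpg"))
    if result is None:
        print("exiftool not found; install it first")
    else:
        print(result.stdout or result.stderr)
```

Passing the arguments as a list, rather than one long string, avoids trouble with file names that contain spaces.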

We go back and have a look-see with Adobe Bridge.

Whereas, the original file shows:

ASSIGNMENT: (25 'prof points' of credit, if you have no "extra credit" for Website Review.)

Install EXIFtool on your laptop*.

Read the documentation

Figure out how to modify ONE metadata tag

Get an image file

Modify one metadata tag

Be prepared to show me, next Tuesday, what you did.

* (If you have NO laptop, speak to me for an alternative)

Bottom line: What's to know about metadata removal?

- there are simple tools available. BE CAREFUL where you get them!

a) reliable, virus-free source

b) make sure it's a live source, for support.

- the powerful tool, exifTool, requires some learning but can do many things quickly.

Bottom line: What's to know about metadata?

- What is EXIF?

- How to use exifTool to examine and remove metadata

- What is IPTC?

- What is Adobe Bridge? What can/cannot it do?

- What is DICOM? Where might you find it?

- Unix command line, basic commands including cd, ls, mv, cp, chmod, pwd, and ZZ, ctrl-C, ctrl-D

- Octal numbers for use with chmod