20
Using Multiple Using Multiple Diacritics in Arabic Diacritics in Arabic Scripts for Scripts for Steganography Steganography By By Yousef Salem Elarian Yousef Salem Elarian Aleem Khalid Alvi Aleem Khalid Alvi 1 1

Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

Embed Size (px)

Citation preview

Page 1: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

Using Multiple Diacritics in Using Multiple Diacritics in Arabic Scripts for Arabic Scripts for

SteganographySteganography

ByBy

Yousef Salem ElarianYousef Salem ElarianAleem Khalid AlviAleem Khalid Alvi

11

Page 2: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

ContentsContents

IntroductionIntroduction

The classification tree of steganographyThe classification tree of steganography

Characteristics of the Arabic ScriptCharacteristics of the Arabic Script

Related Work Related Work

Approaches Approaches

Evaluation Evaluation

ConclusionConclusion

22

Page 3: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

IntroductionIntroduction

Stenography is the approach of hiding Stenography is the approach of hiding

the very existence of secret messages. the very existence of secret messages.

33

Page 4: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

The classification tree of The classification tree of steganography steganography

Steganography

Data Linguistic

Images TextAudio Semagrams

Visual Text

Covered ciphers

Open Ciphers

Jargon

Phonetics Misspellings

Figure 1: classification treeclassification tree 44

Page 5: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

Characteristics of the Arabic ScriptCharacteristics of the Arabic ScriptThe Arabic alphabet has Semitic origins derived from the The Arabic alphabet has Semitic origins derived from the Aramaic writing system .Aramaic writing system .

Arabic diacritic marks decorate consonant letters to Arabic diacritic marks decorate consonant letters to specify (short) vowels.specify (short) vowels.

Dots and connectivity are two inherent characteristics of Dots and connectivity are two inherent characteristics of Arabic characters. Arabic characters.

Arabic basic alphabet of 28 letters, 15 have from one to Arabic basic alphabet of 28 letters, 15 have from one to three points, four letters can have a Hamzah, and one, three points, four letters can have a Hamzah, and one, ALEF, can be adorned by the elongation stroke, the ALEF, can be adorned by the elongation stroke, the Maddah Maddah

55

Page 6: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

contd..contd..Little has been proposed for Arabic script steganography. Little has been proposed for Arabic script steganography.

Two inherent properties of Arabic writing: dots and Two inherent properties of Arabic writing: dots and connectability. connectability.

Some work needs new fonts. Other introduces using the Some work needs new fonts. Other introduces using the existent Kashidah, helped with dotted letters, instead.existent Kashidah, helped with dotted letters, instead.

Hamzated characters can also be used as along with Hamzated characters can also be used as along with dotted ones. dotted ones.

Figure 2. Arabic diacritic marks66

Page 7: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

The main idea: DiacriticsThe main idea: Diacritics

Notice the differences in levels of Notice the differences in levels of darkness (as to the right) or colors darkness (as to the right) or colors (as below) in the single and the (as below) in the single and the repeated diacritics.repeated diacritics.

77

Page 8: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

ScenariosScenarios

11stst scenario ( scenario (Secret = 110001 Secret = 110001 ))– Direct Direct (block size = inf.)(block size = inf.)

For each block bFor each block bii = n = ndd

Repeat the iRepeat the ithth diacritic n diacritic nd d times.times.

Repeat the 1Repeat the 1stst diacritic 49 extra times! diacritic 49 extra times!– Blocked Blocked (block size = 2)(block size = 2)

Repeat the 1Repeat the 1stst diacritic 3 extra times (3 = ( diacritic 3 extra times (3 = (1111))bb); the ); the

22ndnd one, 0 extra times (0 = ( one, 0 extra times (0 = (0000))bb); and); and

the 3the 3rdrd one, 1 extra time (1 = ( one, 1 extra time (1 = (0101))bb).).

88

Page 9: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

Scenarios Scenarios CntdCntd

RLERLE While(secret!=EOF & cover!=EOFWhile(secret!=EOF & cover!=EOF

b = b^b = b^

While(secret.b = b)While(secret.b = b)Type diacriticType diacritic

– For For Secret = 110001Secret = 110001

Repeat the 1Repeat the 1stst diacritic 2 times (1’s in ( diacritic 2 times (1’s in (1111))bb); );

the 2the 2ndnd one, 3 times (0’s in ( one, 3 times (0’s in (000000))bb); and); and

the 3the 3rdrd one, 1 time (for one, 1 time (for 11).).

99

Page 10: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

Related WorkRelated Work

Shirali-Shahreza: Shirali-Shahreza: – The position of dots is changed to render The position of dots is changed to render

robust, yet hidden, information. The robust, yet hidden, information. The method needs special fonts.method needs special fonts.

Gutub: Gutub: – Secret-bit hiding after dotted letters by Secret-bit hiding after dotted letters by

inserting Kashidah’s. A small drop in inserting Kashidah’s. A small drop in capacity occurs due to restriction of capacity occurs due to restriction of script on Kashidah and due to the extra-script on Kashidah and due to the extra-Kashidahs.Kashidahs.

Aabed et al.:Aabed et al.:– Redundancy in diacritics is used to hide Redundancy in diacritics is used to hide

information by omitting some diacritics.information by omitting some diacritics.

1010

Page 11: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

The textual approach The textual approach

The textual approach chooses a font that hides The textual approach chooses a font that hides extra (or maybe all) diacritic marks completely. extra (or maybe all) diacritic marks completely.

It uses any encoding scenario to hide secret bits It uses any encoding scenario to hide secret bits in an arbitrary number of repeated but invisible in an arbitrary number of repeated but invisible diacritics. diacritics.

A softcopy of the file is needed to retrieve the A softcopy of the file is needed to retrieve the hidden information (by special software or simply hidden information (by special software or simply by changing the font).by changing the font).

1111

Page 12: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

contd..contd..

ScenarioScenario ApproachApproach Extra Extra diacriticsdiacritics

11stst scenario scenario (stream)(stream)

Repeat the first diacritic 49 times. (49 = Repeat the first diacritic 49 times. (49 = ((110001110001))bb).).

49.49.

11stst scenario scenario

block size=2block size=2Repeat the first diacritic 3 times (3 = (Repeat the first diacritic 3 times (3 = (1111))bb), the ), the

second one 0 times (0 = (second one 0 times (0 = (0000))bb), and the third one ), and the third one

1 time (1 = (1 time (1 = (0101))bb).).

3 + 0 + 1 = 3 + 0 + 1 = 4.4.

22ndnd scenario scenario (RLEstart=1) (RLEstart=1)

Repeat the first diacritic 2 times (2 = number of Repeat the first diacritic 2 times (2 = number of 1’s in (1’s in (1111))bb), the second one 3 times (3 = number ), the second one 3 times (3 = number

of 0’s in (of 0’s in (000000))bb), and the third one 1 time (for ), and the third one 1 time (for 11).).

(2-1) + (3-1) (2-1) + (3-1) + (1-1) = 3.+ (1-1) = 3.

Table 1. The encodings of the binary value 110001 according to the two scenarios of the first approach

1212

Page 13: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

The image approach,The image approach,

The image approach selects one of the fonts The image approach selects one of the fonts that slightly darken multiple occurrences of that slightly darken multiple occurrences of diacritics.diacritics.

This approach needs to convert the document This approach needs to convert the document into image form to survive printing. into image form to survive printing.

This unfortunate fact reduces the possible This unfortunate fact reduces the possible number of repetition of a diacritic to the one that number of repetition of a diacritic to the one that can survive a printing and scanning process (up can survive a printing and scanning process (up to 4 as the last two columns of the first diacritic to 4 as the last two columns of the first diacritic

1313

Page 14: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

Approaches cont..Approaches cont..

Figure 4: The degree of brightness of the diacritic marks repeated 1, 2, 3, 4 and 5 times each

1414

Page 15: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

ApproachApproach CapacityCapacity robustnessrobustness securitysecurity

Text + Text + softcopysoftcopy

High, up to infinity in 1High, up to infinity in 1stst scenarioscenario

Not robust to Not robust to printingprinting

Invisible, but in Invisible, but in the codethe code

Image + Image + softcopysoftcopy

Very low, due to image Very low, due to image overheadoverhead

Robust to Robust to printingprinting

Slightly visible. Slightly visible. SizeableSizeable

Image + Image + hardcopyhardcopy

Moderate, 1Moderate, 1stst scenario, scenario, blocks of 2blocks of 2

Robust to Robust to printingprinting

Slightly visibleSlightly visible

Approaches cont..Approaches cont..

Table 2. Comparison between the two approaches in terms of capacity, robustness and security.

1515

Page 16: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

EvaluationEvaluation

Approach p q r (p+r+q)/2

Dots 0.2764 0.4313 0.0300 0.3689

Kashidah-Before 0.2757 0.4296 0.0298 0.3676

Kashidah-After 0.1880 0.2204 0.0028 0.2056

Diacritics 0.3633 0.3633 0 0.3633

Table 3. The ratios of usable characters for hiding both binary levels according to the three studied approaches

1616

Page 17: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

ComparisonComparisonDiacritics Diacritics vsvs. Kashidah. Kashidah

Pros:Pros:– While Kashidah suffers from restrictions on its While Kashidah suffers from restrictions on its

insertion, almost every character can bare a insertion, almost every character can bare a diacritic on it. This disadvantage of the diacritic on it. This disadvantage of the Kashidah method becomes severer for the Kashidah method becomes severer for the need of dotted character.need of dotted character.

Cons:Cons:– Diacritics never come alone; but with another Diacritics never come alone; but with another

character character a stable overhead of 2 bytes per a stable overhead of 2 bytes per secret-baring position. secret-baring position.

1717

Page 18: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

Comparison Comparison Cntd.Cntd.The AdvantageThe Advantage

The advantage of our work is that each The advantage of our work is that each usable character can bare multiple secret usable character can bare multiple secret bits with 1 character as overhead. bits with 1 character as overhead. Although this same overhead can be Although this same overhead can be claimed in the Kashidah method, it can’t claimed in the Kashidah method, it can’t really be applied for Kashidah becomes really be applied for Kashidah becomes too long and noticeable.too long and noticeable.

1818

Page 19: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

ConclusionConclusion

The text and image approaches are discussed which are The text and image approaches are discussed which are used to hide information in Arabic diacritics for used to hide information in Arabic diacritics for steganographic.steganographic.

This paper presents a variety of scenarios that may achieve This paper presents a variety of scenarios that may achieve up to arbitrary capacities. Sometimes tradeoffs between up to arbitrary capacities. Sometimes tradeoffs between capacity, security and robustness imply that a particular capacity, security and robustness imply that a particular scenario should be chosen. scenario should be chosen.

The overhead of using diacritics was, experimentally, shown The overhead of using diacritics was, experimentally, shown very comparable to related works. very comparable to related works.

The advantage of the method is that such overhead The advantage of the method is that such overhead decreases if more than one diacritical secret bit is used at decreases if more than one diacritical secret bit is used at once.once.

1919

Page 20: Using Multiple Diacritics in Arabic Scripts for Steganography By Yousef Salem Elarian Aleem Khalid Alvi 1

The EndThe End

2020