Thoughts on the Emergence of Language Runzhe Yang Princeton NLP June 06, 2019



– Foucault and Chomsky Debate on Human Nature

“… we notice that varying individuals with very varied experience in a particular language nevertheless arrive at systems which are very much congruent to one another. The systems that two speakers of English arrive at on the basis of their very different experiences are congruent in the sense that, over an overwhelming range, what one of them says, the other can understand…

… we notice that in a wide range of languages, in fact all that have been studied seriously, there are remarkable limitations on the kind of systems that emerge from the very different kinds of experiences to which people are exposed…

There is only one possible explanation. … A person who knows a language has acquired that knowledge because he approached the learning experience with a very explicit and detailed schematism that tells him what kind of language it is that he is being exposed to… the child must begin with the knowledge, certainly not with the knowledge that he’s hearing English or Dutch or French or something else, but he does start with the knowledge that he’s hearing a human language of a very narrow and explicit type, that permits a very small range of variation…”

Noam Chomsky

Chomsky: the knowledge that he’s hearing a human language of a very narrow and explicit type?

• Around 6000 languages are spoken around the world.

• Given fractured and highly sparse input, how does a child come to learn the precise syntax of one of these many languages?

• One scenario for learning is known as the Principles and Parameters (P&P) theory [1].

[1] N. Chomsky, Lectures on Government and Binding: The Pisa Lectures (Walter de Gruyter, Berlin, 1993), Vol. 9.

Chomsky: Principles and Parameters (P&P)

• The child is biologically endowed with a general class of grammars, the “principles,” and by exposure to one particular language, fixes its syntax by setting some number of parameters, assumed to be binary.

• For example, the head-directionality parameter controls whether verbs come before objects, as in English, or after them, as in Japanese. A vast effort has been devoted to mapping out the possible parameters of human languages.

• The richness of the discovered structure has been used as criticism of the approach: if the child needs to set many parameters, then do these all need to be innate? This would be a heavy evolutionary burden, and a challenge to efficient learning.
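As a rough illustration of the parameter-setting picture above, the sketch below treats a grammar as a vector of binary parameters. The function names and the toy words are hypothetical, and only head-directionality is modeled. With n binary parameters there are 2**n candidate grammars, which is the counting behind the evolutionary-burden criticism.

```python
from itertools import product
from typing import List


def order_verb_phrase(verb: str, obj: str, head_initial: bool) -> List[str]:
    """Order a verb and its object according to a head-directionality setting."""
    # head_initial=True: verb precedes object (English-like).
    # head_initial=False: verb follows object (Japanese-like).
    return [verb, obj] if head_initial else [obj, verb]


def count_grammars(num_parameters: int) -> int:
    """Count the distinct settings of `num_parameters` binary parameters."""
    return sum(1 for _ in product([False, True], repeat=num_parameters))


english_like = order_verb_phrase("eat", "sushi", head_initial=True)
japanese_like = order_verb_phrase("eat", "sushi", head_initial=False)
print(english_like)
print(japanese_like)
print(count_grammars(10))  # 2**10 settings
```

The enumeration is deliberately naive; it only makes the exponential growth of the hypothesis space concrete.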

Part I: Emergence of Syntax

• If the child needs to set many parameters, do these all need to be innate?

No. The richness of the language structure is a natural consequence.

Generative Grammars

• An alphabet $X = X_H \cup X_O$:
    nonterminal (hidden) symbols $X_H$, with $|X_H| = N$
    terminal (observable) symbols $X_O$, with $|X_O| = T$

• A set of rules $R$: $a_1 a_2 \ldots a_n \to b_1 b_2 \ldots b_m$, with $a_i \in X_H$ and $b_i \in X = X_H \cup X_O$

• Rules with $n > 1$ depend on context; rules with $n = 1$ are context-free.
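The definition above (left-hand symbols drawn from the nonterminals, right-hand symbols from the full alphabet) can be checked mechanically. The following is a minimal sketch; the symbol names and the tuple encoding of rules are assumptions made for illustration, not part of the slides.

```python
def is_valid_rule(lhs, rhs, hidden, observable):
    """True if every a_i is in X_H and every b_i is in X = X_H ∪ X_O."""
    alphabet = hidden | observable
    return (
        len(lhs) >= 1
        and all(a in hidden for a in lhs)
        and all(b in alphabet for b in rhs)
    )


def is_context_free(lhs):
    """A rule is context-free when its left-hand side is a single symbol."""
    return len(lhs) == 1


hidden = {"S", "NP", "VP"}      # X_H, so N = 3
observable = {"dogs", "bark"}   # X_O, so T = 2

# Hypothetical rules, each encoded as (left-hand side, right-hand side).
rules = [
    (("S",), ("NP", "VP")),         # valid, context-free
    (("NP",), ("dogs",)),           # valid, context-free
    (("NP", "VP"), ("VP", "NP")),   # valid, but two symbols on the left
    (("dogs",), ("S",)),            # invalid: a terminal on the left-hand side
]

report = [
    (is_valid_rule(lhs, rhs, hidden, observable), is_context_free(lhs))
    for lhs, rhs in rules
]
print(report)
```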

Context-Free Grammars (CFG)

• An alphabet $X = X_H \cup X_O$:
    nonterminal (hidden) symbols $X_H$, with $|X_H| = N$
    terminal (observable) symbols $X_O$, with $|X_O| = T$

• A set of rules $R$: $a_1 \to b_1 b_2 \ldots b_m$, with $a_1 \in X_H$ and $b_i \in X = X_H \cup X_O$

FIG. 1. Illustrative derivation trees for (a) a simple English sentence and (b) an RNA secondary structure (after [6]). The latter is a derivation of the sequence “gacuaagcugaguc” and shows its folded structure. Terminal symbols are encircled.
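To make the notion of a derivation tree concrete, here is a minimal sketch that expands a start symbol with context-free rules and reads the sentence off the leaves. The toy grammar is hypothetical and is not the one drawn in the figure; rules are chosen uniformly at random, which is an assumption of this sketch.

```python
import random

# Hypothetical toy grammar: keys are nonterminals (hidden symbols),
# values list the possible right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "N": [["dog"], ["cat"]],
    "V": [["sees"], ["sleeps"]],
}


def derive(symbol, rng):
    """Expand `symbol` into a tree: (symbol, children), or a bare terminal string."""
    if symbol not in GRAMMAR:  # terminal (observable) symbol: a leaf
        return symbol
    rhs = rng.choice(GRAMMAR[symbol])
    return (symbol, [derive(s, rng) for s in rhs])


def leaves(tree):
    """Read the derived sentence off the leaves, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1] for word in leaves(child)]


rng = random.Random(0)
tree = derive("S", rng)
sentence = leaves(tree)
print(" ".join(sentence))
```

Because this grammar has no recursive rules, every derivation terminates; a grammar with recursion would need a depth limit or rule probabilities that make termination likely.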

CFGs as Binary Trees

CFG in Chomsky normal form

• An alphabet $X$

• A set of rules $R$: $a \to bc$ or $a \to A$, with $a, b, c \in X_H$ and $A \in X_O$

• Tree topology $T$

• Internal factors $\Omega_T$, with $|\Omega_T| = \ell_T - 1$, where $\ell_T$ is the number of leaves

CFG with Chomsky normal form

A set of rules $R$: $a \to bc$, $a \to A$

An alphabet $X$: $a, b, c \in X_H$, $A \in X_O$

$T$: Topology Tree

$\Omega_T$: Internal factors, $|\Omega_T| = \ell_T - 1$

$\partial\Omega_T$: Boundary factors, $|\partial\Omega_T| = \ell_T$

$\{\sigma_i\}$: Nonterminal values, $|\{\sigma_i\}| = 2\ell_T - 1$

$\{o_i\}$: Terminal values, $|\{o_i\}| = \ell_T$
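
The counts on this slide can be checked on a small example. The sketch below assumes a hypothetical encoding of a derivation tree as nested Python tuples (not taken from the slides): `(symbol, left, right)` stands for a binary rule $a \to bc$ and `(symbol, terminal)` for an emission rule $a \to A$. For a tree with $\ell_T$ leaves it should recover $\ell_T - 1$ internal factors, $\ell_T$ boundary factors, $2\ell_T - 1$ nonterminal values, and $\ell_T$ terminal values.

```python
def count_factors(tree):
    """Return (internal, boundary) factor counts for a CNF derivation tree.

    Assumed encoding (illustrative only):
      (symbol, left, right) -> binary rule a -> b c (one internal factor)
      (symbol, terminal)    -> emission rule a -> A (one boundary factor)
    """
    if len(tree) == 2:
        return 0, 1
    _, left, right = tree
    left_internal, left_boundary = count_factors(left)
    right_internal, right_boundary = count_factors(right)
    return left_internal + right_internal + 1, left_boundary + right_boundary


# Toy tree with three leaves; the symbol names are made up.
tree = ("S", ("a", ("b", "the"), ("c", "dog")), ("d", "barks"))

internal, boundary = count_factors(tree)
num_leaves = boundary                      # ell_T
nonterminal_values = internal + boundary   # one hidden symbol per factor
terminal_values = boundary                 # one observed symbol per leaf

print(internal, boundary, nonterminal_values, terminal_values)  # 2 3 5 3
```

Every factor is headed by exactly one hidden symbol, which is why the number of nonterminal values is the sum of the two factor counts.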

Weighted CFG with Chomsky normal form

$M_{abc}$: Nonterminal factor weights

$O_{aA}$: Terminal factor weights

$M_{abc} = P(a \to bc \mid a \to \text{nonterminal})$

$O_{aA} = P(a \to A \mid a \to \text{terminal})$

$\sum_{b,c} M_{abc} = \sum_{A} O_{aA} = 1$
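
To make the normalization concrete, here is a minimal sketch that draws random positive weights and rescales them so that, for every hidden symbol $a$, the nonterminal weights sum to one over $(b, c)$ and the terminal weights sum to one over $A$. The function name `random_weights`, the integer symbol labels, and the uniform random initialization are assumptions made for illustration; the slides do not say how the weights are chosen.

```python
import random


def random_weights(n_hidden, n_observed, seed=0):
    """Build normalized weight tables M[a, b, c] and O[a, A] as dicts.

    Hypothetical helper: symbols are labeled 0..n-1 and raw weights are
    drawn uniformly at random, then normalized per hidden symbol a.
    """
    rng = random.Random(seed)
    M, O = {}, {}
    for a in range(n_hidden):
        # Raw weights for every binary rule a -> b c.
        raw_m = {(b, c): rng.random()
                 for b in range(n_hidden) for c in range(n_hidden)}
        z_m = sum(raw_m.values())
        for (b, c), w in raw_m.items():
            M[a, b, c] = w / z_m          # sum over (b, c) equals 1
        # Raw weights for every emission rule a -> A.
        raw_o = [rng.random() for _ in range(n_observed)]
        z_o = sum(raw_o)
        for A, w in enumerate(raw_o):
            O[a, A] = w / z_o             # sum over A equals 1
    return M, O


M, O = random_weights(n_hidden=3, n_observed=4)
row_m = sum(M[0, b, c] for b in range(3) for c in range(3))
row_o = sum(O[0, A] for A in range(4))
print(round(row_m, 9), round(row_o, 9))
```

With this normalization, $M_{abc}$ and $O_{aA}$ can be read as conditional probabilities of each rule given the type of rule applied to $a$.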

$T$: Topology Tree

$\sigma = \{\sigma_i\}$: Nonterminal values

$o = \{o_i\}$: Boundary values

$\Omega_T$: Internal factors, with nonterminal factor weights $M_{abc}$

$\partial\Omega_T$: Boundary factors, with terminal factor weights $O_{aA}$

$G = \langle M, O \rangle$

$$W(\sigma, o \mid T, G) = \prod_{\alpha \in \Omega_T} M_{\sigma_{\alpha_1}\sigma_{\alpha_2}\sigma_{\alpha_3}} \prod_{\alpha \in \partial\Omega_T} O_{\sigma_{\alpha_1} o_{\alpha_2}}$$
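
The weight of a labeled tree is a product with one $M$ entry per internal factor and one $O$ entry per boundary factor. The sketch below evaluates that product recursively. It assumes a hypothetical tuple encoding of the tree, `(a, left, right)` for $a \to bc$ and `(a, A)` for $a \to A$, and dict-based weight tables. The toy weights are made up, not normalized, and list only the entries the example needs.

```python
def tree_weight(tree, M, O):
    """Multiply M over internal factors and O over boundary factors.

    Assumed encoding (illustrative only):
      (a, left, right) -> internal factor, contributes M[a, b, c]
      (a, A)           -> boundary factor, contributes O[a, A]
    """
    if len(tree) == 2:
        a, A = tree
        return O[a, A]
    a, left, right = tree
    b, c = left[0], right[0]   # hidden symbols heading the two children
    return M[a, b, c] * tree_weight(left, M, O) * tree_weight(right, M, O)


# Toy weights over hidden symbols {0, 1} and observed symbols {"x", "y"}.
M = {(0, 1, 1): 0.25, (1, 1, 1): 0.5}
O = {(1, "x"): 0.5, (1, "y"): 0.5}

# Root 0 -> 1 1; the right child expands again as 1 -> 1 1.
tree = (0, (1, "x"), (1, (1, "y"), (1, "x")))

print(tree_weight(tree, M, O))  # 0.25 * 0.5 * 0.5 * 0.5 * 0.5 = 0.015625
```

The tree has three leaves, so the product contains two $M$ factors and three $O$ factors.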

Probabilistic CFG with Chomsky normal form

$T$: Topology Tree

$\alpha = (\alpha_1, \alpha_2, \alpha_3)$: $\sigma_{\alpha_1} \to \sigma_{\alpha_2}\sigma_{\alpha_3}$

$\alpha = (\alpha_1, \alpha_2)$: $\sigma_{\alpha_1} \to o_{\alpha_2}$

$$W(\sigma, o \mid T, G) = \prod_{\alpha \in \Omega_T} M_{\sigma_{\alpha_1}\sigma_{\alpha_2}\sigma_{\alpha_3}} \prod_{\alpha \in \partial\Omega_T} O_{\sigma_{\alpha_1} o_{\alpha_2}}$$

Internal factors

Boundary factors

Mabc = P(a ! bc|a ! nonterminal)<latexit sha1_base64="ewJnIxk+zxfZRFemIAE01g3vF+Y=">AAACiXicfZDbahsxEIbl7Snd9OC0l70RNYG0uGY3LSQEAqHJRW9CXaiTQNaYkTxei+iwSNokZrMv1Kfpbfo01W4MrRvogODT6J8Zzc8KKZxPkttO9ODho8dP1p7G68+ev3jZ3Xh14kxpOY64kcaeMXAohcaRF17iWWERFJN4yi4Om/fTS7ROGP3dLwocK8i1mAkOPqQm3aPjSQWM13SfZgr8nDE63ILMinzuwVpzRRm/WblnHq99pY32aJXQIOt3k24vGSRt0PuQLqFHljGcbHQ+ZFPDS4XacwnOnadJ4ccVWC+4xDrOSocF8AvI8TygBoVuXLXr1nQzZKZ0Zmw42tM2+3dFBco1u/RpgEbiWnILxfqUqfZiCh0aNarVWX62O66ELkqPmt+NmpWSekMb9+hUWOReLmicHWH4ucXj0OJrgRa8se+rDGyu4LoOm+Q069OG/ycV+o80cLwZJgC3IthA+Rws8OCyi4PB6b923oeT7UH6cbD97VPv4PPS6jXyhrwlWyQlO+SAfCFDMiKc/CA/yS35Fa1HabQb7d1Jo86y5jVZiejwN6Qcxe8=</latexit>

OaA = P(a ! A|a ! terminal)<latexit sha1_base64="22JdU58GFv0l+FfIr7Lci3Fb2Q8=">AAAChHicfZBNbxMxEIadBUpZvlI4crGIKhUUot0WBBegBQ5cUINE2krdKJp1Jhur/ljZE2i07M/h13CFA/8G7zYShEqMZOnx+J0Zz5uXSnpKkl+d6MrVaxvXN2/EN2/dvnO3u3XvyNuFEzgSVll3koNHJQ2OSJLCk9Ih6FzhcX72tnk//ozOS2s+0bLEsYbCyJkUQCE16b4+nFRwUPOXPNNA8zznwx3InCzmBM7ZL/zg69o1IzynitBpaUDVjybdXjJI2uCXIV1Bj61iONnqPMmmViw0GhIKvD9Nk5LGFTiSQmEdZwuPJYgzKPA0oAGNfly1m9Z8O2SmfGZdOIZ4m/27ogLtmz36PEAj8S35pc77PNftxZYmNGpU67No9mJcSVMuCI24GDVbKE6WN8bxqXQoSC15nL3D8HOHH0KLwxIdkHWPqwxcoeG8DpsUPOvzhv8nleaPNHC8HSaAcDLYwMUcHIjgso+Dwem/dl6Go91BujfY/fi0t/9mZfUme8Aesh2Wsudsn71nQzZign1j39kP9jPaiPrRXvTsQhp1VjX32VpEr34DBujDbA==</latexit>
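As a minimal sketch of how the weight $W$ factorizes over a derivation, assume a toy grammar with two hidden symbols and two observable symbols, and encode a derivation tree as nested tuples. The numbers in `M` and `O`, the function `tree_weight`, and the tuple encoding are illustrative assumptions, not taken from the slides.

```python
# Toy PCFG in Chomsky normal form (all numbers are made up for illustration).
# M[a][b][c] = weight of the rule a -> b c; O[a][B] = weight of the rule a -> B.
M = [
    [[0.1, 0.4], [0.3, 0.2]],      # rules 0 -> b c
    [[0.25, 0.25], [0.25, 0.25]],  # rules 1 -> b c
]
O = [
    [0.9, 0.1],  # rules 0 -> B
    [0.5, 0.5],  # rules 1 -> B
]


def tree_weight(node):
    """Return W for a derivation tree.

    An internal factor is ("N", a, left, right): hidden symbol a rewrites
    to the hidden symbols at the roots of the two subtrees.
    A boundary factor is ("T", a, B): hidden symbol a emits observable B.
    """
    if node[0] == "T":
        _, a, B = node
        return O[a][B]
    _, a, left, right = node
    b, c = left[1], right[1]  # hidden symbols of the two children
    return M[a][b][c] * tree_weight(left) * tree_weight(right)


# Derivation: 0 -> 0 1, then 0 emits observable 0 and 1 emits observable 1.
tree = ("N", 0, ("T", 0, 0), ("T", 1, 1))
w = tree_weight(tree)  # equals M[0][0][1] * O[0][0] * O[1][1]
```

Each internal node contributes one entry of `M` and each leaf one entry of `O`, which is the product over $\Omega_T$ and $\partial\Omega_T$ in the formula.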

Random Language Model


We can write $W = e^{-E}$, where

$$E = -\sum_{a,b,c} \pi_{abc}(\sigma) \log M_{abc} - \sum_{a,B} \rho_{aB}(\sigma, o) \log O_{aB}$$

Here $\pi_{abc}(\sigma)$ is the count of the rule $a \to bc$ in $\sigma$, and $\rho_{aB}(\sigma, o)$ is the count of the rule $a \to B$ in $\sigma, o$.
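The rewriting of $W$ as $e^{-E}$ can be checked numerically. The sketch below assumes a derivation given simply as the list of rules it uses; the rule weights and the names `internal_rules` and `boundary_rules` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical rule weights (made-up numbers).
M = {(0, 0, 1): 0.4, (0, 1, 1): 0.2}  # M[(a, b, c)] for the rule a -> b c
O = {(0, 0): 0.9, (1, 1): 0.5}        # O[(a, B)] for the rule a -> B

# A derivation listed by the rules it uses:
# 0 -> 1 1 once, then each hidden 1 emits observable 1.
internal_rules = [(0, 1, 1)]
boundary_rules = [(1, 1), (1, 1)]

pi = Counter(internal_rules)   # pi[(a, b, c)]: count of a -> b c
rho = Counter(boundary_rules)  # rho[(a, B)]: count of a -> B

# Energy: minus the count-weighted sum of log rule weights.
E = (-sum(n * math.log(M[r]) for r, n in pi.items())
     - sum(n * math.log(O[r]) for r, n in rho.items()))

# Direct product over factors, for comparison with exp(-E).
W = math.prod(M[r] for r in internal_rules) * math.prod(O[r] for r in boundary_rules)
```

Grouping identical factors by their counts is all that happens between the product form and the energy form.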


$\log M_{abc}$ and $\log O_{aB}$ are analogous to coupling constants in physics.


Assuming Gaussian distributions: language evolution is a dynamical process, which must be slow in order for language to remain comprehensible at any given moment. If each of $\log M_{abc}$ and $\log O_{aB}$ is the accumulation of independent, additive increments, then each is approximately Gaussian.


$$M, O \sim P(M, O) \equiv Z_G^{-1}\, J\, e^{-\epsilon_d s_d}\, e^{-\epsilon_s s_s} \qquad \text{(lognormal distribution)}$$

Deep sparsity:

$$s_d = \frac{1}{N^3} \sum_{a,b,c} \log^2\!\left(\frac{M_{abc}}{\bar{M}}\right), \qquad \bar{M} = 1/N^2$$

Surface sparsity:

$$s_s = \frac{1}{NT} \sum_{a,B} \log^2\!\left(\frac{O_{aB}}{\bar{O}}\right), \qquad \bar{O} = 1/T$$
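A short sketch of the two sparsity measures, assuming the grammar is stored as nested Python lists `M[a][b][c]` and `O[a][B]`. The function names and the example grammars are illustrative choices, not from the slides.

```python
import math


def deep_sparsity(M):
    """s_d = (1/N^3) * sum_{a,b,c} log^2(M_abc / Mbar), with Mbar = 1/N^2."""
    N = len(M)
    m_bar = 1.0 / N**2
    total = sum(math.log(M[a][b][c] / m_bar) ** 2
                for a in range(N) for b in range(N) for c in range(N))
    return total / N**3


def surface_sparsity(O):
    """s_s = (1/(N*T)) * sum_{a,B} log^2(O_aB / Obar), with Obar = 1/T."""
    N, T = len(O), len(O[0])
    o_bar = 1.0 / T
    total = sum(math.log(O[a][B] / o_bar) ** 2
                for a in range(N) for B in range(T))
    return total / (N * T)


N, T = 2, 4
uniform_M = [[[1.0 / N**2] * N for _ in range(N)] for _ in range(N)]
uniform_O = [[1.0 / T] * T for _ in range(N)]
peaked_O = [[0.97, 0.01, 0.01, 0.01] for _ in range(N)]  # one dominant rule per symbol

s_d_uniform = deep_sparsity(uniform_M)
s_s_uniform = surface_sparsity(uniform_O)
s_s_peaked = surface_sparsity(peaked_O)
```

Both measures vanish for a uniform grammar and grow as the rule weights move away from $\bar{M}$ and $\bar{O}$ on a log scale.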


This is actually the maximum-entropy distribution when the grammar averages $\bar{s}_d$ and $\bar{s}_s$ are constrained!

$$\bar{s}_d = \frac{N^3}{2\epsilon_d}, \qquad \bar{s}_s = \frac{NT}{2\epsilon_s}.$$
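One way to see where the deep average comes from, as a sketch under two assumptions that the slides do not state: that $J$ is the Jacobian factor $\prod_{a,b,c} M_{abc}^{-1} \prod_{a,B} O_{aB}^{-1}$, and that the normalization constraints on $M$ and $O$ can be ignored. Write $x_{abc} = \log(M_{abc}/\bar{M})$, so that $dM_{abc}/M_{abc} = dx_{abc}$:

```latex
\begin{align*}
J\, e^{-\epsilon_d s_d} \prod_{a,b,c} dM_{abc}
  &\propto \prod_{a,b,c} \exp\!\left(-\frac{\epsilon_d}{N^3}\, x_{abc}^2\right) dx_{abc}
  && \text{(each $x_{abc}$ is an independent Gaussian)} \\
\langle x_{abc}^2 \rangle
  &= \frac{N^3}{2\epsilon_d}
  && \text{(variance of a Gaussian $\propto e^{-\epsilon_d x^2 / N^3}$)} \\
\bar{s}_d
  &= \frac{1}{N^3} \sum_{a,b,c} \langle x_{abc}^2 \rangle
   = \frac{1}{N^3} \cdot N^3 \cdot \frac{N^3}{2\epsilon_d}
   = \frac{N^3}{2\epsilon_d}.
\end{align*}
```

The surface average follows the same steps with $N^3$ replaced by $NT$ and $\epsilon_d$ by $\epsilon_s$.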


When $\epsilon_d \to \infty$, $\bar{s}_d \to 0$, which is the value corresponding to a completely uniform deep grammar. This grammar carries no information.


As , ,

and the grammar carries more information.

✏d &<latexit sha1_base64="EUFHxAiRexPqVBcnKF74mbV6EHg=">AAACWHicfZHbShxBEIZrJwd1cnDVy9w0WYQQNsuMCvFSklx4E1TIquAsS01v7drYJ7p7YpZhnyW3ySMlT2PPuGCMYEHD19V/VXX/XVopfMiyP53kydNnz1dW19IXL1+9Xu9ubJ56UzlOQ26kceclepJC0zCIIOncOkJVSjorrz4352ffyXlh9LcwtzRSONNiKjiGmBp3twqyXsiIE1Z4QufM9bjbywZZG+wh5EvowTKOxxudD8XE8EqRDlyi9xd5ZsOoRhcEl7RIi8qTRX6FM7qIqFGRH9Xt7RdsO2YmbGpcXDqwNvtvRY3KKwyXfRahkfiW/FyVfVaqdmOsjo0a1f1ZYbo/qoW2VSDNb0dNK8mCYY0ZbCIc8SDnLC2+ULy5o6+xxZElh8G493WBbqbwxyK+ZMaKPmv4ManQd9LI6XacgNyJaAPjl+iQh/gXaTQ4/9/Oh3C6M8h3Bzsne72DT0urV+ENvIV3kMNHOIBDOIYhcJjDT/gFvzt/E0hWkrVbadJZ1mzBvUg2bwDMz7J9</latexit>

sd %<latexit sha1_base64="My5ECuSELfVH8EC3BecsWSlcBrc=">AAACV3icfZHfShtBFMZnV6vpWjVJL3szGAQpadi1Bb0U9aI3pQpGBTeEs5OTODh/lplZNSx5FW/1lfI07ewasCr0wMBvznznnJlvslxw6+J4HoRLyx9WVhsfo7VP6xubzVb73OrCMOwzLbS5zMCi4Ar7jjuBl7lBkJnAi+zmqDq/uEVjuVZnbprjQMJE8TFn4Hxq2GynGZjSDkczmioEY/TdsNmJe3Ed9D0kC+iQRZwMW8G3dKRZIVE5JsDaqyTO3aAE4zgTOIvSwmIO7AYmeOVRgUQ7KOvLz+i2z4zoWBu/lKN19t+KEqSV4K671EMlsTXZqcy6NJP1RufKN6pUr2e58f6g5CovHCr2PGpcCOo0rbygI26QOTGlUXqM/uYGf/kWv3M04LT5WqZgJhLuZ/4lE5p2acX/k3L1IvUcbfsJwAz3NlB2DQaY818ReYOTt3a+h/PdXvK9t3v6o3NwuLC6Qb6QLbJDErJHDshPckL6hJF78kAeyVMwD/6EK2HjWRoGi5rP5FWErb8SyrMT</latexit>

✏d<latexit sha1_base64="tV7316XxWaiEjLEJPjreLdqtbSo=">AAACTXicfZFNSxxBEIZ7VqNmkvgRj14aF0Fks8yooEdJPOQiKrgqcZalprd2bewvunvEZdh/4TX+Jc/5IbmFYM+44BdY0PB09VtV3W/nRnDnk+RP1Jia/jAzO/cx/vT5y/zC4tLXU6cLy7DDtND2PAeHgivseO4FnhuLIHOBZ/nVj+r87Bqt41qd+JHBroSh4gPOwIfUrwyN40KrXr+32EzaSR30LaQTaJJJHPWWom9ZX7NCovJMgHMXaWJ8twTrORM4jrPCoQF2BUO8CKhAouuW9ZXHdC1k+nSgbVjK0zr7vKIE6ST4yxYNUElcTW4k8xbNZb3RRoVGlerlLD/Y7ZZcmcKjYo+jBoWgXtPKAdrnFpkXIxpn+xhubvEgtDg0aMFru1FmYIcSbsbhJUOatWjF70m5epIGjtfCBGCWBxsouwQLzIcPiIPB6Ws738LpZjvdam8ebzf3vk+sniMrZJWsk5TskD3ykxyRDmFEkVvym9xF99Hf6F/0/1HaiCY1y+RFNGYfAKIPsXA=</latexit>

plays the role of temperature: random — hot, deterministic — cold

Random Language Model

M,O ⇠ P(M,O) ⌘ Z�1G Je

�✏dsde�✏sss

<latexit sha1_base64="1haF9RJdATn5hlwSjCuxHZsuZ/4=">AAACl3icfZFda9RAFIZnUz9q/OjWXok3B5dCle2S1EK9a1HRIpRuxW2LnTVMJme3Q2cm6cykuIT8L/+KN97qz3CSLuha8EDgmTfvmZPzJi2ksC6KvneCpVu379xdvhfef/Dw0Up39fGxzUvDccRzmZvTlFmUQuPICSfxtDDIVCrxJL1407w/uUJjRa4/uVmBY8WmWkwEZ85LSffjQR8OgVqhgCrmztMUhhuN9hwoXpbiCj4n1fv6S7UZ1/AB0APFwgrpmzObZPWCYm1i66TbiwZRW3AT4jn0yLyGyWpnk2Y5LxVqxyWz9iyOCjeumHGCS6xDWlosGL9gUzzzqJlCO67a5WtY90oGk9z4Rzto1b87KqZss1kfPDQW25KdqbQPqWoPeaH9RY1rcZabvBpXQhelQ82vR01KCS6HJkvIhEHu5AxC+hb9lxs88FccFmiYy82LijIzVexr7TeZAu1Dw/+zCv3H6jlc9xMYN8LHAPycGcad/5WhDzj+N86bcLw1iF8Oto62e3uv51Evk6fkGdkgMdkhe2SfDMmIcPKN/CA/ya/gSbAbvAv2r61BZ96zRhYqOPoNn+DJJA==</latexit>

This is actually the maximum-entropy distribution when the grammar averages $\bar{s}_d$ and $\bar{s}_s$ are constrained!

$$\bar{s}_d = \frac{N^3}{2\epsilon_d}, \qquad \bar{s}_s = \frac{NT}{2\epsilon_s}.$$

$\epsilon_d$ plays the role of deep temperature: random is hot, deterministic is cold.

Similarly, $\epsilon_s$ controls information transmission at the surface; we call it the surface temperature.
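A minimal numerical sketch of the deep-weight ensemble, under an assumption that is not restated on this slide: $s_d$ is taken to be the mean squared log-deviation of the $N^3$ deep weights $M_{abc}$ from a reference scale, so that each log-weight is Gaussian with variance $N^3/(2\epsilon_d)$. The function names and parameter values are illustrative only.

```python
import math
import random

def sample_log_weights(n_hidden, eps_d, rng):
    """Draw log(M_abc / M_bar) for all N^3 deep rules.

    Assumption: P(M) ~ J * exp(-eps_d * s_d) with
    s_d = (1/N^3) * sum_abc log^2(M_abc / M_bar), so each log-weight
    is Gaussian with variance N^3 / (2 * eps_d).
    """
    n_rules = n_hidden ** 3
    sigma = math.sqrt(n_rules / (2.0 * eps_d))
    return [rng.gauss(0.0, sigma) for _ in range(n_rules)]

def mean_sq_log_dev(log_weights):
    """s_d: mean squared log-deviation of the sampled weights."""
    return sum(x * x for x in log_weights) / len(log_weights)

rng = random.Random(0)
N = 10
for eps_tilde in (0.01, 1.0, 100.0):       # eps_d / N^3
    eps_d = eps_tilde * N ** 3
    s_d = mean_sq_log_dev(sample_log_weights(N, eps_d, rng))
    target = N ** 3 / (2.0 * eps_d)        # constrained average N^3 / (2 eps_d)
    print(f"eps_d/N^3={eps_tilde:7.2f}  s_d={s_d:9.4f}  target={target:9.4f}")
```

Large $\epsilon_d$ pushes all log-weights toward zero (near-uniform rules, the hot regime), while small $\epsilon_d$ lets a few rules dominate (the cold regime).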

Role of $\epsilon_d$ on language structure

Fix $T = 27$ and $\epsilon_s/(NT) = 0.01$; vary $N$ and $\epsilon_d$.

The information content of a grammar is naturally encoded by Shannon entropies:

$$H_s(G; k) = \frac{1}{k} \left\langle \log 1/P(o_1, o_2, \ldots, o_k \mid G) \right\rangle$$

and

$$H_d(G; k) = \frac{1}{k} \left\langle \log 1/P(\sigma_1, \sigma_2, \ldots, \sigma_k \mid G) \right\rangle$$

FIG. 2. Shannon entropy of random CFGs as functions of $\tilde{\epsilon}_d = \epsilon_d/N^3$. (a) Block entropy of hidden configurations for indicated k and N. (b) Block entropy of observed strings; symbols as in (a). The constant value for surface entropy depends on the surface temperature. Bars indicate 20th and 80th percentiles.
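The block entropies plotted in the figure can be approximated from sampled strings. The sketch below is a plug-in estimator, not the procedure used to produce the figure: it replaces the exact average over $P(\cdot \mid G)$ with empirical frequencies of length-$k$ windows, which is biased downward when $k$ is large relative to the sample size. The function name is an assumption.

```python
import math
from collections import Counter

def block_entropy(sequences, k):
    """Plug-in estimate of (1/k) * <log 1/P(x_1, ..., x_k)> in nats.

    `sequences` is an iterable of symbol sequences; every length-k
    window is treated as one sample of a k-block.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[tuple(seq[i:i + k])] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no blocks of length k in the input")
    h = -sum(c / total * math.log(c / total) for c in counts.values())
    return h / k

# Toy check: a constant string carries no entropy, while two equally
# frequent symbols give log(2) nats per symbol at k = 1.
print(block_entropy(["aaaaaaaa"], 2))
print(block_entropy(["abababab"], 1))
```

Passing hidden-symbol sequences gives an estimate of $H_d$; passing observed strings gives an estimate of $H_s$.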

Completely Random Regime: $\epsilon_d \gtrsim N^3/\log^2 N$

Deep Structure Regime: $\epsilon_d \lesssim N^3/\log^2 N$

Phase transition at $\epsilon_d = \epsilon_* \approx N^3/\log^2 N$

Zipf’s law and Order Parameter

FIG. 3. (a) Zipf plot of hidden symbols for N = 40. Here $\tilde{\epsilon}_d = \epsilon_d/N^3$. (b) Order parameter $Q_2$, with bars indicating 20th and 80th percentile ranges over grammars at each parameter value. Inset: same plot in log-log axes.
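A Zipf plot shows symbol frequency against frequency rank, usually on log-log axes. As a rough illustration of how the data for panel (a) could be tabulated from a sample of hidden symbols, the sketch below builds the rank-frequency table. It does not compute $Q_2$, whose definition is not given on this slide, and the helper name is hypothetical.

```python
from collections import Counter

def zipf_table(symbols):
    """Return (rank, relative frequency) pairs, most frequent first."""
    counts = Counter(symbols)
    total = sum(counts.values())
    ranked = sorted(counts.values(), reverse=True)
    return [(rank, c / total) for rank, c in enumerate(ranked, start=1)]

# Toy sample with counts 4, 2, 1, 1 out of 8 symbols.
sample = list("aaaabbcd")
for rank, freq in zipf_table(sample):
    print(rank, freq)
```

A flat table corresponds to the random regime, where all hidden symbols are used about equally; a steeply decaying one indicates that a few symbols dominate.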

Theoretical Insight

The entropy of an RLM string (containing $n$ sentences) derives from 3 aspects:

1. each sentence can be represented by a derivation tree with many different topologies: $\sim \ell_k \log 4$

2. each derivation tree can host a variety of internal hidden variables: $\sim (2\ell - n) \log N$

3. given the hidden variables, the observed symbols can themselves vary: $\sim \ell \log T$

$$S \sim \ell \log\left(4 N^2 T\right)$$
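The three counts above can be added up numerically to see how they combine into $\ell \log(4N^2T)$. This is a sketch with illustrative values for $\ell$, $n$, $N$ and $T$; it assumes the topology term is summed over all sentences, so that it contributes $\ell \log 4$ for a text of total length $\ell$.

```python
import math

def entropy_terms(total_len, n_sentences, n_hidden, n_surface):
    """Leading-order entropy contributions for a text of `total_len`
    observed symbols split into `n_sentences` sentences."""
    topology = total_len * math.log(4)                            # tree shapes
    hidden = (2 * total_len - n_sentences) * math.log(n_hidden)   # internal symbols
    surface = total_len * math.log(n_surface)                     # observed symbols
    return topology, hidden, surface

ell, n, N, T = 1000, 50, 40, 27      # illustrative values only
total = sum(entropy_terms(ell, n, N, T))
leading = ell * math.log(4 * N ** 2 * T)
print(total, leading, leading - total)   # the gap equals n * log(N)
```

The difference between the exact sum and the compact form is $n \log N$, which is subleading when sentences are long ($n \ll \ell$).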

Theoretical Insight

While the energy:

$$E \sim -\ell \sqrt{\frac{N^3}{2\epsilon_d}} - \ell \sqrt{\frac{NT}{2\epsilon_s}} + \text{const}$$

Combining this with entropy:

$$S \sim \ell \log\left(4 N^2 T\right)$$

The effective free energy reflects a competition between E and S.

$$F = E - \log W_{\text{tree}} - S$$


At $\epsilon_* = N^3/\log^2 N$, the energetic fluctuations balance entropy.

$\epsilon_d \gg \epsilon_*$: the energy is unimportant, uninformative grammar.

$\epsilon_d \ll \epsilon_*$: the entropy is less important, emergence of structure.
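The scaling of $\epsilon_*$ can be checked with a back-of-the-envelope balance. The sketch below is an order-of-magnitude illustration, not the derivation behind the slide: it assumes that only the deep energy per symbol, $\sqrt{N^3/(2\epsilon_d)}$, is compared with the hidden-variable entropy per symbol, about $2\log N$, and it drops the surface term and the tree-topology term. The prefactor it produces is therefore an artifact of these assumptions; only the $N^3/\log^2 N$ scaling is taken from the slide.

```python
import math

def balance_point(n_hidden):
    """eps_d at which sqrt(N^3 / (2 eps_d)) equals 2 log N.

    Assumption: only the deep energy term and the hidden-variable
    entropy (about 2 log N per observed symbol) are compared.
    """
    return n_hidden ** 3 / (8.0 * math.log(n_hidden) ** 2)

for N in (10, 20, 40):
    eps_star = N ** 3 / math.log(N) ** 2        # scaling quoted on the slide
    ratio = balance_point(N) / eps_star
    print(N, balance_point(N), ratio)           # ratio does not depend on N
```

The ratio stays constant as $N$ changes, which is what it means for the balance point to scale as $N^3/\log^2 N$.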

RLM & Learning human languages

• Initially, the child does not know the rules of the grammar, so it begins with some small number of hidden symbols and assigns uniform values to the weights M and O.

• To learn is to increase the likelihood of the grammar by adjusting the weights and adding new hidden symbols.

• As weights are driven away from uniform values, the temperatures $\epsilon_d$ and $\epsilon_s$ decrease.

• Eventually, the transition to deep structure is encountered, and the grammar begins to carry information.
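The link between non-uniform weights and a lower temperature can be made concrete by inverting the relation $\bar{s}_d = N^3/(2\epsilon_d)$. This is a hedged sketch: it assumes $s_d$ is the mean squared log-deviation of the deep weights from their geometric mean, and the "learned" weights below are invented for illustration rather than produced by any learning algorithm.

```python
import math

def implied_deep_temperature(weights):
    """Invert s_d = N^3 / (2 eps_d) for a list of N^3 deep weights.

    Assumption: s_d is the mean squared log-deviation of the weights
    from their geometric mean; uniform weights give s_d = 0 (eps_d = inf).
    """
    logs = [math.log(w) for w in weights]
    mean_log = sum(logs) / len(logs)
    s_d = sum((x - mean_log) ** 2 for x in logs) / len(logs)
    return math.inf if s_d == 0 else len(weights) / (2.0 * s_d)

uniform = [1.0] * 8                                   # N = 2, so N^3 = 8 rules
mild = [2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.5]       # hypothetical early learner
skewed = [4.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.25]    # hypothetical later learner
for name, w in (("uniform", uniform), ("mild", mild), ("skewed", skewed)):
    print(name, implied_deep_temperature(w))
```

The implied $\epsilon_d$ falls from infinity toward finite values as the weights spread out, mirroring the cooling described in the bullets.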

RLM & Learning human languages

• The richness of the discovered structure has been used as criticism of the approach: if the child needs to set many parameters, then do these all need to be innate? This would be a heavy evolutionary burden, and a challenge to efficient learning.

RLM Response: if there are indeed many parameters to be set, these do not all need to be innate: the child only needs the basic structure of a WCFG, and the rest is emergent.

“…Well, this collection, this mass of schematism, innate organizing principles, which guides our social and intellectual and individual behavior, that’s what I mean to refer to by the concept of human nature.”

—Noam Chomsky

Michel Foucault

“… It is true that I mistrust the notion of human nature a little, and for the following reason: I believe that of the concepts or notions which a science can use, not all have the same degree of elaboration, and that in general they have neither the same function nor the same type of possible use in scientific discourse.

Let’s take the example of biology. You will find concepts with a classifying function, concepts with a differentiating function, and concepts with an analytical function… The notion of life played this role to some extent in biology during a certain period.

In the seventeenth and eighteenth centuries, the notion of life was hardly used in studying nature: one classified natural beings, whether living or non-living, in a vast hierarchical tableau which went from minerals to man; the break between the minerals and the plants or animals was relatively undecided … At the end of the eighteenth century, the description and analysis of these natural beings showed … Can one say that research into life has finally constituted itself in biological science? Has the concept of life been responsible for the organization of biological knowledge? I don’t think so. … I would say that the notion of life is not a scientific concept; it has been an epistemological indicator of which the classifying, delimiting and other functions had an effect on scientific discussions, and not on what they were talking about.”

Part II: Semantics

• concepts or notions do not all have the same degree of elaboration.

• semantics: only had an effect on discussion/description, and not on what they were talking about.

World Color Survey

Part II: Semantics

Video: the surprising pattern behind color names around the world

Part II: Semantics

• concepts or notions do not all have the same degree of elaboration.

• semantics: only had an effect on discussion/description, and not on what they were talking about.

• the similarity/coincidence of human languages mentioned by Chomsky is just a universal consequence of discussion/description itself (not human nature).

What is this universal principle?

Communication Model

Fig. 1. (A) Shannon’s communication model. (B) Color communication example, where U is a set of colors, shown for simplicity along a single dimension. A specific meaning m is drawn from p(m). The speaker communicates m by uttering the word “blue,” and the listener interprets blue as meaning $\hat{m}$.


Information Bottleneck (IB)

Complexity - Mutual Information

$$I_q(M; W) = \sum_{m, w} p(m)\, q(w \mid m) \log \frac{q(w \mid m)}{q(w)}$$

Complexity is non-negative.
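As an illustration of the complexity term, the sketch below evaluates $I_q(M;W)$ for a prior $p(m)$ and an encoder $q(w \mid m)$ given as nested lists. The function name and the toy encoders are assumptions made for the example; logarithms are base 2, so the result is in bits.

```python
import math

def complexity(p_m, q_w_given_m):
    """I_q(M;W) in bits for prior p_m[m] and encoder q_w_given_m[m][w]."""
    n_words = len(q_w_given_m[0])
    # Marginal word distribution q(w) = sum_m p(m) q(w|m).
    q_w = [sum(p * row[w] for p, row in zip(p_m, q_w_given_m))
           for w in range(n_words)]
    total = 0.0
    for p, row in zip(p_m, q_w_given_m):
        for w, q in enumerate(row):
            if p > 0 and q > 0:          # 0 * log 0 is taken as 0
                total += p * q * math.log2(q / q_w[w])
    return total

p_m = [0.5, 0.5]
print(complexity(p_m, [[1.0, 0.0], [0.0, 1.0]]))   # one word per meaning
print(complexity(p_m, [[0.5, 0.5], [0.5, 0.5]]))   # words ignore the meaning
```

A lexicon that names every meaning separately pays the full entropy of the meanings, while one whose words are independent of meaning has zero complexity.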

Information Bottleneck (IB)

Accuracy (distortion) - Expected Kullback–Leibler Divergence

$$\mathbb{E}_q\left[ D[M \,\|\, \hat{M}] \right] = I(M; U) - I_q(W; U)$$

The term $I(M; U)$ is independent of $q(w|m)$.
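The identity above can be checked numerically on a small example. The sketch assumes the usual setup of the IB account of naming, which is not spelled out on this slide: each meaning $m$ is a distribution $m(u)$ over $U$, and the listener's reconstruction is $\hat{m}_w(u) = \sum_m q(m \mid w)\, m(u)$. All numbers are made up for the test.

```python
import math

def kl(p, q):
    """D[p || q] in bits for two distributions over the same set."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)

def mutual_information(joint):
    """I(X;Y) in bits from a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(v * math.log2(v / (px[i] * py[j]))
               for i, row in enumerate(joint)
               for j, v in enumerate(row) if v > 0)

# Hypothetical toy problem: 2 meanings, 3 world states u, 2 words.
p_m = [0.6, 0.4]
m_u = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]     # m(u) for each meaning
q_wm = [[0.9, 0.1], [0.2, 0.8]]              # encoder q(w|m)

n_m, n_u, n_w = len(p_m), len(m_u[0]), len(q_wm[0])
q_w = [sum(p_m[m] * q_wm[m][w] for m in range(n_m)) for w in range(n_w)]
# Listener's reconstruction: m_hat_w(u) = sum_m q(m|w) m(u).
m_hat = [[sum(p_m[m] * q_wm[m][w] * m_u[m][u] for m in range(n_m)) / q_w[w]
          for u in range(n_u)] for w in range(n_w)]

expected_kl = sum(p_m[m] * q_wm[m][w] * kl(m_u[m], m_hat[w])
                  for m in range(n_m) for w in range(n_w))
i_mu = mutual_information([[p_m[m] * m_u[m][u] for u in range(n_u)]
                           for m in range(n_m)])
i_wu = mutual_information([[q_w[w] * m_hat[w][u] for u in range(n_u)]
                           for w in range(n_w)])
print(expected_kl, i_mu - i_wu)    # the two numbers agree up to rounding
```

Because $I(M;U)$ is fixed by the prior and the meanings, minimizing the expected distortion is the same as maximizing $I_q(W;U)$.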

Information Bottleneck (IB)

IB - an optimal trade-off between these two competing objectives

$$\mathcal{F}_\beta[q(w \mid m)] = \underbrace{I_q(M; W)}_{\text{Complexity}} - \beta\, \underbrace{I_q(W; U)}_{\text{Accuracy}}$$
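To see the trade-off at work, the sketch below evaluates $\mathcal{F}_\beta$ for two hand-written encoders at several values of $\beta$. It assumes the common IB convention that the objective is minimized and that each meaning is a distribution $m(u)$ over $U$; the encoders and numbers are hypothetical.

```python
import math

def mutual_information(joint):
    """I(X;Y) in bits from a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(v * math.log2(v / (px[i] * py[j]))
               for i, row in enumerate(joint)
               for j, v in enumerate(row) if v > 0)

def ib_objective(p_m, m_u, q_wm, beta):
    """F_beta[q] = I_q(M;W) - beta * I_q(W;U), in bits."""
    n_m, n_u, n_w = len(p_m), len(m_u[0]), len(q_wm[0])
    joint_mw = [[p_m[m] * q_wm[m][w] for w in range(n_w)] for m in range(n_m)]
    joint_wu = [[sum(joint_mw[m][w] * m_u[m][u] for m in range(n_m))
                 for u in range(n_u)] for w in range(n_w)]
    return mutual_information(joint_mw) - beta * mutual_information(joint_wu)

p_m = [0.5, 0.5]
m_u = [[0.9, 0.1], [0.1, 0.9]]
one_word = [[1.0, 0.0], [1.0, 0.0]]      # every meaning gets the same word
two_words = [[1.0, 0.0], [0.0, 1.0]]     # one word per meaning
for beta in (1.0, 1.5, 3.0):
    print(beta,
          ib_objective(p_m, m_u, one_word, beta),
          ib_objective(p_m, m_u, two_words, beta))
```

At small $\beta$ the single-word lexicon has the lower objective because complexity dominates; once $\beta$ is large enough, the two-word lexicon wins because its gain in accuracy outweighs its cost.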

Color-naming in different languages

Fig. 2. (Upper) The WCS stimulus palette. Columns correspond to equally spaced Munsell hues. Rows correspond to equally spaced lightness values. Each stimulus is at the maximum available saturation for that hue/lightness combination. (Lower) These colors are irregularly distributed in 3D CIELAB color space.

Color-naming Bifurcation

Color-naming in different languages

Fig. 3. Color-naming systems across languages (blue circles) achieve near-optimal compression. The theoretical limit is defined by the IB curve (black). A total of 93% of the languages achieve better trade-offs than any of their hypothetical variants (gray circles).

Color-naming in different languages

Fig. 4. Similarity between color-naming distributions of languages. Each color category is represented by the centroid color of the category.

Color-naming in different languages

Color-naming Bifurcation

Fig. 5. Bifurcations of the IB color categories. The y axis shows the relative accuracy of each category w (defined in Materials and Methods). Colors correspond to centroids and width is proportional to the weight of each category, i.e., q(w).

IB & Emergence of Semantics

• We have shown that color-naming systems across languages achieve near-optimally efficient compression, as predicted by the IB principle.

• Apart from the yellow discrepancy, the successive refinement of the IB categories at critical points roughly recapitulates Berlin and Kay’s evolutionary sequence.

• The IB categories also evolve between phase transitions and new categories tend to appear gradually.

• The generality of the principles we invoke suggests that a drive for information-theoretic efficiency may not be unique to color naming.

Semantics: information-theoretically efficient description?

“…Well, it seems to me that the notion of human nature is of the same type. It was not by studying human nature that linguists discovered the laws of consonant mutation, or Freud the principles of the analysis of dreams, or cultural anthropologists the structure of myths. In the history of knowledge, the notion of human nature seems to me mainly to have played the role of an epistemological indicator to designate certain types of discourse in relation to or in opposition to theology or biology or history. I would find it difficult to see in this a scientific concept.”

—Michel Foucault

My Naive and Error-Prone Interpretation:

The origin of language ~ nature vs. nurture.

The emergence of syntax ~ genes / connectomes / hereditary factors are important to support CFG + environmental variables also have an impact

The emergence of semantics ~ environmental variables + information-theoretic optimality