Structure and function of carbohydrates

Subtopics

Definition and classification of carbohydrates

Monosaccharides: classification, physical and chemical properties

Disaccharides: maltose, lactose, sucrose

Oligosaccharides and their nomenclatures

Polysaccharides: starch, glycogen and cellulose

Glycoproteins: simple glycoproteins, proteoglycans and mucins

Introduction to carbohydrates

Overview

Carbohydrates or saccharides (Greek: sakcharon, sugars) are polyhydroxy aldehydes or ketones, or

substances that yield such compounds upon hydrolysis. They are carbon-based biomolecules containing

an aldehyde or ketone functional group and two or more hydroxyl groups. Carbohydrates are the most

abundant class of biomolecules on Earth. The name carbohydrate, which literally means “carbon

hydrate,” stems from the empirical formula for many carbohydrates, which is roughly (C·H2O)n, where n

≥ 3. Some carbohydrates contain nitrogen, phosphorus, or sulfur. There are three major size classes of

carbohydrates. These are monosaccharides, oligosaccharides, and polysaccharides.
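The empirical formula (C·H2O)n can be expanded into a molecular formula and an approximate molar mass. The sketch below is illustrative only; the function name `carbon_hydrate` and the rounded atomic masses are assumptions, not part of the original notes.

```python
# Rounded standard atomic masses in g/mol (assumed values for illustration).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def carbon_hydrate(n):
    """Return (molecular formula, molar mass) for a carbohydrate (C.H2O)n."""
    if n < 3:
        raise ValueError("the text takes n >= 3")
    # Each C.H2O unit contributes one C, two H and one O.
    counts = {"C": n, "H": 2 * n, "O": n}
    formula = "".join(f"{element}{count}" for element, count in counts.items())
    mass = sum(ATOMIC_MASS[element] * count for element, count in counts.items())
    return formula, round(mass, 2)


print(carbon_hydrate(3))  # a triose such as glyceraldehyde
print(carbon_hydrate(6))  # a hexose such as glucose
```

Carbohydrates that contain nitrogen, phosphorus, or sulfur, and deoxy sugars, do not follow this simple formula, which is why the text calls it only a rough rule.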

Monosaccharides, the simplest building blocks of carbohydrates, consist of a single polyhydroxy

aldehyde or ketone unit. They typically contain from three to nine carbon atoms that are bound to

multiple hydroxyl groups. They vary in number of carbon atoms (size) and in the stereochemical

configuration at one or more carbon centers. They cannot be further hydrolyzed to simpler forms.

Monosaccharides of more than four carbons tend to have cyclic structures. The six-carbon sugar D-

glucose or dextrose is the most abundant monosaccharide in nature. It is produced from CO2 and H2O in

the process of photosynthesis. Many monosaccharides are also synthesized from simpler substances in a

process named gluconeogenesis. Nearly all biological molecules can be generated from the products of

photosynthesis and gluconeogenesis. Monosaccharides are important components of nucleic acids and

complex lipids.

Oligosaccharides are short chains of monosaccharide units, or residues, covalently linked together by

characteristic linkages called glycosidic bonds. A few types of monosaccharide units can be joined to

form a large variety of oligosaccharides and polysaccharides. The most abundant oligosaccharides are

the disaccharides, with two monosaccharide units. Sucrose (cane sugar), which consists of the six-carbon

sugars D-glucose and D-fructose, is a typical disaccharide. Oligosaccharides are often associated with

proteins (glycoproteins) and lipids (glycolipids) in which they have both structural and regulatory

functions. Glycoproteins and glycolipids are collectively called glycoconjugates.

Polysaccharides are carbohydrate polymers containing 20 or more monosaccharide units. Insoluble

polysaccharides such as cellulose in plants, glycosaminoglycans (GAGs) in animals, and chitin in fungi are

important structural materials. GAGs, large polymers made up of many repeats of dimers, are key

components of cartilage. These polymers serve as structural and protective elements in the cell walls of

bacteria and plants and in the connective tissues of animals. Others such as starch and glycogen are

storage polysaccharides. Starch is an important nutritional reservoir in plants, whereas glycogen is in

animals. Still other polysaccharide polymers lubricate skeletal joints. Structural polysaccharides such as

cellulose have linear chains whereas storage polysaccharides such as glycogen and starch have branched

chains. Both glycogen and cellulose consist of recurring units of D-glucose, but they differ in the type of

glycosidic linkage and consequently have strikingly different properties and biological roles.

Carbohydrates are not only the most important sources of energy for all living organisms; they are also

the most complex sources of information for molecular recognition. The oxidation of carbohydrates

through metabolic breakdown of monosaccharides is the central energy-yielding pathway of cellular life

forms. All cells are coated in a dense and complex coat of oligosaccharides. The diverse oligosaccharide

structures displayed on cell surfaces are well-suited as sites of interaction between cells and their

environments. Oligosaccharide-containing glycoproteins are important mediators of cell–cell

recognitions such as fertilization, differentiation and the aggregation of cells to form tissues and organs.

Glycoproteins are also implicated in interactions between a variety of pathogenic bacteria and viruses and

their host cells. They are the basis of human blood groups.

A key property of carbohydrates is the possibility of attaining tremendous structural complexity and

diversity. The branched structures of oligosaccharides greatly increase their complexity and diversity.

The sheer diversity and complexity of oligosaccharides and polysaccharides suggest that carbohydrates

are information-rich molecules. They can augment the already immense diversity of proteins. Cell-

surface oligosaccharides are important for intercellular communications but not for intracellular

housekeeping function. Besides, secreted proteins are often glycoproteins that are extensively

decorated with oligosaccharides essential to their structures and functions. Oligosaccharides covalently

attached to proteins or lipids act as signals that participate in recognition and adhesion between cells.

The intracellular location or metabolic fate of these hybrid molecules, glycoconjugates, is determined by

carbohydrates.

Glycobiology

Glycobiology is the field of study that characterizes the structure and function of carbohydrates and

their conjugates. It deals with the synthesis and degradation as well as how carbohydrates are attached

to and recognized by other molecules such as proteins. Glycome refers to all of the carbohydrates and

carbohydrate-associated molecules that cells produce. Its study, glycomics, complements genomics (for DNA) and

proteomics (for proteins). It is clear that the glycome is very dynamic. It varies with the species and with

cellular or environmental conditions.

The branched structure of oligosaccharides greatly increases their complexity and hence the difficulty in

determining their sequences. The complexity of an organism’s glycome greatly exceeds that of its

proteome due to the diversity of the glycome’s constituent monosaccharides and the number of ways they

can interact with one another and with proteins. Characterization of oligosaccharides is complicated by

microheterogeneity, which often has biological significance. Because the biosynthesis of carbohydrates

is not under direct genetic control, there is currently no method for amplifying them. Methods for

synthesizing specific oligosaccharides have been hampered by extensive branching of oligosaccharides,

their large number of functional groups that must be differentially protected during elongation

reactions, and the chiral nature of the glycosidic bond. Thus, the only way of obtaining sufficient quantities

of a particular polysaccharide was to isolate it from natural sources.

A carbohydrate microarray experiment uses a particular fluorescently labeled prey (protein, RNA, or cell type)

and several thousand different oligosaccharides (baits) that are covalently or physically immobilized at

specific sites on a solid surface (glass slide) to identify the carbohydrates that specifically bind to a

particular prey. The mixture is incubated, rinsed, and subsequently the oligosaccharides to which the

prey binds are identified by the fluorescence at their corresponding positions. Carbohydrate microarrays

have been employed in both basic and applied research including disease diagnosis and the

development of carbohydrate-based drugs and vaccines.

Monosaccharides

Monosaccharides are aldehyde or ketone derivatives of straight-chain polyhydroxy alcohols

containing at least three carbon atoms. They are colorless, sweet and crystalline solids that are freely

soluble in water but insoluble in nonpolar solvents. Their backbones are unbranched and all the carbon

atoms are linked by single bonds. The carbons of a monosaccharide are numbered beginning at the end

of the chain nearest to the carbonyl group. Many of the hydroxyl groups are attached to chiral carbon

centers generating several sugar stereoisomers found in nature. There are two families of

monosaccharides based on the chemical nature of their carbonyl group. Aldoses are monosaccharides in

which the carbonyl group is at an end of the carbon chain and hence have an aldehyde functional group.

The smallest aldose is glyceraldehyde (GAL) with D- and L-forms. Ketoses are monosaccharides which

have a ketone functional group. The smallest ketose is dihydroxyacetone (DHA).

Classification

Monosaccharides are classified according to the number of their carbon atoms. The most common

monosaccharides with three, four, five, six, and seven carbon atoms in their backbones are called,

respectively, trioses, tetroses, pentoses, hexoses, and heptoses. The hexoses are the most common

monosaccharides in nature. The terms indicating the two families can be combined with those terms

indicating the number of carbon atoms. For example, glucose is an aldohexose whereas ribulose is a

ketopentose.
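The way the family term and the carbon-count term combine can be sketched in a few lines. The function name `classify` and the `PREFIX` table are hypothetical helpers introduced only for this illustration.

```python
# Carbon-count stems for the group names listed in the text
# (trioses, tetroses, pentoses, hexoses, heptoses).
PREFIX = {3: "tri", 4: "tetr", 5: "pent", 6: "hex", 7: "hept"}


def classify(family, carbons):
    """Combine the family term ('aldo' or 'keto') with the carbon-count term."""
    if family not in ("aldo", "keto"):
        raise ValueError("family must be 'aldo' or 'keto'")
    if carbons not in PREFIX:
        raise ValueError("only three to seven carbons are covered here")
    return family + PREFIX[carbons] + "ose"


print(classify("aldo", 6))  # glucose belongs to this class
print(classify("keto", 5))  # ribulose belongs to this class
```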

Configuration and conformation

Configuration refers to the covalent binding pattern in a molecule and it is constant for a molecule.

Configuration is static; the primary structure is the configuration. Conformation refers to the three-

dimensional structure of a molecule and it depends on molecular environment (temperature, pH,

dielectric constant, ionic strength and interaction with other molecules). Conformation is dynamic; there

is always a range of different structures that the molecules sample at equilibrium. Secondary, tertiary

and quaternary structures are conformations. Two conformations of a molecule can be interconverted

without the breakage of covalent bonds whereas two configurations can be interconverted only by

breaking covalent bonds.

Stereoisomers

Isomers are two or more molecules that have the same molecular formula but different structures. They

have different physical and chemical properties. Constitutional isomers differ in the order of attachment

of atoms. For example, DHA and GAL are constitutional isomers. On the other hand, stereoisomers have

atoms that are connected in the same order but differ in spatial arrangement. Since stereoisomers are

optically active, they are optical isomers. All monosaccharides except dihydroxyacetone contain one or

more asymmetric (chiral) carbon centers. Therefore, they exist in a variety of stereochemical forms. The

molecular formula of aldohexoses indicates that all but two of its carbon atoms, C1 and C6, are chiral

centers. In general, n-carbon aldoses have n-2 chiral centers and 2^(n-2) stereoisomers. Glyceraldehyde has

three carbons, one chiral center and 2^1 = 2 stereoisomers; aldohexoses have six carbons, four chiral

centers and 2^4 = 16 stereoisomers, of which glucose is one. The position of the carbonyl

group gives keto sugars one less asymmetric center than their isomeric aldoses. Therefore, n-carbon

ketoses have n-3 chiral centers and 2^(n-3) stereoisomers. The most common ketoses have their ketone

group at C2.
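The counting rule in the paragraph above (n - 2 chiral centers for an n-carbon aldose, n - 3 for a ketose, and two possible configurations per center) can be checked with a short sketch. The function names are hypothetical, and the rule is applied only to the unbranched sugars discussed here.

```python
def chiral_centers(carbons, family):
    """Number of chiral centers in an unbranched aldose or ketose."""
    if family not in ("aldose", "ketose"):
        raise ValueError("family must be 'aldose' or 'ketose'")
    # Aldoses lose C1 and the last carbon; ketoses also lose the keto carbon.
    offset = 2 if family == "aldose" else 3
    return max(carbons - offset, 0)


def stereoisomers(carbons, family):
    """Each chiral center has two configurations, giving 2**centers forms."""
    return 2 ** chiral_centers(carbons, family)


for name, carbons, family in [
    ("glyceraldehyde", 3, "aldose"),
    ("aldohexoses", 6, "aldose"),
    ("dihydroxyacetone", 3, "ketose"),
    ("ketohexoses", 6, "ketose"),
]:
    print(name, chiral_centers(carbons, family), stereoisomers(carbons, family))
```

A result of one form, as for dihydroxyacetone, means the molecule has no chiral center and therefore no stereoisomers.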

By convention, the asymmetric center farthest removed or most distant from the carbonyl group of a

monosaccharide of any carbon-chain length is called reference carbon. The stereoisomers of a

monosaccharide can be divided into two groups that differ in the configuration at the reference chiral

center. Sugars that have the same absolute configuration at the reference carbon as that of D-

glyceraldehyde are designated D isomers, and those with the same configuration as L-glyceraldehyde

are L isomers. Enantiomers are isomers that are non-superimposable mirror images of each other. D- and

L-glucose are enantiomers; they differ in configuration at every chiral center, and the D or L designation is set by the reference carbon, C5. Fischer projection formulas can be used to represent

three-dimensional sugar structures on paper. When the hydroxyl group on the reference carbon is on

the right in the projection formula, the sugar is the D isomer; when on the left, it is the L isomer. As for

other biomolecules with chiral centers, the absolute configurations of monosaccharides are determined

by X-ray crystallography.

Diastereoisomers are stereoisomers that are not mirror images of each other. Epimers are two sugars that

differ in the configuration at only a single asymmetric carbon atom. Thus, D-glucose and D-mannose are

epimers with respect to C2, whereas D-glucose and D-galactose are epimers with respect to C4. Epimers

are diastereomers. Isomeric forms of monosaccharides that differ only in their configuration about the

hemiacetal or hemiketal carbon atom are called anomers. The hemiacetal (or carbonyl) carbon atom is

called the anomeric carbon. Anomers are isomers that differ at a new asymmetric carbon atom formed

upon ring closure.
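The relationships among enantiomers, epimers, and other diastereomers can be illustrated by comparing Fischer projections position by position. In the sketch below, each aldohexose is encoded by the side (R for right, L for left) of the hydroxyl group at C2 through C5; these encodings are taken from standard textbook Fischer projections and are an assumption of this example, as are the names `FISCHER`, `differing_carbons`, and `relationship`.

```python
# Side of the OH group at C2, C3, C4, C5 in the Fischer projection
# (R = right, L = left). Assumed from standard textbook projections.
FISCHER = {
    "D-glucose": "RLRR",
    "D-mannose": "LLRR",
    "D-galactose": "RLLR",
    "L-glucose": "LRLL",
}


def differing_carbons(a, b):
    """Carbon numbers at which two aldohexoses differ in configuration."""
    pairs = zip(FISCHER[a], FISCHER[b])
    return [index + 2 for index, (x, y) in enumerate(pairs) if x != y]


def relationship(a, b):
    """Classify a pair of aldohexoses with the same constitution."""
    diff = differing_carbons(a, b)
    if not diff:
        return "identical"
    if len(diff) == len(FISCHER[a]):
        return "enantiomers"  # every chiral center is inverted
    if len(diff) == 1:
        return f"epimers at C{diff[0]}"
    return "diastereomers"


print(relationship("D-glucose", "D-mannose"))
print(relationship("D-glucose", "D-galactose"))
print(relationship("D-glucose", "L-glucose"))
print(relationship("D-mannose", "D-galactose"))
```

Epimers are a special case of diastereomers, so the last branch covers only pairs that differ at more than one center but not at all of them.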

D-sugars are biologically much more abundant than L-sugars. D-Glucose commonly occurs in nature as a

monosaccharide. Several other aldoses such as D-GAL, D-ribose, D-mannose, and D-galactose are

important components of larger biological molecules. The four- and five-carbon ketoses are designated

by inserting “ul” into the name of a corresponding aldose. The most abundant ketohexose is D-fructose

(Latin: fructus, fruit). Fructose is commonly used as a sweetener and fruits are rich in fructose. Fructose

is converted into glucose derivatives inside the cell. Other biologically prominent ketoses include DHA,

D-ribulose, D-xylulose and D-sorbose. Sorbus is the genus of mountain ash which has berries rich in the

related sugar alcohol sorbitol. Some sugars such as L-arabinose occur naturally in their L forms.
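The naming rule for four- and five-carbon ketoses, inserting "ul" into the name of the corresponding aldose, amounts to a simple string operation. The helper `ketose_name` is hypothetical and is meant only for the tetroses and pentoses the rule covers; ketohexoses such as fructose have their own names.

```python
def ketose_name(aldose):
    """Insert 'ul' before the '-ose' suffix of a four- or five-carbon aldose."""
    if not aldose.endswith("ose"):
        raise ValueError("expected an aldose name ending in 'ose'")
    return aldose[:-3] + "ulose"


for aldose in ("erythrose", "ribose", "xylose"):
    print(aldose, "->", ketose_name(aldose))
```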

Table of aldose sugars

Carbon atoms   Group name   Examples (suffix -ose)
3              Trioses      Glyceraldehyde
4              Tetroses     Erythrose, Threose
5              Pentoses     Ribose (Rib), Arabinose (Ara), Xylose (Xyl), Lyxose (Lyx)
6              Hexoses      Allose, Altrose, Glucose (Glc), Mannose (Man), Gulose, Idose, Galactose (Gal), Talose
7              Heptoses

Simple sugars serve not only as fuel molecules but also as fundamental constituents of biological

macromolecules. Glucose is an essential energy source for virtually all forms of life. RNA has a backbone

consisting of alternating phosphoryl groups and D-ribose, a cyclic five-carbon sugar. D-deoxyribose is the

monosaccharide component of DNA.

Table of ketose sugars

Carbon atoms   Group name   Examples (-ul- inserted before the suffix)
3              Trioses      Dihydroxyacetone
4              Tetroses     Erythrulose
5              Pentoses     Ribulose, Xylulose
6              Hexoses      Psicose, Fructose, Sorbose, Tagatose
7              Heptoses

Chemical properties

Monosaccharides are chemically reactive molecules. The chemistry of monosaccharides is largely the

function of their carbonyl and hydroxyl groups. The three most common reaction partners are alcohols,

phosphates, and amines. Monosaccharides can also undergo oxidation-reduction reactions. The

aldehyde groups of aldoses can be oxidized to aldonic acids and their primary alcohol groups to uronic acids, whereas the

carbonyl groups can be reduced to alditols.

Reaction with alcohols

Reaction of monosaccharides with alcohol functional groups is the chemical basis for the formation of

cyclic sugar structures and polymerization of monosaccharides through glycosidic bonds. The carbonyl groups

of aldehydes and ketones react with alcohols in a 1:1 ratio to yield hemiacetals and hemiketals,

respectively. Likewise, the aldehyde or the ketone functional groups of monosaccharides can react

intramolecularly with hydroxyl groups along the chain to form cyclic intramolecular hemiacetals and

hemiketals. This is the chemical basis for the spontaneous conversion of the open-chain form of

monosaccharides into ring or cyclic form.

Cyclic structures

Many common monosaccharide units occur predominantly as cyclic structures in aqueous solutions and

inside the cells. For an aldohexose such as D-glucose, the C1 aldehyde in the open-chain form reacts

with the C5 hydroxyl group to form an intramolecular hemiacetal. The resulting cyclic, six-membered

ring is called D-glucopyranose because it resembles pyran, a six-membered ring compound.

Only aldoses having five or more carbon atoms can form pyranose rings. For a ketohexose such as

fructose, the C2 keto group in the open-chain form reacts with either the C6 or C5 hydroxyl group to

form a six- or five-membered ring. D-Fructose readily forms the five-membered D-

fructofuranose ring, named by analogy with furan.

The formation of cyclic or ring form of a monosaccharide renders the former carbonyl carbon

asymmetric generating yet another type of diastereoisomeric pairs. The creation of a new chiral center

adds further stereochemical complexity to this class of compounds. The resulting pairs of diastereomers,

designated as α- and β-forms, are known as anomers. In this designation, α means that the hydroxyl

substituent on the anomeric carbon is on the opposite side of the sugar ring from the CH2OH group (C6) at the chiral center that

designates the D or L configuration, whereas β means that the hydroxyl group is on the same side of the

ring as C6. The Haworth perspective formula is a very convenient representation of the configurations of

the substituents on each carbon atom and the stereochemistry of the cyclic form of a monosaccharide.

In these depictions, the carbon atoms in the ring are not written out. The plane of the ring is nearly

perpendicular to the plane of the paper and heavy lines on the ring show projections toward the reader.

The biological properties and functions of monosaccharide units are determined by their specific three-

dimensional conformations.

Hexoses and pentoses may each assume pyranose or furanose forms. The stabilities of the five- and six-

membered rings of these sugars are so great that their cyclic forms predominate in aqueous solution

and inside the cell. The linear forms are normally present in only minute amounts. The six-membered

aldopyranose ring is much more stable than the aldofuranose ring and predominates in aldohexose

solutions. Glucose almost exclusively assumes its D-glucopyranose form in aqueous solutions. In glucose,

C1 (the carbonyl carbon atom in the open-chain form), which becomes a new asymmetric center, is

called the anomeric carbon atom. The two anomeric forms are α-D-glucopyranose and β-D-glucopyranose.

An equilibrium mixture of D-glucose contains approximately one-third α-anomer, two-thirds β-anomer,

and less than 1% of the open-chain form. The two anomeric forms can be interconverted in aqueous solution

only by passing via the open-chain form of glucose by a process called mutarotation.
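The one-third to two-thirds split can be rationalized from optical rotation data, since mutarotation is observed as a change in specific rotation until equilibrium is reached. The sketch below assumes commonly quoted specific rotations for D-glucose (about +112.2° for the pure α-anomer, +18.7° for the pure β-anomer, and +52.7° at equilibrium); these values are not given in the notes, and the tiny open-chain fraction is neglected.

```python
def alpha_fraction(rot_alpha, rot_beta, rot_equilibrium):
    """Fraction of alpha-anomer, treating the observed rotation as a
    mole-fraction-weighted average of the two pure anomers."""
    return (rot_equilibrium - rot_beta) / (rot_alpha - rot_beta)


# Assumed literature values in degrees; not taken from the original notes.
fraction_alpha = alpha_fraction(112.2, 18.7, 52.7)
fraction_beta = 1.0 - fraction_alpha
print(f"alpha: {fraction_alpha:.2f}, beta: {fraction_beta:.2f}")
```

Under these assumptions the α-anomer comes out at a little over one-third of the mixture, consistent with the composition stated above.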

Fructose forms both six-membered pyranose and five-membered furanose rings. In fructose, C2 (the

carbonyl carbon atom in the open-chain form), which becomes a new asymmetric center, is the

anomeric carbon atom. Fructose that is free in solution is 67% D-fructopyranose and 33% D-

fructofuranose. In each case, both α and β anomers are possible. The β-D-fructopyranose form that

predominates in aqueous solution is one of the sweetest chemicals known. It is commonly used as

sweetener. It is found in honey and high fructose corn syrup. Heating converts β-D-fructopyranose form

into the β-D-fructofuranose form, reducing the sweetness of the solution. That is why corn syrup with a

high concentration of fructose in the β-D-fructopyranose form is used as a sweetener in cold, but not

hot, drinks. Excessive fructose consumption has been linked to fatty liver, insulin insensitivity and obesity. On the

other hand, the furanose form predominates in many fructose derivatives. The most common anomer

of fructose in combined forms or in derivatives is β-D-fructofuranose. Similarly, the pyranose form of

ribose predominates in aqueous solution whereas its furanose form predominates in many ribose

derivatives such as DNA and RNA. Ribose that is free in solution is 75% pyranose and 25% furanose.

The use of Haworth projections may lead to the erroneous impression that furanose and pyranose rings

are planar. They are not planar because of the sp3-hybridized tetrahedral geometry of their saturated

carbon atoms. The nonplanar pyranose ring, like the substituted cyclohexane ring, may adopt two

classes of conformations, termed chair and boat because of the resemblance to these objects. The

substituents to the ring on the chair conformer can have two orientations: axial (ax) and equatorial (eq).

Axial groups are nearly perpendicular to the average plane of the ring, or parallel to its threefold rotational

axis, whereas equatorial groups are roughly parallel to this plane. Axial substituents are rather close-

fitting and sterically hinder each other if they emerge on the same side of the ring. In contrast,

equatorial substituents are staggered and therefore minimally crowded. The chair form of β-D-

glucopyranose is favored because its axial positions are occupied by hydrogen atoms, whereas the boat form is

disfavored because of steric hindrance. Note that β-D-glucose is the only D-aldohexose that can

simultaneously have all five non-H substituents in the equatorial position.

Like pyranose rings, furanose rings are not planar. They can be puckered so that four atoms are nearly

coplanar and the fifth atom is about 0.5 Å away from this plane. This conformation is called an envelope

form because the structure resembles an opened envelope with the back flap raised. In the ribose

moiety of most biomolecules, either C2 or C3 is out of the plane and on the same side as C5. These

conformations are called C2-endo and C3-endo, respectively. The remaining four atoms lie

approximately in a plane.

Glycosidic bonds

Addition of an alcohol functional group to the hemiacetal or hemiketal of a monosaccharide produces an acetal or ketal. When the alcohol

functional group is part of another sugar molecule, the bond produced is called a glycosidic bond. If it

connects the anomeric carbon atom of a carbohydrate to the oxygen atom of an alcohol, it is called an O-

glycosidic bond. In an acid-catalyzed condensation, the anomeric hydroxyl substituent of a sugar

reversibly reacts with alcohols to form α- and β-glycosidic bonds. This reaction represents the formation

of an acetal from a hemiacetal and an alcohol (a hydroxyl group). Individual monosaccharide units are

held together by O-glycosidic bonds to form long polysaccharide polymers. Therefore, the glycosidic bond is

the carbohydrate analog of the peptide bond in proteins. Some oligosaccharides are linked to proteins

by O-glycosidic bonds.

If a bond joins the anomeric carbon atom of a sugar to a nitrogen atom, as in glycoproteins and nucleotides,

it is called an N-glycosidic bond. N-glycosidic bonds link ribose residues to nitrogenous

bases in nucleosides. The reversal of this reaction is hydrolysis—attack by H2O on the glycosidic bond.

Glycosidic bonds are readily hydrolyzed by acid but resist cleavage by base. Disaccharides can be

hydrolyzed to yield their free monosaccharide components by boiling with dilute acid. Glycosidases

catalyze the hydrolysis of glycosidic bonds. Glycosidic bonds do not undergo mutarotation. Glycosidases

differ in specificity according to the identity and anomeric configuration of the glycoside.

Reactions with phosphates and amines

The biochemical properties of monosaccharides can be modified by reaction with other molecules such

as alcohols, amines and phosphates. Monosaccharides are modified by alcohols and amines through the

formation of glycosidic bonds. On the other hand, they are modified by phosphates through the

formation of phosphate ester bonds. Monosaccharides can also be modified by the addition of

functional groups to carbons other than the anomeric carbon. These modifications increase the

biochemical versatility of carbohydrates, enabling them to serve as signal molecules or facilitating their

metabolism. Monosaccharides in which a hydroxyl group in the parent compound is replaced with other

substituents are called sugar derivatives. Living organisms contain a wide variety of biologically

important sugar derivatives.

Reactions with phosphates

Monosaccharide units in which a phosphoryl group is transferred from ATP to a hydroxyl group of the

sugar are known as phosphorylated sugars. Phosphorylation is a common modification of sugars.

Phosphorylated sugars are critical intermediates in energy generation and biosynthesis. Phosphorylated

sugars are reactive intermediates that more readily undergo metabolism. The conversion of

glucose into glucose 6-phosphate is an example of metabolite activation. Besides, several multiply

phosphorylated derivatives of ribose play key roles in the biosynthesis of purine and pyrimidine

nucleotides. Phosphorylation not only activates sugars for subsequent chemical transformation but it is

also used as a regulatory mechanism. Phosphorylation makes sugars anionic at neutral pH which traps

the phosphorylated sugar inside the cell. Phosphorylated sugars are prevented from spontaneously

leaving the cell by crossing lipid bilayer membranes. Phosphorylated sugars are unable to interact with

transporters of the unmodified sugars. Most cells do not have membrane transporters for

phosphorylated sugars.

Reactions with amines

Monosaccharide units in which one or more hydroxyl groups are replaced by an often acetylated amino

group are called amino sugars. Substitution of an NH2 group for an OH group in the parent monosaccharide produces

an amino sugar. The hydroxyl group at C2 of the parent compound is replaced with an amino group in α-

D-glucosamine (2-amino-2-deoxy-α-D-glucopyranose), α-D-galactosamine (2-amino-2-deoxy-α-D-

galactopyranose), and D-mannosamine. D-glucosamine and D-galactosamine are critical components of

numerous biologically important polysaccharides. The amino group is nearly always condensed with

acetic acid, as in N-acetylglucosamine (GlcNAc, NAG), N-acetylgalactosamine (GalNAc), N-acetylmuramic

acid (Mur2Ac, NAM) and N-acetylneuraminic acid (Neu5Ac, NANA) which is also known as sialic acid

(Sia). NAM consists of D-lactic acid (a three-carbon carboxylic acid) ether-linked to the oxygen at C3

of NAG. NANA is derived from N-acetylmannosamine and pyruvic acid. NAG and NAM are prominent

components of the peptidoglycan of bacterial cell walls. N-acetylgalactosamine is the carbohydrate moiety

bound to the protein in mucins. NANA is an important constituent of glycoproteins and glycolipids.

Oxidation–reduction reactions

Mild oxidation of the carbonyl (aldehyde) carbon of aldoses into carboxylic acids, either chemically or

enzymatically, yields aldonic acids. The systematic name of an aldonic acid is formed by appending the suffix

-onic acid to the root name of the parent aldose. An example is D-gluconic acid. The hydroxyl group attached

to the carbon atom at the other end of the backbone chain, such as C6 of glucose, is a primary

alcohol. Oxidation of the primary alcohol group at the terminal carbon of aldoses yields uronic acids,

which are named by appending -uronic acid to the root name of the parent aldose. D-Glucuronic acid, D-

galacturonic acid and D-mannuronic acid are important components of many polysaccharides. At

physiologically important pH, the carboxylic acid groups of the acidic sugar derivatives are ionized. These

acidic sugars contain a carboxylate group, which confers a negative charge at neutral pH. Therefore, the

resulting compounds are also named as the carboxylates—glucuronate, galacturonate, and so forth.

Both aldonic and uronic acids form lactones. Lactones are stable intramolecular esters or cyclic esters.

D-Glucono-δ-lactone is formed by an ester linkage between the C1 carboxylate group and the C5 (also

known as the δ-carbon) hydroxyl group of D-gluconate.

Mild reduction of the carbonyl carbon of aldoses and ketoses into hydroxyl groups, either chemically by

treatment with NaBH4 or enzymatically, yields acyclic polyhydroxy alcohols known as alditols. The

systematic naming of alditols involves appending the suffix -itol to the root name of the parent aldose.

Ribitol is a component of flavin coenzymes FMN and FAD. Glycerol and the cyclic polyhydroxy alcohol

myo-inositol are important lipid components. Xylitol is a sweetener that is used in “sugarless” gum and

candies. Reduction of D-glucose produces D-glucitol, also known as D-sorbitol.
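The three suffix rules described in this section (-onic acid, -uronic acid, and -itol appended to the root name of the parent aldose) can be summarized in a short sketch. The helper names are hypothetical, and the root is assumed to be the aldose name with its "ose" ending removed.

```python
def root(aldose):
    """Root name of an aldose: the name without its 'ose' ending."""
    if not aldose.endswith("ose"):
        raise ValueError("expected an aldose name ending in 'ose'")
    return aldose[:-3]


def aldonic_acid(aldose):
    """Product of oxidizing the aldehyde carbon (C1)."""
    return root(aldose) + "onic acid"


def uronic_acid(aldose):
    """Product of oxidizing the terminal primary alcohol group."""
    return root(aldose) + "uronic acid"


def alditol(aldose):
    """Product of reducing the carbonyl group."""
    return root(aldose) + "itol"


for sugar in ("glucose", "galactose", "mannose", "ribose", "xylose"):
    print(sugar, "|", aldonic_acid(sugar), "|", uronic_acid(sugar), "|", alditol(sugar))
```

The D or L prefix of the parent sugar is carried over unchanged and is left out of the helpers for brevity.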

Reducing sugars

Saccharides bearing free carbonyl carbon atoms that are capable of reducing relatively mild oxidizing

agents such as silver (Ag+), ferric (Fe3+) or cupric (Cu2+) ion are called reducing sugars. In the hemiacetal

(ring) form the carbonyl carbon (aldehyde or ketone) of reducing sugars cannot be oxidized to a carboxyl

group (aldonic acid). Oxidation of the anomeric carbon of the cyclic form which exists in equilibrium with

the linear form occurs only in the open chain form. This property is the basis of Tollens’ reaction, a

classic test for the presence of a reducing sugar. The reduction of Ag+ in an ammonia solution (Tollens’

reagent) yields a metallic silver mirror lining on the inside of the reaction vessel. Another qualitative test

for the presence of a reducing sugar is Fehling’s reaction. Reducing sugars reduce cupric ion (Cu2+) in

Fehling’s solution into cuprous ion (Cu+) while being oxidized to aldonic acid. The cuprous ion (Cu+)

produced under alkaline conditions forms a red cuprous oxide precipitate.

Reducing sugars can often nonspecifically react with a free amino group of proteins to form a stable

covalent bond. These modifications, known as advanced glycation end products (AGE), have been

implicated in aging, arteriosclerosis, and diabetes, as well as other pathological conditions. Glucose

reacts with hemoglobin to form glycosylated hemoglobin. Monitoring changes in the amount of

glycosylated hemoglobin is an especially useful means of assessing the effectiveness of treatments for

diabetes mellitus. Blood glucose concentration is commonly determined by measuring the amount of

H2O2 produced in the reaction catalyzed by glucose oxidase. In the reaction mixture, a second enzyme,

peroxidase, catalyzes reaction of the H2O2 with a colorless compound to produce a colored compound,

the amount of which is then measured spectrophotometrically.

Deoxy sugars

Monosaccharide units in which an OH group is replaced by H are known as deoxy sugars. Deoxy sugars

are found in plant polysaccharides and in the complex oligosaccharide components of glycoproteins and

glycolipids. The most important deoxy sugar is β-D-2-deoxyribose. It is the sugar part of the sugar–

phosphate backbone of DNA. L-Rhamnose (6-deoxy-L-mannose) and L-fucose (6-deoxy-L-galactose, Fuc)

are widely occurring components of biologically important polysaccharides. Note that most deoxy

sugars occur in nature as the L isomers.

Oligosaccharides

Oligosaccharides are short polymers of monosaccharide units that are often attached to proteins or

lipids at the cell surface. They have both structural and regulatory functions. Some oligosaccharides,

with six or more different sugars connected in branched chains, carry information. They function as

specific cellular signals and biological markers to provide highly specific points of recognition in many

cellular processes. Disaccharides such as maltose, lactose, and sucrose are the smallest and most

abundant oligosaccharides. They are formed from two monosaccharides joined covalently by an O-

glycosidic bond. An O-glycosidic bond is formed when an alcohol (OH) of one monosaccharide

condenses with the intramolecular hemiacetal anomeric carbon of another monosaccharide with

elimination of H2O.

Nomenclature

A systematic name is required to describe complex oligosaccharides unambiguously. It involves

identifying all the component monosaccharides, specifying their anomeric forms, their ring types and

how they are linked together. By convention, the first and last monosaccharide units are located at the

left (nonreducing end) and the right (reducing end) respectively. The end of an oligosaccharide chain

with a free anomeric carbon (one not involved in a glycosidic bond) is called the reducing end and it

contains the reducing residue. Three-letter abbreviations for the monosaccharides are often used in

systematic names.

The three-letter codes for the common sugars include ribose (Rib), xylose (Xyl), arabinose (Ara), glucose

(Glc), mannose (Man), galactose (Gal), rhamnose (Rha), fructose (Fru), fucose (Fuc) and abequose (Abe).

Sugar derivatives can follow the same nomenclature. Acids can be described as glucuronic acid (GlcA),

iduronic acid (IdoA), muramic acid (Mur) and neuraminic acid (Neu). Simple amines have identifying N

such as glucosamine (GlcN) and galactosamine (GalN). On the other hand, acetylated amines can be

described as N-acetylglucosamine (GlcNAc), N-acetylgalactosamine (GalNAc), N-acetylmuramic acid

(Mur2Ac) and N-acetylneuraminic acid (Neu5Ac).

There are four useful conventional rules for constructing the systematic name of a complex oligosaccharide.

First, name the configuration of the anomeric carbon of the nonreducing residue. This is specified by

either α- or β-form. Second, name the first monosaccharide unit and its ring type using the prefix

“furano” or “pyrano” to distinguish five- and six-membered rings. Third, name the carbon atoms

involved in the glycosidic bond linking together the first and the second monosaccharide units. Indicate

in parentheses using an arrow to connect the two numbers for the carbon atoms. For example, (1→4)

indicates that C1 of the first-named sugar residue is joined to C4 of the second. Fourth, name the second

monosaccharide unit following the same rules as before.
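The four rules above can be sketched as a small Python helper that assembles the systematic name of a disaccharide. This is only an illustration: the `Residue` class, its field names and the `systematic_name` function are hypothetical and are not part of any standard nomenclature tool. The `reducing` flag selects the "-ose" ending for a reducing sugar and the "-oside" ending used when nonreducing disaccharides are named as glycosides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    """One monosaccharide unit (hypothetical helper type for this sketch)."""
    anomer: str  # configuration of the anomeric carbon: "α" or "β"
    series: str  # "D" or "L"
    root: str    # sugar root, e.g. "gluco", "galacto", "fructo"
    ring: str    # ring type: "pyran" (six-membered) or "furan" (five-membered)


def systematic_name(first, linkage, second, reducing=True):
    """Build a disaccharide name from left (nonreducing end) to right."""
    c_first, c_second = linkage
    # Rules 1 and 2: anomeric form, name and ring type of the first unit.
    head = f"O-{first.anomer}-{first.series}-{first.root}{first.ring}osyl"
    # Rule 3: the carbon atoms joined by the glycosidic bond, in parentheses.
    bond = f"({c_first}→{c_second})"
    # Rule 4: the second unit, named in the same way.
    ending = "ose" if reducing else "oside"
    tail = f"{second.anomer}-{second.series}-{second.root}{second.ring}{ending}"
    return f"{head}-{bond}-{tail}"


alpha_glc = Residue("α", "D", "gluco", "pyran")
beta_glc = Residue("β", "D", "gluco", "pyran")
beta_gal = Residue("β", "D", "galacto", "pyran")
beta_fru = Residue("β", "D", "fructo", "furan")

maltose = systematic_name(alpha_glc, (1, 4), alpha_glc)
lactose = systematic_name(beta_gal, (1, 4), beta_glc)
sucrose = systematic_name(alpha_glc, (1, 2), beta_fru, reducing=False)
print(maltose)
print(lactose)
print(sucrose)
```

The three printed names correspond to the systematic names given in this section for maltose, lactose and sucrose.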

The most common glucosyl–glucose disaccharides include maltose [O-α-D-glucopyranosyl-(1→4)-α-D-glucopyranose],

isomaltose [O-α-D-glucopyranosyl-(1→6)-α-D-glucopyranose], trehalose [O-α-D-glucopyranosyl-(1→1)-α-D-glucopyranoside]

and cellobiose [O-β-D-glucopyranosyl-(1→4)-β-D-glucopyranose]. Isomaltose contains the α-(1→6) linkage found at the

branch points of amylopectin. Fragments of amylopectin that retain such a branch point (limit dextrins) are not easily

digested; they are hydrolyzed by α-dextrinase (debranching enzyme). Trehalose, Glc (α1↔α1) Glc, is a nonreducing sugar.

It is a major constituent of the circulating fluid (hemolymph) of insects, serving as an energy-storage compound.

Maltose

Maltose is a disaccharide formed from two molecules of D-glucose joined by a glycosidic bond between

C1 (the anomeric carbon) of one glucose residue and C4 of the other. The configuration of the anomeric

carbon atom forming the glycosidic bond is α. Hence, the abbreviated systematic name for maltose is

Glc (α1→4) Glc. Since maltose retains a free anomeric carbon (C1) of the second glucose residue on the

right, it is a reducing sugar. Maltose is hydrolyzed by maltase, an α-(1→4)-glucosidase of the intestinal brush

border; a related acid maltase is found in lysosomes. Enzymatic hydrolysis of starch produces maltose.

Lactose

Lactose or milk sugar is a disaccharide formed from one molecule of D-galactose and one molecule of D-

glucose joined by a glycosidic bond between C1 of galactose residue and C4 of the glucose residue. The

configuration of the anomeric carbon atom forming the glycosidic bond is β. Hence, the systematic

name for lactose is [O-β-D-galactopyranosyl-(1→ 4)-β-D-glucopyranose] or in short Gal (β1→4) Glc. The

free anomeric carbon of its glucose residue makes lactose a reducing sugar. Lactose is hydrolyzed to its

component monosaccharides for absorption into the blood stream by the intestinal enzyme lactase or β-

D-galactosidase. Infants normally express lactase. However, most African and almost all Asian adults

have very low levels of lactase. Consequently, much of the lactose in milk products moves through their

digestive tract to the large intestine. Lactose is a good source of energy for bacterial fermentation in the colon.

Bacterial fermentation converts lactose into irritating lactic acid and large quantities of CH4, CO2 and H2.

These gases cause flatulence. Lactate draws water into the intestine by osmosis, resulting in diarrhea.

The resulting abdominal cramps and diarrhea are the symptoms of lactose intolerance. Lactase-containing pills and

lactose-free milk products are now widely available.

Sucrose

Sucrose or table sugar is a disaccharide of one molecule of D-glucose and one molecule of D-fructose

joined by a glycosidic bond between C1 of the glucose residue and C2 of the fructose residue. Both

anomeric carbons take part in this bond, α for glucose and β for fructose. Hence, the systematic

name for sucrose is [O-α-D-glucopyranosyl-(1→2)-β-D-fructofuranoside] or in short Glc (α1↔2β) Fru.

Sucrose has no free anomeric carbon and hence it is a nonreducing sugar. Nonreducing disaccharides

are named as glycosides. Sucrose can also be represented as Fru (2β↔α1) Glc. Sucrose is the most

abundant disaccharide. It is the major intermediate product of photosynthesis throughout the plant

kingdom. Sucrose is synthesized in the cytoplasm and transported from the leaves to other parts of the

plant body. However, sucrose cannot be synthesized in animals. Sucrose is hydrolyzed to its component

monosaccharides D-glucose and D-fructose by sucrase or β-D-fructofuranosidase (Fru (2β→ α1) Glc).

The hydrolysis of sucrose is accompanied by a change in optical rotation from dextro to levo.

Consequently, hydrolyzed sucrose is sometimes called invert sugar and the enzyme that catalyzes this

process is archaically named invertase.
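The reasoning used above to label maltose and lactose as reducing sugars, and sucrose and trehalose as nonreducing sugars, can be written as a short check: a disaccharide is reducing when the anomeric carbon of its second residue is not part of the glycosidic bond. The sketch below is illustrative only. The function name and the small lookup table are hypothetical, and the table assumes the standard positions of the anomeric carbon (C1 in the aldoses glucose and galactose, C2 in the ketose fructose).

```python
# Position of the anomeric carbon in each residue (assumed standard values).
ANOMERIC_CARBON = {"Glc": 1, "Gal": 1, "Fru": 2}


def is_reducing(first, c_first, second, c_second):
    """Return True if the disaccharide keeps a free anomeric carbon.

    The first (left, nonreducing-end) residue contributes its anomeric
    carbon to the glycosidic bond. The sugar is reducing only if the
    second residue is linked through some other carbon.
    """
    if ANOMERIC_CARBON[first] != c_first:
        raise ValueError("the first residue must be linked through its anomeric carbon")
    return ANOMERIC_CARBON[second] != c_second


print(is_reducing("Glc", 1, "Glc", 4))  # maltose, Glc (α1→4) Glc
print(is_reducing("Gal", 1, "Glc", 4))  # lactose, Gal (β1→4) Glc
print(is_reducing("Glc", 1, "Fru", 2))  # sucrose, Glc (α1↔2β) Fru
print(is_reducing("Glc", 1, "Glc", 1))  # trehalose, Glc (α1↔α1) Glc
```

The check ignores the α or β configuration because only the identity of the linked carbons decides whether a free anomeric carbon remains.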

Polysaccharides

Polysaccharides or glycans are large polymers of monosaccharides linked together by glycosidic bonds

which can be made to any of the hydroxyls of a monosaccharide. Unlike proteins and nucleic acids,

polysaccharides can form branched as well as linear polymers. Given the variety of different

monosaccharides that can be put together in any number of arrangements, the number of possible

polysaccharides is huge. Homopolysaccharides and heteropolysaccharides consist of only one

type and more than one type of monosaccharide residue, respectively. Homopolysaccharides may be

further classified on the basis of their monomeric unit. Polymers of glucose are called glucans, whereas

polymers of galactose are called galactans.

Storage polysaccharides

Glucose is an important source of energy for virtually all life forms. However, free glucose molecules

cannot be stored in large amounts without upsetting the osmotic balance of the cell. Glucose is instead stored in the form of

readily accessible storage polysaccharides such as starch and glycogen. The storage of glucose as

polysaccharides greatly reduces the large intracellular osmotic pressures. Starch and glycogen are

glucans in which the component glucose molecules are linked together by α-1→4 glycosidic bonds. The

α-1→4 glycosidic bonds are responsible for the hollow helical structures of starch and glycogen. This

compact and accessible hollow helix is well-suited for storage.

Starch

Starch is the nutritional reservoir in plants. It is synthesized and deposited in the plastids (chloroplasts and amyloplasts) of plant

cells as insoluble granules composed of α-amylose and amylopectin. The α-amylose form is a linear

polymer of several thousand glucose residues linked by α-(1→4) glycosidic bonds. It is the unbranched

form of starch. Amylopectin is the branched form of starch containing about one α-(1→6) branching

point per approximately 27 α-(1→4) linkages. The α-amylose form adopts an irregularly aggregating

helically coiled conformation due to α-(1→4) glycosidic bonds. This left-handed helix has 6 regularly

repeating glucose residues per turn.

Starch is a major source of carbohydrates for animals. It is rapidly hydrolyzed by α-amylase, an enzyme

secreted by the salivary glands and the pancreas. Salivary and pancreatic α-amylases hydrolyze all α-(1→4)

glycosidic bonds of starch except the outermost bonds and those next to branches. Salivary α-amylase

is inactivated by the low pH in the stomach. These enzymes degrade starch to a mixture of the

disaccharide maltose, the trisaccharide maltotriose and the oligosaccharide dextrin. Maltotriose

contains three α-(1→ 4) linked glucose residues whereas dextrin contains three α-(1→ 4) linked glucose

residues and at least one α-(1→6) branch. Dextrin is the end product of the exhaustive digestion of

amylopectin by α-amylase. Maltose is cleaved into two glucose molecules by maltase. Maltotriose and

other undigested oligosaccharides are digested by α-glucosidase which removes one glucose residue at

a time from oligosaccharides. Dextrin is hydrolyzed by α-dextrinase or debranching enzyme which

hydrolyzes both α-(1→6) and α-(1→4) bonds. These specific enzymes are contained in the intestinal

brush border (the fingerlike microvilli of intestinal epithelial cells). The resulting monosaccharides are

absorbed by the intestinal membrane and transported to the bloodstream.

Glycogen

Glycogen is the major storage form of carbohydrate in animals which is synthesized and deposited in the

cytoplasm of animal cells as cytoplasmic granules of a large, branched polymer of glucose residues. It is

present in all cells but is most prevalent in skeletal muscle and liver cells. Most of the glucose units in

glycogen are linked by α-(1→4) glycosidic bonds. The branches are formed by α-(1→6) glycosidic bonds,

present about once in 12 units. Glycogen has a structure similar to that of amylopectin but it is more highly

branched. The highly branched structure of glycogen generates many nonreducing ends, permitting the

rapid mobilization of glucose in times of metabolic need. Glycogen phosphorylase phosphorolytically

degrades α-(1→4) bonds in glycogen sequentially inward from nonreducing ends to yield glucose-1-

phosphate. The remaining α-(1→6) branches of glycogen are cleaved by a debranching enzyme.
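The link between branching frequency and the number of nonreducing ends can be made concrete with a rough estimate. In a branched glucan with a single reducing end, every α-(1→6) branch point adds one more nonreducing end. The sketch below uses the branching frequencies quoted in this section (about one branch per 27 linkages in amylopectin and about one per 12 units in glycogen). The polymer size of 10,000 residues and the function name are hypothetical choices made only for illustration.

```python
def estimate_nonreducing_ends(n_residues, residues_per_branch):
    """Estimate the nonreducing ends of a branched glucan.

    An unbranched chain has one nonreducing end, and each α-(1→6)
    branch point contributes one additional nonreducing end.
    """
    branch_points = n_residues // residues_per_branch
    return branch_points + 1


n = 10_000  # hypothetical polymer size, the same for both molecules
amylopectin_ends = estimate_nonreducing_ends(n, 27)
glycogen_ends = estimate_nonreducing_ends(n, 12)
print(amylopectin_ends)  # 371
print(glycogen_ends)     # 834
```

Under these assumptions a glycogen molecule offers roughly twice as many sites for glycogen phosphorylase as an amylopectin molecule of the same size, which is consistent with its role in rapid glucose mobilization.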

Structural polysaccharides

Storage polysaccharides such as starch and glycogen are formed by α-(1→4) glycosidic bonds. In contrast,

structural polysaccharides such as cellulose and chitin are formed by β-(1→4) linkages. This simple

difference in stereochemistry results in two groups of molecules with very different structural properties

and biological functions. Cellulose is the primary structural component of plant cell walls. Chitin is an

important component of exoskeletons. Chitin is the second most abundant polysaccharide in the

biosphere next to cellulose.

Cellulose

Cellulose is a linear polymer of β (1→4)-linked D-glucose residues found in plant cell walls. Hence, it is an

unbranched glucan. The primary structure of cellulose fiber was determined through methylation

analysis whereas its secondary structure was determined by X-ray fiber diffraction. Each successive

β-(1→4)-linked glucose residue in a chain is flipped 180° with respect to the preceding residue, so

that the linear polymer assumes a straight, very long and fully extended conformation. Each

strand is held in this very long, thin and fully extended conformation by intrachain hydrogen bonds.

About 36 parallel straight chains arranged in an extended fashion are easily packed into a rigid almost

crystalline cellulose fiber. Parallel strands in each cellulose fibril or cellulose cable interact with one

another through intermolecular hydrogen bonds between glucose units of neighboring chains.

Multiple cellulose fibers line up laterally to form sheets, and these sheets stack vertically. The entire

assembly is stabilized by intramolecular and intermolecular hydrogen bonds. Because of this extensive

hydrogen bonding, cellulose is insoluble in water despite the highly hydrophilic nature of its constituent

D-glucose. This highly cohesive, hydrogen-bonded structure gives cellulose fibers exceptionally high

tensile strength suitable for a supportive function. The rigid plant cell wall not only withstands high

osmotic pressure but also has a load-bearing function.

In the plant cell wall, the fibrous cellulose structure is cross-linked by other polysaccharides and embedded in an

amorphous cementing matrix containing lignin and pectin. Lignin is a plastic-like phenolic polymer

derived from phenylalanine and tyrosine. Pectin is a polygalacturonic acid polysaccharide that gives

tomatoes and other fruits their firmness. It is naturally depolymerized by the enzyme polygalacturonase

(PG). As pectin is hydrolyzed, tomatoes soften and become less able to withstand the rigors of shipping. Inhibiting PG

can delay softening, facilitating storage and shipment. Cellulose also occurs in the stiff outer mantles of

marine tunicates.

Vertebrates, including herbivorous mammals, do not possess any enzyme that hydrolyzes the β-(1→4)-

glycosidic bonds of cellulose. Hence, on their own they are unable to digest wood and vegetable fibers. However, herbivores and

termites contain symbiotic microorganisms that secrete a series of enzymes, collectively known as

cellulases. Since cellulose is tightly packed and the D-glucose units are not easily accessible for enzymes,

the degradation of cellulose is a very slow process. Cellulose and other insoluble fibers can minimize

exposure time to toxins in the diet by increasing the rate at which digestion products pass through the

large intestine. On the other hand, pectin (polygalacturonic acid) and other soluble fibers allow

improved digestion and absorption of nutrients by slowing down the movement of food through the

gastrointestinal tract.

Glycoproteins

Glycoproteins are conjugate proteins formed by the covalent attachment of carbohydrate groups to

proteins. Most secreted eukaryotic proteins are glycoproteins, including blood proteins such as

antibodies and hormones, milk proteins such as lactalbumin, and some of the proteins secreted by the

pancreas such as ribonuclease. Proteins contained in lysosomes are also glycosylated. Besides, almost all

membrane-associated eukaryotic proteins are glycosylated. Indeed, protein glycosylation is more abundant than all

other types of posttranslational modifications combined. Glycosylation greatly increases the complexity

of the proteome. No generalization can be made about the effects of glycosylation on protein

properties; they must be experimentally determined on a case-by-case basis. In many cases the

functions of the carbohydrate moieties of glycoproteins remain enigmatic.

Nevertheless, it is becoming increasingly evident that oligosaccharides tend to extend from the surfaces

of proteins rather than participate in their internal structures owing to their hydrophilic character. Both

experimental and theoretical studies indicate that oligosaccharides have mobile and rapidly fluctuating

conformations which account for the difficulty in crystallizing glycoproteins. Structures of most

glycoproteins are unaffected by the removal of their associated oligosaccharides. Glycosylation can

affect protein properties in many ways, including protein folding, oligomerization, physical stability,

specific bioactivity, rate of clearance from the bloodstream, and protease resistance.

N- and O-linked glycoproteins

Oligosaccharides in glycoproteins can be linked by glycosidic bond formed either to the amide nitrogen

atom in the side chain of asparagine (termed an N-linkage) or to the oxygen atom in the side chain of

serine or threonine (termed an O-linkage). N-linked glycans are around 5-fold more common than O-

linked glycans. Sequence analyses of glycoproteins showed that the amide nitrogen of an Asn residue is

β-linked to an N-acetylglucosamine (NAG) residue of an oligosaccharide only if Asn is part of an Asn-X-Ser or Asn-X-Thr sequence,

where X is any amino acid residue except Pro. N-linked glycans tend to attach to proteins at sequences

that form β-bends. All N-linked oligosaccharides have a distinctive pentasaccharide core containing

branched (Man)3 (NAG)2 residues. The innermost pentasaccharide core serves as a common foundation

for attachment of additional sugars to form a wide variety of N-linked oligosaccharide patterns found in

glycoproteins.
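The Asn-X-Ser/Thr rule described above (X being any residue except Pro) lends itself to a simple sequence scan. The following Python sketch lists candidate N-glycosylation sites in a one-letter amino acid sequence. The function name and the example sequence are hypothetical. As the text notes, the motif is a requirement for N-linkage rather than a guarantee, so the positions returned are only candidates.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr. The lookahead makes the
# match zero-width so that overlapping motifs are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr motifs."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Hypothetical peptide: N2-G-S and N10-L-T fit the rule, N6-P-T does not.
example = "MNGSANPTQNLT"
print(find_sequons(example))  # [2, 10]
```

A lookahead pattern is used instead of a plain match because two motifs can overlap, as in the sequence NNTS, where both Asn residues satisfy the rule.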

Conversely, O-linked oligosaccharides tend to be clustered into segments of polypeptide chains rich in

Ser, Thr, Pro and other helix-breaking residues. Hence, O-linked glycans tend to attach to proteins at

sequences that form intrinsically disordered regions. The carbohydrates’ hydrophilic and steric

interactions further stabilize the extended conformations of heavily glycosylated regions. The most

common O-linked attachment involves the disaccharide core β-galactosyl-(1→3)-α-N-acetylgalactosamine

α-linked to the OH group of either Ser or Thr. Other hydroxyl-bearing amino acid

side chains such as Tyr, 5-hydroxy-Lys (Hyl) and 4-hydroxy-Pro (Hyp) occasionally form O-glycosidic

bonds. O-linked glycoproteins often have protective functions.

Synthesis

The protein moieties of glycoproteins are synthesized under genetic control. In contrast, the

carbohydrate moieties are enzymatically synthesized. Endoplasmic reticulum (ER) and Golgi complex

(GC) are the major organelles that play central roles in protein glycosylation and trafficking. The

polypeptide chain, synthesized by ribosomes attached to the cytoplasmic face of the ER, is either passed

into the lumen or inserted into the ER membrane. Proteins in the lumen of the ER and in the ER

membrane are transported to the Golgi complex. Synthesis of the carbohydrate chain for the N–linked

glycosylation begins in the ER and continues in the GC, whereas the O-linked glycosylation takes place

exclusively in the GC. Carbohydrate units of glycoproteins are altered and elaborated in the Golgi

complex.

The endoplasmic reticulum is the most extensive membrane in the cell that forms compartments for the

synthesis of lipids and proteins. The rough endoplasmic reticulum is studded with ribosomes that are

engaged in the synthesis of proteins that are either membrane-bound or destined for secretion. The ER

lumen has an oxidizing environment for the formation of disulfide bonds in proteins destined for

secretion. The smooth endoplasmic reticulum is the site of lipid synthesis. Dolichol phosphate is a

specialized lipid molecule, located in the ER membrane, that contains about 20 isoprene (C5) units.

Isoprene units are also key building blocks for many important biomolecules such

as steroids and secondary metabolites in all life forms. The terminal phosphate group of the dolichol

phosphate is the site of attachment and assembly of an oligosaccharide destined for attachment to the

asparagine residue of a protein. This activated (energy-rich) form of the oligosaccharide is subsequently

transferred to a specific asparagine residue of the polypeptide chain by an enzyme located on the

lumenal side of the ER.

The Golgi complex is a stack of flattened membranous sacs that functions as a major sorting center of

the cell for targeting proteins to lysosomes, secretory vesicles, and the plasma membrane. Different sets

of vesicles transfer proteins from the endoplasmic reticulum to the cis-face of the Golgi complex, from

one compartment of the Golgi complex to another, and from the trans-face of Golgi complex to target

sites. Proteins proceed from the Golgi complex to different target sites, according to signals encoded

within their amino acid sequences and three-dimensional structures. Complex oligosaccharides are

synthesized in the Golgi complex through the action of glycosyltransferases and activated sugar nucleotides,

such as UDP-glucose. Glycosyltransferases catalyze the formation of glycosidic bonds using

carbohydrates donated by activated sugar nucleotides.

Classification

There are three classes of glycoproteins. The first class, in which the protein constituent is the largest

component by weight, is simply referred to as glycoproteins. Many glycoproteins are components of cell

membranes, where they take part in cell adhesion such as the binding of sperm to egg and cell-cell

communication. The second class of glycoproteins comprises the proteoglycans. In proteoglycans, the

core protein component is conjugated to a particular type of polysaccharide called glycosaminoglycan

(GAG). The GAG moiety commonly makes up a much larger percentage by weight of the proteoglycan

and it determines the structures and biological activities of proteoglycans. The oligosaccharides in

glycoproteins are smaller, more structurally diverse and less monotonous than the glycosaminoglycan chains

of proteoglycans. A third class of glycoproteins is the mucins (mucoproteins). Mucins are glycoprotein

components of mucus. They are abundant in saliva where they function as lubricants.

Glycoproteins

Glycoproteins are carbohydrate-protein conjugates in which the carbohydrate moieties are small but

structurally diverse. Glycoproteins have one or several oligosaccharides of varying complexity joined

covalently to a protein. They are found on the outer face of the plasma membrane, in the extracellular

matrix, and in the blood. Inside cells they are found in specific organelles such as Golgi complexes,

secretory granules, and lysosomes. They are rich in information, forming highly specific sites for

recognition and high-affinity binding by other proteins.

RNase B

One of the simplest glycoproteins is bovine pancreatic ribonuclease B (RNase B), which differs from the

carbohydrate-free RNase A only by a single N-linked oligosaccharide chain. The oligosaccharide

does not affect the native enzyme’s conformation, substrate specificity, or catalytic properties.

However, RNase A folds to its native state more slowly than does RNase B and tends to aggregate. This

suggests that the oligosaccharide functions similarly to a molecular chaperone, most likely by shielding a

hydrophobic patch on the protein surface.

Erythropoietin

Erythropoietin is a vital hormone present in the blood serum that has dramatically improved treatment

for anemia. Erythropoietin (EPO) is secreted by the kidneys and stimulates the production of red blood

cells. EPO is composed of 165 amino acids and is N-glycosylated at three asparagine residues and O-

glycosylated on a serine residue. The mature EPO is 40% carbohydrate by weight, and glycosylation

enhances the stability of the protein in the blood. Artificial EPO can be distinguished from natural EPO in

athletes by detecting differences in their glycosylation patterns through the use of isoelectric focusing.

Glycophorin A

Glycophorin A is one of the best-characterized membrane glycoproteins of red blood cells. It contains

60% carbohydrate by mass, in the form of 16 oligosaccharide chains covalently attached to the N-

terminal domain of the membrane protein. Fifteen of the oligosaccharide chains are O-linked to Ser or

Thr residues, and one is N-linked to an Asn residue. Plasmodium falciparum invades erythrocytes by

using a glycan-binding protein to bind to the carbohydrate moiety of glycophorin A. Disrupting this

interaction is therefore of clinical interest as a possible way to block invasion.

Granulocyte–macrophage colony-stimulating factor

Human granulocyte–macrophage colony-stimulating factor (GM-CSF) is a 127-residue protein growth

factor that promotes the development, activation, and survival of the white blood cells known as

granulocytes and macrophages. It is variably glycosylated at two N-linked sites and five O-linked sites.

The lifetime of GM-CSF in the bloodstream increases with its level of glycosylation. However, GM-CSF

that is produced in E. coli and hence is unglycosylated (bacteria rarely glycosylate the proteins they

synthesize) has a 20-fold higher specific biological activity than does the naturally occurring

glycoprotein.

The sugar code

The sugar code refers to the use of carbohydrates as carriers of chemical information for molecular and

cell-cell interactions. Nucleic acids, proteins and oligosaccharides can be reasonably assumed to be built from 4 different

nucleotide subunits, 20 different common amino acids and 20 different basic monosaccharide units, respectively.

However, monosaccharides can be assembled into an almost limitless variety of oligosaccharides, which

differ in the stereochemistry and position of glycosidic bonds, the type and orientation of substituent

groups, and the number and type of branches. Therefore, oligosaccharides are by far the richest

information molecules in the cell. Carbohydrates surpass proteins in information density by two orders

of magnitude.

Molecular recognition

Glycoproteins are important constituents of plasma membranes. Eukaryotic cells have a thick and fuzzy

coating of glycoproteins and glycolipids named the glycocalyx that prevents close contact between

cells. Many cell-surface receptor proteins have relatively short and presumably stiff O-glycosylated

regions that link membrane-bound domains to the functional extracellular domains. This type of

arrangement extends the functional domains in a lollipop-like manner above the cell’s densely packed

glycocalyx. The oligosaccharide markers of membrane-bound and secreted glycoproteins mediate a

variety of intracellular and intercellular interactions.

Molecular tags

Cells tend to synthesize a large repertoire of a given glycoprotein, in which each variant species differs

somewhat in the sequences, locations, and numbers of its covalently attached oligosaccharides. The

many different glycosylated forms of a given protein with several potential glycosylation sites and

patterns are called glycoforms. Each glycoform can be generated only in a specific cell type, tissue type

or developmental stage. Thus, the species-specific and tissue-specific distribution of glycoforms that

each cell synthesizes endows it with a characteristic spectrum of biological properties. Similar “ticketing”

mechanisms govern the compartmentalization (molecular zip codes) and degradation (molecular timers)

of glycoproteins within cells. Therefore, a variety of glycoforms for a given glycoprotein ensures that it

has a range of cellular distributions and lifetimes. The protein component of glycoforms is identical but

the composition of the carbohydrate component is highly variable. This phenomenon, which compounds

the difficulties in the purification and characterization of proteins with different patterns of posttranslational modification (PTM), is

known as microheterogeneity.

Molecular zip codes

Glycosylation functions as a molecular zip code for identifying the particular spatial destination of a

protein during protein trafficking. Mannose-6-phosphate containing oligosaccharide marks newly

synthesized proteins in the Golgi complex for transfer to the lysosome. This marker is acquired in the

Golgi complex in a two-step process. First, GlcNAc phosphotransferase adds a phospho-N-

acetylglucosamine unit to the 6-OH group of a mannose. Next, an N-acetylglucosaminidase removes the

added sugar to generate a mannose 6-phosphate residue in the core oligosaccharide.

Molecular timer

Glycosylation functions as a molecular timer for specifying the age of a particular protein. The residues of

Neu5Ac (a sialic acid) situated at the ends of the oligosaccharide chains of many plasma glycoproteins

such as ceruloplasmin protect them from uptake and degradation in the liver. Removal of the sialic acid

residues by the enzyme sialidase is one way in which the body marks “old” proteins for destruction and

replacement. A similar mechanism is apparently responsible for removing old erythrocytes from the

mammalian bloodstream.

Nutrient sensing

Glycosylation can also function as a molecular sensor for identifying the energy status of the cell.

GlcNAcylation is an especially important glycosylation reaction involving the covalent attachment of N-

acetylglucosamine (GlcNAc) to serine or threonine residues of cellular proteins when nutrients are

abundant. The concentration of GlcNAc reflects the active metabolism of carbohydrates, amino acids

and fats. Glucose signals carbohydrate availability, acetate signals fatty acid availability, and nitrogen signals

protein availability. The combination indicates that nutrients are abundant. The reaction is catalyzed by

GlcNAc transferase. The GlcNAcylation sites are also potential phosphorylation sites. Hence, the O-

GlcNAc transferase and protein kinases may be involved in cross talk to modulate one another’s

signaling activity. Like phosphorylation, GlcNAcylation is reversible, with GlcNAcase catalyzing the

removal of the carbohydrate. Dysregulation of GlcNAc transferase has been linked to insulin resistance,

diabetes, cancer and neurological pathologies.

Lectins

The high density of information encoded by oligosaccharides provides the sugar code with an essentially

unlimited number of unique “words” small enough to be read and decoded by carbohydrate-binding

proteins called lectins. Lectins (Latin: legere, to select) are a special class of carbohydrate-binding

proteins that recognize one or more specific monosaccharides with particular linkages to other sugars in

oligosaccharides, usually with exquisite specificity. They were first discovered in plants but are now

known to occur in all organisms. Major functions of animal lectins include adhesion, cell–cell recognition

and targeting of newly synthesized proteins to specific cellular locations. For cell-cell recognition, lectins

on the surface of one cell interact with arrays of monosaccharides displayed on the surface of another

cell. In vertebrates, oligosaccharide tags read by lectins govern the rate of degradation of certain

peptide hormones, circulating proteins, and blood cells.

Protein–carbohydrate interactions typically include multiple weak noncovalent interactions that ensure

specificity yet permit reversible binding. Interactions include multiple hydrogen bonds, which often

include bridging water molecules, and the packing of hydrophobic sugar faces against aromatic side

chains. The carbohydrate-binding specificity of a particular lectin is determined by the amino acid

residues that bind the carbohydrate. A molecule of lectin usually contains two or more carbohydrate-

binding sites. Many sugars have a more polar and a less polar side. The more polar side interacts with

the lectin by hydrogen bonds, while the less polar side undergoes hydrophobic interactions with

nonpolar amino acid residues. Each interaction is weak, but the composite produces the high-affinity,

high-specificity binding that underlies the flow of information central to many physiological

processes.

Classification

Lectins can be grouped into different classes. The C-type (for calcium-requiring) lectins found in animals

use calcium ions as a bridge for direct interactions with OH groups on mannose residues of the sugar. C-

type lectins function in cell–cell recognition and receptor-mediated endocytosis, which is a process by

which soluble molecules are bound to receptors on the cell surface and subsequently internalized.

Selectins are a family of plasma membrane lectins that bind to different components of the immune

system. The L, E, and P forms of selectins bind specifically to carbohydrates on lymph-node vessels (L),

endothelium (E) and activated blood platelets (P), respectively. Selectins are members of the C-type

family of lectins. The L-type lectins are readily available in the seeds of leguminous plants, and many of the

initial biochemical characterizations of lectins were performed on L-lectins. Although the exact role of

lectins in plants is unclear, they can potentially serve as potent insecticides. Other L-type lectins, such as

calnexin and calreticulin, are prominent chaperones in the eukaryotic endoplasmic reticulum that

facilitate the folding of other proteins.

Applications

Purification of carbohydrates and glycoproteins is a prerequisite for carbohydrate analysis. Immobilized

lectins have been extensively used for the purification of carbohydrates and glycoproteins using affinity

chromatography. Concanavalin A (ConA) from jack bean, wheat germ agglutinin (WGA), and mannose-

binding protein A (MBP-A) are among the best-characterized lectins. ConA specifically binds α-D-glucose

and α-D-mannose residues, whereas WGA specifically binds β-N-acetylmuramic acid and α-N-

acetylneuraminic acid. WGA causes cells to agglutinate or clump together. MBP-A binds specifically to

a high-mannose octasaccharide. Glycoproteins can also be labeled with lectins that have been conjugated

(covalently cross-linked) to ferritin, an iron-storage protein that is readily visible in the electron

microscope. Such experiments with lectins of different specificities and with a variety of cell types have

demonstrated that the carbohydrate groups of membrane-bound glycoproteins are, for the most part,

located on the external surfaces of cell membranes.

Viral infection

Many pathogens gain entry into specific host cells by adhering to cell-surface carbohydrates. Influenza

virus, the causative agent of the respiratory tract infection influenza, binds to sialic acid residues present on cell-

surface glycoproteins of the host cell. Hemagglutinin is the viral protein that binds to these sugars. Viral

infection involves binding to the target cell, endocytosis, budding and release. After binding, the virus is

engulfed by the cell and begins to replicate. Viral assembly results in the budding of the viral particle

from the cell. Another viral protein, neuraminidase (sialidase), cleaves the glycosidic bonds between the

sialic acid residues and the rest of the cellular glycoprotein, freeing the virus to infect new cells, and thus

spreading the virus. Inhibitors of this enzyme such as oseltamivir (Tamiflu) and zanamivir (Relenza) are

important anti-influenza agents. The carbohydrate-binding specificity of viral hemagglutinin may play an

important role in species specificity of infection and ease of transmission. Avian influenza H5N1 (bird flu)

is especially lethal and spreads readily from bird to bird but only rarely to humans.

Bacterial infection

Helicobacter pylori, the causative agent of gastric ulcers, adheres to the inner surface of the stomach by

interactions between lectins on the bacterial membrane and specific oligosaccharides of membrane

glycoproteins of the gastric epithelial cells. The oligosaccharide recognized by H. pylori is part of the type O

blood group determinant. Consequently, people of blood type O show a severalfold greater incidence of gastric

ulcers than those of type A or B. Similarly, the cholera toxin produced by Vibrio cholerae enters intestinal

cells after binding to the pentasaccharide of ganglioside GM1, a membrane glycolipid on the surface of target

cells.

Blood groups

Some membrane lipids of eukaryotic cells have covalently bound carbohydrates in which a complex

oligosaccharide functions as the polar head group. Gangliosides are glycosphingolipids whose oligosaccharide head

groups contain sialic acid; oligosaccharides of this kind determine human blood groups. Blood groups are based on protein

glycosylation patterns. The human ABO blood groups illustrate the effects of glycosyltransferases on the

formation of glycoproteins. Each blood group is designated by the presence of one of the three different

carbohydrates, termed A, B, or O, attached to glycoproteins and glycolipids on the surfaces of red blood

cells.

These structures have in common an oligosaccharide foundation called the O (or sometimes H) antigen.

The A and B antigens differ from the O antigen by the addition of one extra monosaccharide, either N-

acetylgalactosamine (for A) or galactose (for B), through an α-1,3 linkage to a galactose moiety of the O

antigen. Specific glycosyltransferases add the extra monosaccharide to the O antigen. Each person

inherits the gene for one glycosyltransferase of this type from each parent. The type A transferase

specifically adds N-acetylgalactosamine, whereas the type B transferase adds galactose. The O

phenotype is the result of a mutation in the transferase gene that results in the synthesis of an inactive

enzyme. These structures have important implications for blood transfusions and other transplantation

procedures.
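
The inheritance rule described above can be sketched in a few lines of Python. This is only an illustration: the allele letters, the dictionary, and the function names are assumptions made for the example, and it relies on the well-established fact that the A and B transferases are codominant while the O allele yields an inactive enzyme.

```python
from itertools import product

# Sugar added to the O (H) antigen by the transferase encoded by each allele.
# None marks the inactive enzyme produced by the O allele.
ADDED_SUGAR = {"A": "N-acetylgalactosamine", "B": "galactose", "O": None}


def blood_type(allele_1, allele_2):
    """Return the ABO phenotype for two inherited alleles ('A', 'B' or 'O')."""
    # Only alleles encoding an active transferase contribute an antigen.
    antigens = sorted({a for a in (allele_1, allele_2) if ADDED_SUGAR[a] is not None})
    return "".join(antigens) or "O"


def offspring_types(parent_1, parent_2):
    """List the phenotypes possible for children of two genotypes such as 'AO' and 'BO'."""
    return sorted({blood_type(a, b) for a, b in product(parent_1, parent_2)})


print(blood_type("A", "O"))
print(offspring_types("AO", "BO"))
```

For example, a parent with genotype AO and a parent with genotype BO can have children of any of the four blood types, because each child receives one transferase gene from each parent.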

Cancer

Some pathological conditions such as cancer alter the glycosylation patterns of membrane proteins.

Normal cells stop growing when they touch each other, a phenomenon known as contact inhibition.

Cancer cells, however, are under no such control and therefore form malignant tumors. There are

significant differences in the distributions of cell-surface carbohydrate between cancerous and

noncancerous cells.

Inborn errors of glycosylation

Errors in glycosylation can result in pathological conditions. There is an entire family of severe inherited

human diseases called congenital disorders of glycosylation (CDG). These pathological conditions reveal

the importance of proper modification of proteins by carbohydrates and their derivatives. I-cell disease

(also called mucolipidosis II) is a lysosomal storage disease. Lysosomes are organelles that degrade and

recycle damaged cellular components or material brought into the cell by endocytosis. In patients with I-

cell disease, lysosomes contain large inclusion bodies of undigested glycosaminoglycans and glycolipids.

These inclusion bodies are present because the enzymes normally responsible for the degradation of

glycosaminoglycans are missing from affected lysosomes.

The responsible enzymes normally carry a mannose-6-phosphate residue as a component of an N-linked

oligosaccharide that serves as the marker directing the enzymes from the Golgi complex to lysosomes.

I-cell patients are deficient in the N-acetylglucosamine phosphotransferase catalyzing the first step in the

addition of the phosphoryl group; the consequence is the mistargeting of eight essential enzymes. The

enzymes are present at very high levels in the blood and urine since active enzymes are synthesized, but

in the absence of appropriate glycosylation, they are secreted instead of being exported to lysosomes. In

other words, in I-cell disease, a whole series of enzymes are incorrectly addressed and delivered to the

wrong location.

Proteoglycans

Proteoglycans are macromolecules comprising a core protein to which one or more glycosaminoglycan

chains are covalently attached. The protein components of proteoglycans such as aggrecan and

syndecan range in molecular weight from 40 to 200 kDa. The polysaccharide components of

proteoglycans, such as keratan sulfate and chondroitin sulfate, are called glycosaminoglycans (GAGs).

Glycosaminoglycans (mucopolysaccharides) are large, linear polymers of repeating disaccharide units.

One of the two monosaccharides is either N-acetylglucosamine or N-acetylgalactosamine. The other is

often a uronic acid, usually D-glucuronic or L-iduronic acid. Hence, GAGs are heteropolysaccharides. In

most GAGs, at least one of the hydroxyl groups of the amino sugar in the repeating unit is esterified with sulfate.

Consequently, GAGs are anionic, with a high density of carboxylate and sulfate groups. GAGs assume an

extended conformation in solution to minimize the repulsive forces among neighboring charged groups.

Many proteoglycans are secreted into the extracellular matrix, but some are integral membrane

proteins exposed at the cell surface. The main biological role of proteoglycans is the provision of

multiple binding sites, rich in opportunities for hydrogen bonding and electrostatic interactions. The

specific patterns of sulfated and nonsulfated sugar residues in GAGs generate specific molecular

recognition sites for a wide variety of ligands that bind to them by hydrogen bonding and electrostatic

interactions. GAGs also function as structural components and lubricants. They provide viscosity,

adhesiveness, and tensile strength to the extracellular matrix. The major glycosaminoglycan in animals is

hyaluronate. Others include chondroitin sulfate (CS), keratan sulfate (KS), dermatan sulfate (DS),

heparan sulfate (HS) and heparin. Other glycosaminoglycans differ from hyaluronate in two respects:

they are generally much shorter polymers and they are covalently linked to specific proteins as

proteoglycans. Chitin, found in exoskeletons, is also a glycosaminoglycan.

Hyaluronic acid

Hyaluronic acid molecules (also called hyaluronan) are composed of a large number of β (1→4)-linked

disaccharide units that consist of D-glucuronic acid and N-acetyl-D-glucosamine linked by β (1→3) bond.

At physiological pH, hyaluronic acid exists as hyaluronate anion that binds tightly to cations such as K+,

Na+, and Ca2+. The hyaluronate polyanion forms an extended, left-handed, single-stranded helix with 3

disaccharide units per turn stabilized by intramolecular hydrogen bonds.

Hyaluronate has structural features that suit its biological function. It is a rigid, highly hydrated

molecule with a high molecular mass and numerous mutually repelling anionic groups. Consequently,

hyaluronate solutions have a shear-dependent viscosity together with high tensile strength and elasticity. At low

shear rates, the hyaluronate molecules form tangled masses that greatly impede flow. At high shear

rates, the stiff, rodlike hyaluronate molecules tend to line up with the flow and thus offer less resistance

to it. This viscoelastic behavior makes hyaluronate solutions excellent biological shock absorbers and

lubricants. Hence, hyaluronate is an important GAG component of ground substance (the extracellular

matrix of cartilage and tendons), synovial fluid (the fluid that lubricates the joints), and the vitreous

humor of the eye. It also occurs in the capsules surrounding certain pathogenic bacteria.
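
The shear-dependent viscosity described above can be illustrated with a power-law (Ostwald-de Waele) fluid model, in which apparent viscosity falls as the shear rate rises whenever the flow index is below 1. This is a generic rheological sketch, not a model taken from these notes: the function name and the consistency and flow-index values are placeholders chosen for illustration, not measured properties of hyaluronate.

```python
def apparent_viscosity(shear_rate, consistency=10.0, flow_index=0.4):
    """Power-law apparent viscosity: consistency * shear_rate ** (flow_index - 1).

    consistency (Pa*s^n) and flow_index (dimensionless) are illustrative
    placeholders; a flow_index below 1 gives shear-thinning behavior.
    """
    if shear_rate <= 0:
        raise ValueError("shear_rate must be positive")
    return consistency * shear_rate ** (flow_index - 1.0)


# Viscosity is high at low shear (tangled chains) and low at high shear (aligned chains).
for rate in (0.1, 1.0, 10.0, 100.0):
    print(f"shear rate {rate:>6} 1/s -> apparent viscosity {apparent_viscosity(rate):.3f} Pa*s")
```

With any flow index below 1 the printed viscosities decrease monotonically, which is the qualitative behavior that makes such solutions useful as lubricants and shock absorbers.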

Some proteoglycans can form aggregates in the extracellular matrix. These aggregates are formed by

association of a hundred or more molecules of the core protein aggrecan, all bound to a single, very long

molecule of hyaluronate. Each aggrecan molecule is decorated by many covalently bound chondroitin

sulfate and keratan sulfate chains. Link proteins situated at the junction between each core protein and

the hyaluronate backbone mediate the core protein–hyaluronate interaction. The β (1→4)-linkages of

hyaluronic acid and other GAGs are hydrolyzed by hyaluronidase. This enzyme occurs in sperm cells and

a variety of animal tissues, in some pathogenic bacteria and in snake and insect toxins. In sperm cells, it

hydrolyzes an outer glycosaminoglycan coat around the ovum allowing sperm penetration. In

pathogenic bacteria, it renders animal tissues more susceptible to bacterial invasion.

Chondroitin-4-sulfate

Chondroitin-4-sulfate molecules (Greek: chondros, cartilage) are composed of D-glucuronic acid and N-

acetyl-D-galactosamine-4-sulfate joined by β (1→3)-bond. The disaccharide units are linked by β (1→4)-

bond. When the N-acetyl-D-galactosamine moiety is sulfated at the C6 position, the copolymer is

chondroitin-6-sulfate. These two types of chondroitin sulfates occur separately or in mixtures depending

on the tissue. Chondroitin-4-sulfate is a major component of cartilage, tendons, ligaments, and the walls

of the aorta. It contributes to the tensile strength of these connective tissues. Chondroitin sulfate is

connected to a Ser residue in the core protein via a typical trisaccharide linker. The xylose residue at the

reducing end of the linker is joined by its anomeric carbon to the hydroxyl of the Ser residue.

Dermatan-4-sulfate

Dermatan sulfate molecules (Greek: derma, skin) are polymers of L-iduronate and N-acetyl-D-

galactosamine-4-sulfate joined by an α (1→3)-bond. The disaccharide units are linked by β (1→4)-bonds.

Dermatan-4-sulfate differs from chondroitin-4-sulfate only by an inversion of configuration about C5 of

the β-D-glucuronate (GlcA) residues to form α-L-iduronate (IdoA). Enzymatic epimerization of these

residues occurs after the formation of chondroitin. The epimerization is usually incomplete, so dermatan

sulfate also contains glucuronate residues. Dermatan sulfate is found predominantly in skin, blood

vessels, and heart valves. It contributes to the pliability of skin.

Keratan-6-sulfates

Keratan sulfate molecules (Greek: keras, horn) are polymers mainly of alternating β (1→4)-linked D-

galactose and N-acetyl-D-glucosamine-6-sulfate residues. The disaccharide units are linked by β (1→3)-

bond. These polymers also contain small amounts of fucose, mannose, N-

acetylglucosamine, and sialic acid. Their sulfate content is also variable, so they form a very

heterogeneous group. Keratan sulfates are present in cartilage, bone, and cornea, as well as in a variety of

horny structures formed of dead cells: hair, nails, horn, hoofs, and claws.
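
The repeating units described so far can be collected into a small lookup table. The sketch below is only a convenience summary of the compositions and linkages stated in the text above; the class, field, and function names are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepeatUnit:
    first_sugar: str
    second_sugar: str
    within_unit: str    # linkage joining the two sugars of one disaccharide
    between_units: str  # linkage joining successive disaccharides


# Values transcribed from the descriptions of each glycosaminoglycan above.
GAGS = {
    "hyaluronate": RepeatUnit(
        "D-glucuronate", "N-acetyl-D-glucosamine", "beta(1->3)", "beta(1->4)"),
    "chondroitin-4-sulfate": RepeatUnit(
        "D-glucuronate", "N-acetyl-D-galactosamine-4-sulfate", "beta(1->3)", "beta(1->4)"),
    "dermatan sulfate": RepeatUnit(
        "L-iduronate", "N-acetyl-D-galactosamine-4-sulfate", "alpha(1->3)", "beta(1->4)"),
    "keratan sulfate": RepeatUnit(
        "D-galactose", "N-acetyl-D-glucosamine-6-sulfate", "beta(1->4)", "beta(1->3)"),
}


def has_uronic_acid(name):
    """True when the first sugar of the repeating unit is a uronic acid."""
    return GAGS[name].first_sugar.endswith("uronate")


def is_sulfated(name):
    """True when the amino sugar of the repeating unit carries a sulfate ester."""
    return "sulfate" in GAGS[name].second_sugar


for name in GAGS:
    print(f"{name}: uronic acid={has_uronic_acid(name)}, sulfated={is_sulfated(name)}")
```

Laid out this way, two exceptions stand out: keratan sulfate is the only entry without a uronic acid, and hyaluronate is the only one without sulfate.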

Heparin

Heparin molecules (Greek: hepar, liver) are variably sulfated polymers of mainly alternating α

(1→4)-linked L-iduronate-2-sulfate and N-sulfo-D-glucosamine-6-sulfate residues. The disaccharide units

are linked by α (1→4)-bond. Heparin contains primarily sulfated iduronic acid (IdoA) and a smaller

proportion of glucuronic acid (GlcA), and is generally highly sulfated and heterogeneous in length. It has

an average of 2.5 anionic sulfate groups per disaccharide unit, which makes it the most negatively

charged polyelectrolyte of any known biological macromolecule. Unlike other GAGs, heparin is not a

constituent of connective tissue, but occurs almost exclusively in the intracellular granules of a type of

leukocytes called mast cells. Mast cells are found near the walls of arterial blood vessels and on the

surfaces of endothelial cells especially in the liver, lungs, and skin. They release heparin and other

molecules from their dense granules into the extracellular space when triggered.
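
To get a feel for what 2.5 sulfate groups per disaccharide means, the charge of a chain can be estimated by counting one negative charge per sulfate and one per uronic acid carboxylate. This is a rough sketch under stated assumptions: every group is taken to be fully ionized at physiological pH, each disaccharide is assumed to contain exactly one carboxylate, and the chain length of 20 disaccharides is arbitrary. The function name is illustrative.

```python
def net_charge(disaccharides, sulfates_per_unit=2.5, carboxylates_per_unit=1.0):
    """Approximate net charge of a GAG chain, counting -1 per sulfate and per carboxylate."""
    if disaccharides < 0:
        raise ValueError("disaccharides must be non-negative")
    return -disaccharides * (sulfates_per_unit + carboxylates_per_unit)


# Heparin uses the average of 2.5 sulfates per disaccharide quoted in the text.
heparin = net_charge(20)
# An unsulfated chain of the same length (as in hyaluronate) for comparison.
unsulfated = net_charge(20, sulfates_per_unit=0)
print(heparin, unsulfated)
```

Under these assumptions a heparin chain carries about 3.5 negative charges per disaccharide, compared with 1 for an unsulfated chain of the same length.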

Heparin acts as an anticoagulant by increasing the rate of formation of irreversible complexes between

antithrombin III and thrombin, a serine protease essential for blood clotting. The strongly electrostatic

heparin binding causes antithrombin III to bind to and inhibit thrombin. Purified heparin is routinely

added to blood samples obtained for clinical analysis to prevent clotting.

Plasmodium falciparum, the parasitic protozoan that causes malaria, relies on glycan binding to infect

and colonize its host. Glycan-binding proteins of the parasitic form initially injected by the mosquito bind

to the glycosaminoglycan heparan sulfate on liver cells, initiating the parasite’s entry into the cell.

Heparan sulfate

Heparan sulfate is initially synthesized as a long polymer of alternating N-acetylglucosamine (GlcNAc)

and glucuronic acid (GlcA) residues. This simple chain is then enzymatically modified at specific regions.

First, some of the acetyl groups of the GlcNAc residues are replaced with sulfates by N-deacetylase/N-

sulfotransferase. This enzyme generates clusters of N-sulfated glucosamine (GlcN) residues. Second,

these clusters attract an epimerase, which converts GlcA to IdoA, and sulfotransferases that produce

sulfate esters at the C2 hydroxyl of IdoA and the C6 hydroxyl of N-sulfated GlcN. The resulting heparan

sulfate molecules are polymers of alternating α (1→4)-linked L-iduronate and N-acetyl-D-glucosamine-

6-sulfate residues. The disaccharide units are linked by α (1→4)-bonds. Heparan sulfate has highly

sulfated (S) domains alternating with domains having unmodified GlcNAc and GlcA residues (NA

domains).

Heparan sulfate is structurally similar to heparin but has a higher proportion of N-acetyl groups and

fewer N- and O-sulfate groups, arranged in a less regular pattern. The high density of negative charges in

heparan sulfate brings positively charged molecules of lipoprotein lipase into the vicinity and holds them

by electrostatic interactions as well as by sequence-specific interactions with S domains. Heparan sulfate

is a ubiquitous component of cell surfaces as well as of the extracellular matrix (basal lamina) in blood

vessel walls, epithelial cells, and brain. The basal lamina is a sheetlike layer that

separates organized groups of cells. Epithelial cells are the cells lining body cavities and free surfaces.

Heparan sulfates in proteoglycans bind to a variety of extracellular ligands and growth factors and

thereby modulate the ligands’ interaction with specific receptors of the cell surface. Growth factors are

proteins that function to induce the growth, differentiation, and migration of their specific target cells.

They are expressed in specific spatial and temporal patterns in embryos and adults. The S domains bind

specifically to extracellular ligands and signaling molecules to alter their biological activities. The change

in activity may result from three different mechanisms of action. The first mechanism is conformational

change induced by binding to S domains. The second mechanism is by the brokering action of S

domains. Brokering is the ability of adjacent domains of heparan sulfate to bind to two different

proteins and enhance protein-protein interactions by bringing them into close proximity. A third

mechanism is by acting as a coreceptor. The S domains bind to signaling molecules electrostatically to

maintain their high local concentrations and enhance their interaction with receptors on cell surfaces.

Extracellular matrix

In multicellular animals, the extracellular space surrounding cells inside tissues is filled with a gel-like

material called the extracellular matrix or the ground substance. It holds the cells together and provides

a porous pathway for the diffusion of nutrients and oxygen to individual cells. It is composed of an

interlocking meshwork of proteoglycans and fibrous proteins. Connective tissues such as cartilage,

tendon, skin, and blood vessel walls are composed of fibrous proteins such as collagen, elastin,

fibronectin, and laminin embedded in the ground substance. Some of these proteins are multiadhesive,

a single protein having binding sites for several different matrix molecules. Proteoglycans in the

extracellular matrix act as tissue organizers, influence the development of specialized tissues, and

regulate the extracellular assembly of collagen fibrils.

The proteoglycan aggrecan and the protein collagen are the major components of cartilage, which is

made largely of a cross-linked meshwork of collagen fibrils and proteoglycans. Aggrecans interact strongly

with collagen in the extracellular matrix of cartilage, contributing to the development and tensile

strength of this connective tissue. However, the characteristic resilience of cartilage results from its high

proteoglycan content. The triple helix of collagen provides structure and tensile strength, whereas

aggrecan serves as a shock absorber. Furthermore, GAGs are important components of synovial fluid

(the fluid that lubricates the joints), and the vitreous humor of the eye.

Aggrecan is one of the best-characterized proteoglycans of the extracellular matrix. The

protein component of aggrecan is a very large molecule. It has three globular domains named N-

terminal domain (domain 1), central domain (domain 2), and C-terminal domain (domain 3). Domain 1 is the

hyaluronic acid–binding domain, since it binds noncovalently to hyaluronic acid. This attachment is

stabilized by a link protein. Domain 2 has no particular known function. Domain 3 contains a lectin-like

module, which binds certain monosaccharide units. The highly extended region between globular

domains 2 and 3 is the site of glycosaminoglycan attachment. This linear region contains highly

repetitive amino acid sequences, which are sites for the covalent attachment of keratan sulfate and

chondroitin sulfate.

The core protein has three carbohydrate-binding regions: an N-terminal (inner) region,

an oligosaccharide-rich (central) region, and a C-terminal (outer) region. The N-terminal region binds

relatively few oligosaccharides, predominantly through the amide N atoms of specific Asn residues. It

overlaps with the globular domain 1. The central region serves as anchor point for keratan sulfate chains

through the side chain O atoms of Ser and Thr residues. It overlaps with the globular domain 2. The C-

terminal region mainly binds to chondroitin sulfate chains through the side chain O atoms of Ser

residues in Ser-Gly dipeptides via a galactose–galactose–xylose trisaccharide linker.

Many proteoglycans can form huge complexes. Aggrecans have a bottlebrush-like supramolecular

architecture. Many aggrecan subunits (the “bristles”) are noncovalently attached through their

first globular domain to a very long, central and filamentous hyaluronic acid “backbone”. Many aggrecan

monomers emerge laterally at regular intervals from opposite sides of a central hyaluronate filament.

Each aggrecan subunit (projection) is made up of a core protein to which several bushy chains of

keratan sulfate and chondroitin sulfate are covalently bound as protrusions. This heterogeneous assembly

accounts for the enormous molecular masses of aggrecans and for their high degree of polydispersity

(range of molecular masses).

A large amount of water is attracted and absorbed into the many negative charges of GAGs. Aggrecans

can cushion compressive forces because of this absorbed water. Water is squeezed from GAGs when

pressure is exerted and returns to GAGs when pressure is released. The application of pressure on

cartilage squeezes water away from these charged regions until charge–charge repulsions prevent

further compression. The return of water enables aggrecans to spring back after having been deformed, generating

a cushioning effect. Cartilage in the joints, which lacks blood vessels, is nourished by this flow of liquid

caused by body movements. This explains why long periods of inactivity cause joint cartilage to become

thin and fragile. The most common form of arthritis is osteoarthritis. It results from the loss of water

from proteoglycan with aging. Other forms of arthritis can result from the proteolytic degradation of

aggrecan and collagen in the cartilage. Mucopolysaccharidoses are a collection of diseases, such as

Hurler disease, that result from the inability to degrade glycosaminoglycans. They result in skeletal

deformities and reduced life expectancies.

Adhesion

Proteoglycans not only function as lubricants and structural components of the extracellular matrix, but

they also mediate adhesion of cells to the extracellular matrix, and bind factors that regulate cell

proliferation. Some proteoglycans provide points of adhesion, recognition, and information transfer

between cells, or between the cell surface and the extracellular matrix. Glycosaminoglycan-containing

macromolecules are also found on the cell surfaces. Proteoglycans at cellular surface mediate the

activities of various growth factors. The core proteins of membrane bound proteoglycans can be

grouped into three classes. These are integral membrane proteins, integral membrane lipoproteins and

peripheral membrane or extracellular matrix proteins.

Syndecan is a core protein and an integral membrane protein. It has an amino-terminal extracellular

domain and a single transmembrane domain. The amino-terminal domain on the extracellular surface is

covalently attached to three chains of heparan sulfate and two chains of chondroitin sulfate, each

attached to a Ser residue by trisaccharide linkers. It is anchored to the plasma membrane by a single

transmembrane helix. Glypicans are integral membrane lipoprotein core proteins. They are anchored to

the plasma membrane by a glycolipid, a derivative of the membrane lipid phosphatidylinositol.

Fibronectin is a peripheral multidomain core protein that can be released into the extracellular space

where it forms part of the basement membrane.

Molecular recognition

Molecular recognition between cells and proteoglycans of the extracellular matrix is mediated by

integral membrane proteins and extracellular matrix proteins. The overall cell-matrix interactions serve

not merely to anchor cells to the extracellular matrix but also to provide paths that direct the migration

of cells in developing tissue. Integrins are a family of integral membrane proteins that act both to attach

cells to extracellular matrix and to mediate signaling between the cell interior and the extracellular

matrix. They are heterodimeric proteins in which each subunit is anchored to the plasma membrane by

a single hydrophobic transmembrane helix. The large extracellular domains of the heterodimer combine

to form a specific binding site for extracellular proteins such as collagen and fibronectin. Integrins have

binding sites for a number of intracellular and extracellular macromolecules giving them the ability to

convey information in both directions across the plasma membrane.

Integrins interact with macromolecular components of the extracellular matrix and convey instructions

on adherence to the matrix or cell migration to the cytoskeletal system. The most common sequence

determinant for integrin binding to extracellular proteins is RGD (Arg–Gly–Asp). Integrins regulate many

processes including platelet aggregation, tissue repair, the activity of immune cells, and the invasion of

tissue by a tumor. Fibrinogen is converted into fibrin during blood clotting at the site of a wound.

Integrins also link extracellular proteins such as fibronectin to actin filaments on the cytoplasmic side of the

membrane. Fibronectin is an extracellular protein with binding sites for both integrins and

proteoglycans in the extracellular matrix. It has separate binding domains for different GAGs such as

heparan sulfate and fibrous proteins such as fibrin and collagen.
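
Because the RGD determinant mentioned above is a plain three-residue sequence, candidate integrin-binding sites can be located in a one-letter protein sequence with a simple string search. The sketch below is illustrative only: the function name is an assumption and the example sequence is made up, not taken from a real protein. The presence of RGD in a sequence does not by itself show that integrin binds there.

```python
def find_motif(sequence, motif="RGD"):
    """Return the 1-based start positions of every occurrence of motif in sequence."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to residue numbering
        start = sequence.find(motif, start + 1)
    return positions


# Made-up peptide used only to exercise the function.
example = "MKTAYRGDSPLLARGDVT"
print(find_motif(example))
```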

Fibroblast growth factor (FGF)

FGF is an extracellular signaling protein that stimulates cell division. It regulates a variety of critical

biological processes through the four FGF cell-surface receptors (FGFR1–4). FGF only binds to FGFRs in

complex with heparin or with heparan sulfate moieties of syndecan in cell surfaces of the target cells.

Heparan sulfate is joined to syndecan through a trisaccharide bridge commonly at a Ser residue in the

general sequence Ser–Gly–X–Gly. FGFR dimerization in solution requires the presence of heparan sulfate

proteoglycans in addition to FGF. The binding of FGF to heparin or heparan sulfate protects FGF from

degradation. The release of active FGF–FGFR complexes from the extracellular matrix by proteolysis

or by the partial degradation of heparan sulfate could be an important activation mechanism. Several other

growth factors interact similarly with proteoglycans.
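
The general attachment sequence Ser–Gly–X–Gly contains a wildcard position, so it is conveniently expressed as a regular expression. The sketch below, using Python's standard re module, lists candidate sites in a one-letter sequence; the function name and the example sequence are hypothetical, and a match only marks a candidate Ser, not a confirmed glycosylation site.

```python
import re

# S, G, any residue, G. The lookahead lets overlapping candidates be reported.
SGXG = re.compile(r"(?=(SG[A-Z]G))")


def candidate_sites(sequence):
    """Return (1-based position of Ser, matched tetrapeptide) for each Ser-Gly-X-Gly."""
    return [(m.start() + 1, m.group(1)) for m in SGXG.finditer(sequence.upper())]


# Made-up sequence used only to exercise the function.
example = "AASGAGDLSGSGSGEE"
print(candidate_sites(example))
```

The lookahead form is a deliberate choice: in repeats such as SGSGSG, a plain pattern would skip the second, overlapping site.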

Infection

Electrostatic and sequence specific interactions with S domains are central in the first step in the entry

of certain viruses (such as herpes simplex viruses HSV-1 and HSV-2) into cells. Lectins on the surface of

HSV-1 and HSV-2 (the causative agents of oral and genital herpes, respectively) bind specifically to

heparan sulfate on the cell surface as a first step in their infection cycle. Infection requires precisely the

right pattern of sulfation on the heparan sulfate polymer. Heterogeneity in the sulfation sequence gives different

chains different abilities to bind specific proteins.

Cell walls

Bacteria are surrounded by rigid cell walls. Proteoglycans are important components of bacterial and

fungal cell walls. Bacteria can be classified as gram-positive or gram-negative depending on whether or not

they take up gram stain. Gram-positive bacteria

have a thick cell wall surrounding their plasma membrane. Their cytoplasm is surrounded by a plasma

membrane and a peptidoglycan cell wall. In contrast, gram-negative bacteria such as Escherichia coli and Salmonella typhimurium have a thin cell wall covered

by a complex outer membrane. Their cytoplasm is surrounded by plasma membrane, periplasmic space,

peptidoglycan cell wall and outer membrane. The periplasmic space is an aqueous compartment that

lies between the plasma membrane and the peptidoglycan cell wall. It contains proteins that transport

sugars and other nutrients. The outer membrane functions as a barrier to exclude harmful substances

(such as gram stain). Characteristic antigens of bacteria, which are responsible for bacterial virulence,

are components of bacterial cell walls and capsules.

Peptidoglycans

Atomic force microscopy (AFM) is an imaging technique based on the variation in the force between a

probe that is several nanometers in diameter and a surface of interest as the probe is scanned over the

surface. It can be used to study the structure of bacterial cell walls. The bacterial cell wall is

hierarchically organized into peptidoglycan repeating units, peptidoglycan chains, and peptidoglycan cables.

Peptidoglycan chains are proteoglycans. The cell walls of both gram-positive and gram-negative bacteria

are made up of a peptidoglycan framework that completely encases the cell. Peptidoglycans or mureins

are covalently linked polysaccharide and polypeptide chains.

The peptidoglycan repeating unit of a single peptidoglycan chain is a disaccharide of β (1→4)-linked N-

acetylglucosamine (NAG) and N-acetylmuramic acid (NAM) and a tetrapeptide of L-Ala-D-isoglutamyl-L-

Lys-D-Ala. The disaccharide units are linked by β (1→4)-bond. The lactyl side chain of NAM forms an

amide bond with the branching tetrapeptide chain. The D-amino acids of peptidoglycans render them

resistant to proteases. However, lysozyme catalyzes the hydrolysis of the β (1→4) glycosidic linkage

between NAM and NAG. It is found in tears, mucus, and other vertebrate body secretions, as well as in

egg whites.

A long, linear chain of alternating NAG–NAM disaccharides forms a monotonous polysaccharide to

which short peptide branches are covalently attached. Multiple neighboring parallel peptidoglycan

chains are covalently cross-linked by a short connecting bridge through their tetrapeptide side chains to

form a right-handed helical peptidoglycan cable. The most common connecting bridge is pentaglycine

(Gly5) bridge that extends from the terminal carboxyl group of one tetrapeptide to the ε-amino group of

the Lys in a neighboring tetrapeptide. This bridge may sometimes contain different amino acid residues

such as Ala or Ser. Several helical peptidoglycan cables wrap about the plasma membrane to form the

rigid framework of a bacterial cell wall.

Penicillin

In 1928, Alexander Fleming noticed that the chance exposure of a bacterial culture plate to the mold

Penicillium notatum resulted in lysis of nearby bacteria. Penicillin kills bacteria by disrupting the normal

balance between cell wall biosynthesis and degradation. Penicillin contains a thiazolidine ring fused to a

β-lactam ring. It specifically binds to and inactivates enzymes that function to cross-link the

peptidoglycan strands of bacterial cell walls. Most bacteria that are resistant to penicillin secrete a β-

lactamase (penicillinase) enzyme, which inactivates penicillin by hydrolytically cleaving the amide bond

of its β-lactam ring to form penicillinoic acid.

Teichoic acids

The outer surfaces of gram-positive bacteria are covered by teichoic acids. Teichoic acids are polymers

of glycerol or ribitol linked by phosphodiester bridges often terminating in lipopolysaccharides. The

hydroxyl groups of this sugar–phosphate chain are substituted by D-Ala residues and saccharides such as

glucose or NAG. Teichoic acids are anchored to the peptidoglycans via phosphodiester bonds to the C6-

OH groups of their NAG residues.

O-Antigens

Lipopolysaccharides are the dominant surface feature of the outer membrane of gram-negative

bacteria. The outer membranes of gram-negative bacteria are composed of complex

lipopolysaccharides, proteins, and phospholipids that are organized in a complicated manner. They are

decorated with complex and often unusual polysaccharides known as O-antigens that uniquely mark

each bacterial strain and elicit an immune response from the host. O-antigens are subject to

rapid selection pressure for mutational alteration as part of the ongoing biological warfare between

pathogen and host. The mutations occur in the genes specifying the enzymes that synthesize the O-

antigens, generating new bacterial strains that are not recognized by the host.

Chitin

Chitin is a homopolymer of β (1→4)-linked N-acetyl-D-glucosamine residues. Hence, it is an unbranched

glycosaminoglycan. Chitin is the principal structural component of the exoskeletons of invertebrates

such as insects, crustaceans, and arachnids. The razor-sharp beaks of squid used for disabling and

consuming prey are made of extensively cross-linked chitin. Chitin is also a major cell wall constituent of

most fungi and many algae.

Mucins

Mucins or mucoproteins are O-linked glycoproteins possessing often sulfated and hence mutually

repelling carbohydrate chains of N–acetylgalactosamine. They are synthesized by specialized cells in the

tracheobronchial, gastrointestinal, and genitourinary tracts making them common in mucous secretions.

They play important roles in cell adhesion, the immune response and fertilization. At their physiological

concentrations, mucins form entangled networks that function as a protective barrier and lubricant for

various epithelial cells. The defining feature of the protein component of mucins is a region of the

protein backbone termed the variable number of tandem repeats (VNTR) region. The VNTR is rich in serine

and threonine residues that are extensively O-glycosylated. Dense

glycosylation of the VNTR forces mucins into an extended conformation. The Cys-rich domains and the

D domain facilitate the polymerization of many molecules. Mucins can be membrane-bound or

secreted. In the evolutionary struggle between pathogens and their hosts, mucins have evolved to

contain the target oligosaccharides of certain pathogens to function as decoys.

Carbohydrate analysis

Carbohydrate analysis involves the development of methods for analyzing the structure,

stereochemistry and function of complex oligosaccharides. Unlike nucleic acids and proteins,

oligosaccharides can be branched and joined by a variety of linkages, which complicates carbohydrate

analysis. Carbohydrate analysis requires four basic steps. First, a target glycoprotein or

lipopolysaccharide is purified from its natural or heterologous source. Second, oligosaccharides are

removed from their protein or lipid conjugates for further analysis. Third, the purified oligosaccharides

are hydrolyzed in strong acid to determine the relative compositions of the various monosaccharides.

Fourth, oligosaccharides are subjected to stepwise degradation with specific reagents that reveal the

position and stereochemistry of glycosidic bonds.

Highly purified lectins, attached covalently to an insoluble support, are useful reagents for detecting and

separating glycoproteins. Once the target glycoprotein is purified, the points of attachment and the

structures of its oligosaccharides can be systematically determined. Characterization of an

oligosaccharide requires elucidating the identities, anomers, linkages, and the order of its component

monosaccharides. The next step is to detach the oligosaccharide moieties from their protein or lipid

conjugates using purified glycosidases or lipases, respectively. N-linked oligosaccharides can be released

from proteins by peptide N-glycosidase F, which cleaves the N–glycosidic bonds.

Large oligosaccharides can also be converted into smaller, easily analyzable oligosaccharides chemically

or with sequence-specific endoglycosidases. Hydrolysis of oligosaccharides yields a mixture of

monosaccharides that can be modified for chromatographic separation, identification and

quantification. Establishing the complete structure of an oligosaccharide requires determination of

branching positions, the sequence in each branch, the configuration of each monosaccharide unit at

anomeric and other carbons, and the positions of the glycosidic links. Oligosaccharides can be

“sequenced” by a combination of methylation analysis and enzymatic degradation supported by mass

spectrometry and high-resolution NMR spectroscopy.

Methylation analysis

Methylation analysis locates glycosidic bonds by converting all free hydroxyls to acid-stable methyl

ethers: the intact oligosaccharide is treated with methyl iodide in a strongly basic medium. After this

exhaustive methylation, the methylated oligosaccharide is hydrolyzed in acid.

Unlike glycosidic bonds, methyl ethers not at the anomeric C atom are

resistant to acid hydrolysis. Consequently, if an oligosaccharide is exhaustively methylated and then

hydrolyzed, the free OH groups on the resulting methylated monosaccharides mark the former positions

of the glycosidic bonds. Methylated monosaccharides are often identified by gas–liquid chromatography

coupled to mass spectrometry.
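
The bookkeeping behind methylation analysis can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not a laboratory protocol: every residue is assumed to be an aldohexopyranose, so only the hydroxyls on C2, C3, C4 and C6 are tracked, and the residue labels (Glc-A, Glc-B, Glc-C) and the function name methylation_products are hypothetical.

```python
# Hydroxyl-bearing positions of an aldohexopyranose that are tracked here.
# C1 is excluded (a methyl group at the anomeric carbon is acid-labile) and
# C5 is excluded (its oxygen is part of the pyranose ring).
PYRANOSE_OH = {2, 3, 4, 6}


def methylation_products(linked_positions):
    """Predict the partially methylated monosaccharides after hydrolysis.

    linked_positions maps a residue label to the set of positions whose
    hydroxyl carried a glycosidic bond in the intact oligosaccharide.
    Returns label -> (methylated positions, free-OH positions).
    """
    products = {}
    for residue, linked in linked_positions.items():
        methylated = sorted(PYRANOSE_OH - linked)  # were free, now O-methyl
        free_oh = sorted(PYRANOSE_OH & linked)     # were linked, now free OH
        products[residue] = (methylated, free_oh)
    return products


# Hypothetical branched trisaccharide: Glc-A is attached to O4 of Glc-C
# and Glc-B is attached to O6 of Glc-C.
example = {"Glc-A": set(), "Glc-B": set(), "Glc-C": {4, 6}}
result = methylation_products(example)
```

In this sketch the two terminal residues come out fully methylated, whereas the branch-point residue keeps free hydroxyls at C4 and C6, the positions that formerly carried glycosidic bonds.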

Stepwise enzymatic degradation

Stepwise enzymatic degradation is the process of using exoglycosidases of known specificity to remove

residues one at a time from the intact oligosaccharide to determine the sequence of monosaccharide

units and potential branching points. Exoglycosidases are enzymes that hydrolyze specific

monosaccharides from the nonreducing ends of oligosaccharides (analogous to exopeptidases). The

enzyme β-1,4-galactosidase cleaves the terminal β-glycosidic bond exclusively at galactose residues,

whereas α-1,2-mannosidase does so with the α-anomers of mannose. Cleaving the intact

oligosaccharide with exoglycosidases of varying specificities often allows the deduction of the position

and stereochemistry of the linkages. The repetition of this process with the use of an array of enzymes

of different specificity will eventually reveal the sequence and configuration of anomeric carbons of the

oligosaccharide. The hydrolysis products can again be analyzed by mass spectrometry. However, the

processing enzymes are generally not available in sufficient purities to ensure the synthesis of uniform

products.
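
The logic of sequencing with exoglycosidases can be illustrated with a small Python sketch. It assumes a strictly linear chain and enzymes that recognize only the sugar and its anomeric configuration; the tetrasaccharide and the function name digest are hypothetical and chosen only for illustration.

```python
def digest(chain, sugar, anomer):
    """Trim matching residues from the nonreducing end of a linear chain.

    chain is a list of (sugar, anomer) tuples, nonreducing end first.
    Returns (remaining chain, number of residues released).
    """
    released = 0
    while released < len(chain) and chain[released] == (sugar, anomer):
        released += 1
    return chain[released:], released


# Hypothetical linear tetrasaccharide, nonreducing end first.
glycan = [("Gal", "β"), ("GlcNAc", "β"), ("Man", "α"), ("Man", "β")]

# An α-mannosidase releases nothing from the intact chain,
# because the terminal galactose blocks access to the mannose.
_, blocked = digest(glycan, "Man", "α")

# Sequential digestion exposes one residue after another.
after_gal, n_gal = digest(glycan, "Gal", "β")
after_glcnac, n_glcnac = digest(after_gal, "GlcNAc", "β")
after_man, n_man = digest(after_glcnac, "Man", "α")
```

Recording which enzyme releases a residue at each round, and which enzymes release nothing, gives the order and anomeric configuration of the residues from the nonreducing end inward.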

Mass spectrometry and NMR

Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) and other mass spectrometric

techniques are very sensitive methods for determining the mass of the molecular ion (the

entire oligosaccharide chain). Tandem mass spectrometry (MS/MS) reveals the mass of the molecular

ion and many of its fragments, which are usually the result of breakage of the glycosidic bonds. Although

all sugar isomers have identical molecular masses, they have characteristic fragmentation patterns.

NMR analysis of oligosaccharides of moderate size can provide detailed information about sequence,

linkage position, and anomeric carbon configuration.
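
The observation that sugar isomers cannot be told apart by molecular mass alone can be checked with a short calculation. The Python sketch below computes the neutral monoisotopic mass of an oligosaccharide from its residues, assuming one water is lost per glycosidic bond and that the chain contains at least one residue; the names formula_mass and oligosaccharide_mass are hypothetical, and adduct ions such as sodium are ignored.

```python
# Monoisotopic atomic masses in unified atomic mass units (u).
ATOM_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental composition of the free monosaccharides.
MONOSACCHARIDES = {
    "Glc": {"C": 6, "H": 12, "O": 6},
    "Gal": {"C": 6, "H": 12, "O": 6},
    "Man": {"C": 6, "H": 12, "O": 6},
    "GlcNAc": {"C": 8, "H": 15, "N": 1, "O": 6},
}


def formula_mass(formula):
    """Sum the atomic masses for an element -> count mapping."""
    return sum(ATOM_MASS[element] * count for element, count in formula.items())


WATER = formula_mass({"H": 2, "O": 1})


def oligosaccharide_mass(residues):
    """Neutral monoisotopic mass of a glycan with len(residues) - 1 bonds."""
    total = sum(formula_mass(MONOSACCHARIDES[name]) for name in residues)
    return total - WATER * (len(residues) - 1)


lactose = oligosaccharide_mass(["Gal", "Glc"])  # Gal and Glc are isomers
maltose = oligosaccharide_mass(["Glc", "Glc"])
```

Both disaccharides give the same value, which is why fragmentation patterns from tandem mass spectrometry or NMR data are needed to distinguish isomeric residues and linkages.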
