EDIOL_2009MAY08_MCP_TA_01

8/6/2019 EDIOL_2009MAY08_MCP_TA_01


Makoto Mizuno

Senior Engineer of MCU Development Dept. 1

MCU Technology Div., MCU Business Group

Renesas Technology Corp.

Prior to the advent of the on-chip flash memory microcontroller (flash MCU), mask ROM was used to store program code in microcontrollers that stored programs internally. Since modifying the program required modifying the mask in a mask ROM MCU, development times and development costs were serious issues in mask ROM microcontrollers. The flash MCU, which replaces mask ROM with flash memory, has been evolving recently because it radically reduces system development time and cost: users can re-program the embedded flash memory even after the MCU has been mounted on the product.

Embedded flash memory technology, which supplies program code to the CPU in a microcontroller, has developed along a different path from general-purpose flash memories. In addition, microcontrollers are subject to strong demands for real-time performance from users of embedded equipment applications. General-purpose flash memory, such as the NAND flash memory used for file storage, is aimed at low-cost, large-capacity applications. The technology trend in NAND flash memory is to place importance on a large number of rewrite cycles and high throughput rates in burst reading. There is little pressure to reduce latency times. In contrast, since supplying program code is the main use for the flash memory in flash MCUs, NOR flash memory is the mainstream here, and compatibility with the logic circuits with which it is combined is important for CPU performance.

In this article, we will first discuss the background for the required characteristics and the previous technology trends, and then we will discuss the new requirements of the next generations of this technology. Finally, we will tackle the technological issues that must be resolved to achieve the requirements presented.

Previous tech trends

Until now, not only microcontrollers but most integrated circuits have followed Moore's Law and have evolved greatly at each technology generation due to ever smaller feature sizes. What has driven this evolution is the need for improved cost-performance ratios. The most important aspect of performance in microcontrollers is computational performance. In the remainder of this section, we discuss flash memory technology from the standpoints of cost and computational performance.

Figure 1: Trends in high-end MCU flash capacity.

Figure 2: Trends in high-end MCU performance.

eetindia.com | EE Times-India

Technology trends in flash MCUs
MICROCONTROLLER
http://www.eetindia.co.in/

Cost

At each technology generation, flash microcontrollers can be divided into three categories: high-end products that focus on performance, generic products that balance cost and performance, and cost-conscious products that focus on cost. The cost of a microcontroller is strongly dependent on the chip (silicon) area required: in general, generic products have a chip area about one half that of high-end products, and cost-conscious products have a chip area about one half that of generic products. The evolution of semiconductor technology usually achieves a doubling of both gate density and on-chip memory capacity at each generation (Figure 1). This means that the generic products of the next generation usually achieve the performance of the high-end products of the previous generation, and the cost-conscious products of the following generation also achieve that level of performance. It is vital for the cost aspect of these technology trends that the same performance be achieved in one half the area at each generation due to reduced feature sizes. Since it is extremely difficult to achieve a significant reduction (that is, a 50 per cent reduction) in area with just circuit design efforts, easy compatibility with smaller feature sizes will become a critical point for flash memory technology in the future.
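The migration of performance levels between product categories can be checked with a small model. The sketch below is only an illustration under simplifying assumptions that are ours, not the article's: on-chip capacity is taken as proportional to chip area times density, density doubles exactly at each generation, and each category has exactly half the area of the one above it. The names `TIERS` and `relative_capacity` are hypothetical.

```python
# Product categories from the text, ordered from largest to smallest chip area.
TIERS = ["high-end", "generic", "cost-conscious"]


def relative_capacity(generation: int, tier: str) -> float:
    """Capacity relative to a generation-0 high-end product.

    Assumes density doubles per generation and area halves per category step.
    """
    density = 2.0 ** generation
    area = 0.5 ** TIERS.index(tier)
    return density * area


for generation in range(3):
    row = ", ".join(
        f"{tier}: {relative_capacity(generation, tier):.2f}" for tier in TIERS
    )
    print(f"generation {generation} -> {row}")
```

Under these assumptions, a generation-1 generic product and a generation-2 cost-conscious product both match the generation-0 high-end product, which is the progression the paragraph describes.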

Computational performance

Computational performance is increased by increasing the clock frequency and by improvements in CPU architecture. High-end products have achieved a 20-fold increase in performance over the last 10 years, and the requirement to continue this trend is to achieve a factor of 2.5 improvement at each technology generation (Figure 2).
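The two figures quoted here, a 20-fold gain over 10 years and a factor of 2.5 per generation, can be related with a short calculation. The sketch below assumes the gain compounds uniformly from one generation to the next; the resulting generation count and cadence are our inference and are not stated in the article. The function name `generations_needed` is hypothetical.

```python
import math


def generations_needed(total_gain: float, gain_per_generation: float) -> float:
    """Return g such that gain_per_generation ** g == total_gain."""
    return math.log(total_gain) / math.log(gain_per_generation)


total_gain = 20.0          # from the text: 20-fold increase over 10 years
gain_per_generation = 2.5  # from the text: factor of 2.5 per generation
years = 10.0

g = generations_needed(total_gain, gain_per_generation)
print(f"generations needed for a {total_gain:.0f}x gain: {g:.2f}")
print(f"implied years per generation: {years / g:.2f}")
```

With these inputs the result is a little over three generations in ten years, that is, roughly one generation every three years.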

Although it is necessary to provide an amount of program code storage capacity appropriate for the CPU performance level in the flash memory (code flash memory) that holds the program code (instructions, parameters, and data), rewriting of this memory is not required. In contrast, for data used in CPU calculations, the flash memory write speed is extremely slow compared to the clock speed, so this data is provided to the CPU from registers or SRAM. Portions of data that need to be saved temporarily can be written asynchronously to flash memory or EEPROM. Therefore, while the code flash memory, in which read performance is the main concern, is seen as the most important microcontroller flash memory, there is also some demand for data storage flash memory.

Throughout the period up to the point where microcontroller clock frequencies reached about 40 MHz (the 0.35 µm generation), the CPU and the flash memory were able to operate at the same clock frequency, thanks to the increases in basic device performance that came with the decreasing feature size at each generation. However, when the CPU clock frequency is 60 MHz or higher, it becomes extremely difficult to increase flash memory read speeds by extending existing technologies. At this point, microcontrollers with cache memory appeared, which did not need flash read speed improvements to meet the performance requirements. Both these microcontrollers and ones that adopted high-speed flash memory were able to meet the computational performance requirements at that generation. However, several problems are now becoming more severe: the growing difference between the operating speeds of the CPU and the flash memory, the increasing complexity of cache memory systems, and the increasing power consumption due to the use of multiple CPU cores. As a result, it is becoming important to reconsider microcontroller technologies so that they give priority to both power consumption and latency.

Next gen requirements

It is now necessary to improve microcontroller functionality and performance along with the evolution of semiconductor technologies to respond to user requirements. Since any elemental technology has only a limited range of applicability, it is important to judge the applicability of a technology at an appropriate time. In the following, we discuss three major requirements that have come to light along with the progress from earlier times to the current technological state.

Figure 3: Logic capability regarding flash technology.

Table 1: General trends in high-end MCU.

Reducing power consumption

Although the power consumption per unit area tends to increase with each generation, recently that increase has tended to be on the order of 1.3 to 2.0 times due to the increasing difficulty of Vdd scaling. Therefore, there have been cases where microcontroller computational performance has been limited by power consumption, and technologies that suppress power consumption have become important elements for increasing performance. At the same time, cache memories, which are responsible for performance improvements, account for 40 per cent of overall CPU power consumption, and technologies that simplify cache systems are now required.

Handling multiple CPU cores

Parallelism based on multiple CPU cores has become indispensable for improving computational performance. Since it has become extremely difficult to achieve the desired 2.5-fold performance increase at each generation with improvements in pipelining and higher clock rates in individual CPUs, we now design performance improvements with multiple-CPU parallelism and clock speed optimization. In parallel-processing systems, the CPU stall penalty is large, and technologies such as more complex branch prediction and latency minimization will become increasingly important.

EEPROM functionality

EEPROM has come to be used to store small amounts of data on the system board. Due to the recent trend towards implementation of applications as SoC (system-on-chip) devices, there are now increasing needs for microcontrollers that include EEPROM functionality. However, since the EEPROM cell structure differs from that of the flash memory cell due to the emphasis on write performance, there are technological problems with combining EEPROM on the same chip. Also, there will be needs for including multiple, independent EEPROMs in these devices starting in 2010, and it will be necessary to achieve simultaneous read and write operations; that is, methods for ameliorating interference will be required.

Next generation flash MCU

We will summarize the development issues for the next generation of flash memory technologies and propose directions for developing the next generation of flash MCUs, as shown in Table 1.

Cost: Reduced feature sizes and fusion with logic fabrication processes.

Microcontroller performance: Cache simplification, latency reduction, and fast, low-power read operations.

Achieving SoC requirements: Implementing EEPROM emulation (data flash) and operating multiple modules at the same time.

Cost: Matching with process feature size reduction

Core and logic supply voltages are falling due to the reductions in feature sizes. However, since current NOR flash memory requires a read word-line voltage in the 2.5 to 4.0 V range, this technology is at a disadvantage in both speed and power consumption. In contrast, the read voltage can be reduced with the split-gate flash memory cell, in which the read and write circuits can be separated as shown in Figure 3. Furthermore, the split-gate cell resolves problems that were difficult to solve for the conventional NOR-type flash cell, such as avoiding the over-erasure problem and improving write efficiency, and thus has features that make it advantageous for microcontroller embedded flash. Its range of applications is now widening. As for the memory element, two types of device have been proposed, the floating gate type and the charge trap type, and 100 MHz read operation has been achieved by using the charge trap type device. Therefore, we see the split-gate cell as promising for use in microcontroller flash memory.

Microcontroller performance: Selecting between cache-dependent and flash-dependent approaches

In microcontrollers that do not include a cache memory system, microcontroller performance directly depends on the flash memory operating frequency. However, when a cache is used, the CPU computational performance is strongly dependent on the cache. There are two candidates for designing a flash memory architecture that includes cache: complex cache system + slow flash (case 1) and simple cache + fast flash (case 2). The differences between these approaches appear as a difference in power consumption at the same computational performance, and differences also appear in the complexity of the cache control circuit and in the cache memory capacity (including hierarchical structures). Since the cache miss penalty is critical in case 1, case 1 devices are designed so that misses, including branch misses, do not occur, and case 2 devices are designed so that the cache miss penalty is minimized with high-speed flash.

Figure 4: Cache simplification study.

As a result, there is a tendency for case 1 devices to require large, complex cache systems. When a branch instruction is read, the power consumption differs depending on whether or not the code is in the cache, because the number of transistors involved differs for the same code execution as shown in Figure 4, and it also depends on the cache hit efficiency. Then, if the code flash is not made faster than 20 or 30 MHz, a flash memory access may require 8 cycles if the CPU operating frequency is 200 MHz or higher. In an application in which branch prediction is difficult and there is a flash access once every 5 to 7 system clocks, CPU stall countermeasures become more complex in case 1. From a study of the influence on power consumption and chip area, we learned that the case 2 design is advantageous from both aspects (Figure 4). Therefore, in the future, in the age of the increasingly pipelined and increasingly parallel microcontroller, as flash speeds increase, both pipeline operation and latency reduction measures will be advantageous technologies for cache simplification.
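The cycle counts quoted above follow from the ratio of the CPU clock to the flash read rate, and the case 1 versus case 2 trade-off can be illustrated with the standard average-access-time formula (hit time plus miss rate times miss penalty). The sketch below is a rough model only. The 200 MHz CPU clock and the 20 to 30 MHz and 100 MHz flash rates come from the article; the 25 MHz midpoint, the one-cycle hit time, and both miss rates are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def flash_access_cycles(cpu_mhz: float, flash_mhz: float) -> int:
    """CPU clock cycles spanned by one flash read, rounded up to whole cycles."""
    return math.ceil(cpu_mhz / flash_mhz)


def average_fetch_cycles(hit_cycles: int, miss_rate: float, miss_penalty: int) -> float:
    """Average cycles per fetch: hit time + miss rate * miss penalty."""
    return hit_cycles + miss_rate * miss_penalty


CPU_MHZ = 200.0  # CPU clock from the text

for flash_mhz in (20.0, 25.0, 30.0, 100.0):
    cycles = flash_access_cycles(CPU_MHZ, flash_mhz)
    print(f"{flash_mhz:5.0f} MHz flash -> {cycles} CPU cycles per access")

# Hypothetical miss rates, chosen only to illustrate the trade-off.
case1 = average_fetch_cycles(
    hit_cycles=1, miss_rate=0.02,
    miss_penalty=flash_access_cycles(CPU_MHZ, 25.0),   # complex cache, slow flash
)
case2 = average_fetch_cycles(
    hit_cycles=1, miss_rate=0.10,
    miss_penalty=flash_access_cycles(CPU_MHZ, 100.0),  # simple cache, fast flash
)
print(f"case 1 (complex cache, slow flash): {case1:.2f} cycles per fetch")
print(f"case 2 (simple cache, fast flash):  {case2:.2f} cycles per fetch")
```

With these assumed numbers the two cases land within a few per cent of each other even though the simple cache misses five times as often, because the fast flash cuts the miss penalty from 8 cycles to 2. The model says nothing about power consumption or chip area, which are the quantities compared in the study the article refers to.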

Achieving SoC requirements: Implementing EEPROM emulation and parallel operation in multiple flash modules

The inclusion of EEPROM functionality has become a vital requirement for high-end flash MCUs. In contrast to code flash, the EEPROM functionality only requires a capacity in the range 1/100 to 1/1000 of that of code flash and has only a minimal effect on computational performance. At the same time, however, including EEPROM cells on the same chip as logic and code flash memory makes fabrication extremely difficult and increases costs greatly. As a result, the EEPROM functionality is preferably implemented by modifying the operation of a cell that is actually identical to that of the code flash. The main differences between EEPROM and code flash specifications are that the write endurance (the number of write cycles tolerated) must be about 1000 times greater and that the time to write small units of memory (byte or word units) must be short. Access units and write speeds can be addressed by modifying the usage procedures, but write endurance is the most difficult issue. In one common technique, the read time requirements are relaxed and two cells are used to implement a single bit. This achieves the ability to write data from 100K to 500K times.
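One way to picture "modifying the usage procedures" is the append-only record log commonly used for EEPROM emulation on sector-erased flash. The sketch below is a toy model and not the author's design or the two-cells-per-bit technique: it simulates a single flash sector in memory, and the class names `FlashSector` and `EmulatedEeprom`, the 2-byte record layout, and the 16-byte sector size are all assumptions made for illustration. A production scheme would typically also alternate between at least two sectors so that data survives a power loss during compaction.

```python
class FlashSector:
    """Toy flash sector: erases to 0xFF, programming can only clear bits."""

    def __init__(self, size: int) -> None:
        self.data = bytearray([0xFF] * size)
        self.erase_count = 0

    def erase(self) -> None:
        for i in range(len(self.data)):
            self.data[i] = 0xFF
        self.erase_count += 1

    def program(self, offset: int, value: int) -> None:
        # Flash programming can only change bits from 1 to 0.
        self.data[offset] &= value


class EmulatedEeprom:
    """Byte-addressable store built from appended (address, value) records."""

    RECORD_SIZE = 2  # one address byte followed by one value byte

    def __init__(self, sector: FlashSector) -> None:
        self.sector = sector
        self.next_free = 0

    def _append(self, address: int, value: int) -> None:
        self.sector.program(self.next_free, address)
        self.sector.program(self.next_free + 1, value)
        self.next_free += self.RECORD_SIZE

    def _latest_records(self) -> dict:
        latest = {}
        for off in range(0, self.next_free, self.RECORD_SIZE):
            latest[self.sector.data[off]] = self.sector.data[off + 1]
        return latest

    def read(self, address: int) -> int:
        # The most recently appended record for an address wins.
        return self._latest_records().get(address, 0xFF)

    def write(self, address: int, value: int) -> None:
        if not 0 <= address < 0xFF:
            raise ValueError("address 0xFF is reserved as the erased marker")
        if self.next_free + self.RECORD_SIZE > len(self.sector.data):
            # Sector full: keep only the newest value per address, then erase.
            latest = self._latest_records()
            self.sector.erase()
            self.next_free = 0
            for addr, val in latest.items():
                self._append(addr, val)
            if self.next_free + self.RECORD_SIZE > len(self.sector.data):
                raise RuntimeError("sector too small for the live data set")
        self._append(address, value)


sector = FlashSector(size=16)  # room for 8 records
eeprom = EmulatedEeprom(sector)
for i in range(20):
    eeprom.write(0x01, i)      # 20 byte-sized updates of one logical address
print(eeprom.read(0x01), sector.erase_count)  # prints "19 2"
```

In this toy run, twenty single-byte updates cost only two sector erases, which shows how a usage procedure can provide byte-granular writes and reduce erase stress without changing the cell itself.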

When EEPROM functionality is included, it is assumed that write operations will be performed during user program and application operation. When this is the case, a variety of structural measures are required. These include avoiding interference with CPU operation, including multiple small EEPROMs, and making it possible to read and write at the same time. Technologies used to implement these measures include using a dedicated flash control circuit and using background operation and/or an RWW (read-while-write) module.

Conclusions

In this article we have discussed both the requirements for continuing microcontroller technology trends and the requirements for the next generation (from 2010 to 2015) with regard to the flash memory technologies embedded in flash MCUs. We consider the following to be the leading candidates for the next generation of flash modules.

1. Code flash: Modules which apply a pipelined structure to split-gate flash memory, which features low-voltage, high-speed read operation.

2. Data flash: The number of write cycles can be increased to over 100,000 by using two code flash cells for each bit.

3. Flash controller: Circuits that control functions other than the flash memory read operation can be included as well in order to avoid conflicts between requests from multiple CPUs.

Although we have not focused on this issue until now, high-speed flash memory has a significant advantage because it can be used to implement generic and cost-conscious products that operate at speeds up to 100 MHz without the use of a cache memory.

Although fast flash memory makes cache memories simpler, there is a trade-off between the difficulty of making flash memory faster and the difficulty of working around the problems inherent in cache memory. For the generations after the next generation, it will be necessary to search for flash memory technologies that can be effective at reducing overall microcontroller power consumption.
