
Network of Excellence on High Performance and Embedded Architecture and Compilation

info27

Welcome to ACACES’11, 10-16 July 2011, Fiuggi, Italy

appears quarterly | July 2011

www.HiPEAC.net

HiPEAC Autumn Computing Systems Week, November 2-4, 2011, Barcelona


Message from the HiPEAC Coordinator

Message from the Project Officer

HiPEAC Announce - Book on Low Power Networks-on-Chip - Multi-core Day 2011

HiPEAC Activity - HiPEAC Spring CSW in Chamonix - PRO Cluster/Applications Taskforce Joint Meeting - HiPEAC Booth at DATE 2011 - CGO Conference Report - Self-Aware Reconfigurable Computing Systems

HiPEAC News - HiPEAC Management Staff Change - Diego Caballero, Winner of the PUMPS 2010 “Nvidia Best Achievement Award” - Mateo Valero Honored at the Computer Society Awards Ceremony - A Paper of CAPS Obtains a Best Paper Award at IPDPS 2011 - High Performance Computing Wales - The Formic Board from Crete

In the Spotlight - FP7 S(o)OS Project

New member - IMC Trading B.V. - SYSGO - Tallinn University of Technology

HiPEAC Students

PhD News

Upcoming Events


Intro

Message from the HiPEAC Coordinator

Dear Friends,

On May 7, I was struck by a column in the Wall Street Journal: Long-Prized Tech Visas Lose Cachet. It discussed the evolution of the H-1B visa program, created in 1990 to bring more skilled foreign workers to the US. In the years 2001-2003, the number of H-1B visas peaked at 195,000 per year. Starting in 2004, it was brought back to the initial cap of 65,000 per year. Up until 2008, all 65,000 were applied for on the first day the call was launched. In 2009, there were only 45,000 applications during the first month; in 2010 only 16,500; and this year only 8,000. An increasing number of US-educated foreign workers prefer to return to their booming home countries, where their career prospects are far better, where there is a burgeoning work force, where the cost of living is lower, and where their friends and families live. Several US venture capitalists are following them. In 2010, a staggering 134,800 Chinese returned to China to start a new life, 25% more than the year before.

Combined with the fact that over 50% of all Silicon Valley start-ups are launched by immigrants, it is clear that this trend will hurt the US economy in the short to medium term. I do not have such numbers for Europe, but I am sure that sooner or later we will face the same challenges. It is clear that the new economies will compete not only for a bigger global market share, but also for talent. As the quality of life improves in these countries, the Western countries are gradually losing their attractiveness, and there is not much we can do about it. Therefore, I believe that it is very important for the Western countries to keep investing in the education of the local work force (especially in Science, Technology, Engineering, Math and entrepreneurship). Otherwise, it will be very hard to compete with the new economies in the longer term.

In April, we had our successful Chamonix Computing Systems Week, held in conjunction with the CGO conference. No fewer than 29 HiPEAC members took the opportunity to attend the two events, and to combine them with some skiing in almost subtropical temperatures. The attendees very much appreciated the technical program, the networking, the scenery and the French cuisine. The next Computing Systems Week will take place in Barcelona, in the first week of November, co-located with the Barcelona Multi-core Workshop.

In May, we passed the formal review of the third year of HiPEAC. All work packages are on schedule, and we expect to be able to successfully conclude HiPEAC2. In parallel, we are preparing HiPEAC3, which was positively evaluated in Call 7. The transition between HiPEAC2 and HiPEAC3 is planned for February 1, 2012 at the HiPEAC conference.

This newsletter issue is traditionally the summer school issue. This year the school takes place for the first time in Fiuggi, a historic town in the vicinity of Rome. As usual, this summer school marks the beginning of the summer break for me. I wish you a relaxing summer with your family and friends, and I hope to see you again after the summer holiday in good health, and full of exciting plans for the year to come.

Take care,
Koen De Bosschere

HiPEAC News

HiPEAC Management Staff Change

As of February 15th, Jeroen Borghs has joined the HiPEAC management team in Ghent. Jeroen holds a Master's degree in History. Before joining the team, he worked for the Belgian Representation of the United Nations High Commissioner for Refugees (UNHCR).

Jeroen will gradually take over the management responsibilities from Jeroen Ongenae. His contact address is: [email protected].


Message from the Project Officer

Panos [email protected]

We have recently finished the evaluation of the last Call for Proposals in Computing Systems. More than 60 proposals were received, and their quality was remarkably high: again, 3 out of 4 proposals were scored above the thresholds. The total grant requested by all the received proposals was above 170 M€. As a reminder, the available funding was 45 M€.

The majority of STREPs received were related to the “Parallel & Concurrent Computing” theme of Computing Systems. The proposals covered topics like higher-level programming tools hiding parallelism from the user; new approaches to scalability of high-performance computing application codes; programming of accelerators; and programming paradigms for large-scale data centres.

The “Customization” theme was also well covered by the received proposals, which ranged across computing systems domains including embedded computing and had a strong application focus. The topics covered include reconfigurable architectures, heterogeneous multi-cores on a single chip, toolchains for dependable and fault-tolerant systems, and system modelling and simulation. As a cross-cutting issue, a number of proposals covered energy efficiency and reducing power consumption on chip and beyond chip.

Mixing safety-critical with non-safety-critical tasks was the main issue addressed by proposals in the “Virtualisation” theme, while proposals in the “Architecture and Technology” theme addressed issues mainly related to design flows for emerging chip fabrication technologies.

Proposals covered a wide range of application domains such as aerospace, automotive, energy, communication and software industries. The latter includes high-performance and embedded systems, data centres and cloud systems as well as tool developers and providers; among these there was a high percentage of SMEs. Interest was also shown in life sciences and digital media. Proposals have demonstrated a good balance between academic partners and the various stakeholders in the value chain.

Over the next couple of months, our Cordis web site will have more details about the selected projects.

Panos Tsarchopoulos

HiPEAC Announce

Book on Low Power Networks-on-Chip

Cristina Silvano, Marcello Lajolo, Gianluca Palermo

In recent years, both Networks-on-Chip, as an architectural solution for high-speed interconnect, and power consumption, as a key design constraint, have continued to gain interest in the design and research communities, since power and energy issues still represent one of the limiting factors in integrating multi- and many-cores on a single chip. This book:

• Covers power and energy aware design techniques from several perspectives and abstraction levels and offers a single-source reference to some of the most important design techniques proposed in the context of low-power design for networks-on-chip architectures;

• Describes the most important design techniques that were invented, proposed, and applied to reduce both dynamic power and static power dissipation in networks-on-chip based architectures;

• Applies state-of-the-art, low-power design techniques to the design of Networks-on-Chip, to demonstrate methodology for design of high-speed, low-power interconnect;

• Offers a single source reference to the latest research, otherwise available only in disparate journals and conference proceedings.

Read more on http://www.springer.com/engineering/circuits+%26+systems/book/978-1-4419-6910-1


HiPEAC Activity

HiPEAC Spring CSW in Chamonix

HiPEAC organises two Computing Systems Weeks (CSW) a year: one in spring and one in autumn. The first CSW this year was held in Chamonix from April 6-8, 2011, effectively co-locating the CSW with the International Symposium on Code Generation and Optimization (CGO), which was held during April 2-6, 2011. I had attended several CSWs during both the first and the second installment of the HiPEAC network, so I pretty much knew what I could expect. Let me start by saying that I was not disappointed. As in other CSWs, there were two parts: the industrial workshop and the cluster meetings.

I found the industrial workshop to be the highlight of the week. It was organised on Wednesday afternoon, and we were entertained with seven informative talks. Krisztian Flautner opened quite strongly with an awesome talk on the vision that ARM has for the (near) future and would like to see implemented over the coming years. He talked about the move towards the Internet of Things and the shift in both functionality and requirements it was bringing about. The bottom line was that we are moving towards a new optimisation function: functionality over ($$$ x energy), together with a move towards having the microprocessor as the system. After that, Intel’s C.J. Newburn talked about programming models, compilers and tools. I thought the talk was pretty cool and interesting, and it strongly reminded me of several Data Parallel Haskell talks I had seen before. The third talk before the break was given by David August. He told us how we could harness the power of multi-core machines by branching off key parts of the program to multiple cores, thereby exploiting the potential parallelism that is present in the program. The conclusion was that we need better support from compilers to restore the original curve of yearly performance increase of computer systems. After the break (I will be brief) we heard three more interesting talks: (i) a discussion about compilation for mobile devices, (ii) how to go from research to an industrial project, and (iii) some insights into the differences between research and academia.

The following one and a half days were devoted to cluster meetings. I was especially interested in the Friday morning meetings on both compilation and virtualisation. My expectations were fulfilled, as both the talks and the discussions following them were quite interesting.

Allow me to say a few words about the venue. The CSW took place in the most amazing surroundings. With a view of Mont Blanc, we were lodged in a cosy town, with nice restaurants for some informal discussions after the main program. I really enjoyed my stay and had several conversations that were both enlightening and encouraging as far as research is concerned.

Andy Georges, Ghent University
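The optimisation function quoted from the ARM talk in the Chamonix CSW report, functionality over ($$$ x energy), can be read as a figure of merit for comparing designs. The sketch below only illustrates that ratio: the function name, the design names and all the numbers are hypothetical and do not come from the talk.

```python
def figure_of_merit(functionality, cost, energy):
    """Return functionality / (cost * energy); higher is better.

    The units are whatever the comparison uses consistently
    (hypothetical scores here, not measured data).
    """
    if cost <= 0 or energy <= 0:
        raise ValueError("cost and energy must be positive")
    return functionality / (cost * energy)


# Hypothetical candidate designs: (functionality score, cost, energy).
designs = {
    "small_core": (40, 2.0, 1.0),
    "big_core": (100, 10.0, 4.0),
}

# Score every design and pick the one with the highest ratio.
scores = {name: figure_of_merit(*d) for name, d in designs.items()}
best = max(scores, key=scores.get)
print(best, scores[best])
```

With these made-up numbers the smaller design wins, because its lower cost and energy outweigh its lower functionality score.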

HiPEAC Announce

Multi-core Day 2011

Reserve September 15 for Multi-core Day 2011, organized by the Swedish Multi-core Initiative. As in previous years, we have excellent keynote speakers. Mateo Valero from BSC/UPC, Arch Robison from Intel, Wen-mei Hwu from UIUC, and Charles Leiserson from MIT are all confirmed. The rest of the program will feature a mix of academic and industrial presentations. The event is free but registration is required: http://www.sics.se/node/7813

The first annual Cloud Day is organized back-to-back with Multi-core Day on September 14, and you might want to combine both in the same trip. This event is also free, but separate registration is required: http://www.sics.se/node/7812.

You are all welcome to Stockholm in September!

On behalf of the Swedish Multi-core Initiative.

Mats Brorsson, KTH and SICS

CSW: in a session and during the lunch break


HiPEAC Activity

PRO Cluster/Applications Taskforce Joint Meeting

Systematic Approaches for the Analysis/Optimization of Applications

On the afternoon of the second day of the CSW in Chamonix, the Programming Models and Operating Systems cluster and the Task Force on Applications had a joint session focused on tools and methodologies for systematically analysing applications and guiding their optimization. The aim of this session, chaired by Gabor Dozsa from BSC, was to present attendees with real examples of tools and methodologies developed by top European companies and academic institutions.

The following presentations took place during the session:

vfEmbedded (Vector Fabrics): Alexey Rodriguez presented the Dutch company’s analysis tool. This application, which can run in any web browser, shows the parallelization alternatives that are guaranteed to work, without introducing data races and while preserving the semantics of the program. It can also estimate the expected performance gain and outputs guidelines for program transformations.

ThreadSpotter (Rogue Wave): presented by Sebastian Grimonet, ThreadSpotter is a profiler that gathers data and provides specific guidance on performance issues by identifying them, estimating each issue’s importance (and rank ordering them), and guiding the developer to the location in the source code where the issues are located.

Paraver (BSC): the Barcelona Supercomputing Center sent Jose Carlos Sancho to present Paraver. This application is a powerful visualization and analysis tool for understanding the performance of parallel applications. It provides a qualitative global perception of the application behavior, which helps users to easily spot scaling issues. Moreover, it also provides detailed quantitative information on the application behavior in order to uncover the particular causes that are limiting the performance of applications.

Paralax (Ghent U.): Hans Vandierendonck from Ghent University presented Paralax. The tool, co-created by Hans and Sean Rul, aims to bring automatic parallelization to irregular and pointer-intensive programs. The main method is to extend the sequential programming model with annotations that help to expose parallelism in programs. Paralax contains a profiling tool to optimistically find large amounts of parallelism and combines this with an automatic parallelizing compiler that considers only conservatively correct parallelism.

Memory Analyser (Codeplay): Andrew Richards, Codeplay’s founder, presented the company’s approach to memory analysis. He demonstrated the ability of their compilers to insert memory instrumentation into users’ code and showed some results of that instrumentation on some sample graphics software.

The presentations made by our five speakers had an important “demo” component and they were very interactive. The session highlighted the Network of Excellence’s efforts to advance the state-of-the-art in methodologies and tools, both at the industry and academic level. We believe these topics are of fundamental importance to the future of European research, and as such we hope to organize similar events in the future.

The session was video recorded and it is available at the HiPEAC web seminars site: http://hipeac.ac.upc.edu/seminars/

Victor Garcia, Universitat Politècnica de Catalunya and Barcelona Supercomputing Center

HiPEAC Booth at DATE 2011

Following the very positive experience of the previous two years, HiPEAC organized a booth at the exhibition of the DATE11 conference, one of the flagship events in electronic design and automation, held annually in Europe. The booth was again set up jointly by the partners from Ghent and Aachen. For us, as the HiPEAC representatives, the main goal was to disseminate the major activities of the network. Especially now, at the end of the HiPEAC2 period, there are a lot of collaborative project results and excellent events that HiPEAC can proudly present to the scientific community. Another objective was to get to know (and possibly to recruit) new people. Even though 10 years of network building have brought together over 200 different institutions around Europe, the community is still dynamic and its growth is of primary importance.

The conference took place in Grenoble, right at the foot of the Alps. It attracted more than a thousand people from both academia and industry during the third week of March. The majority of attendees were from European universities, but Asia and America were also actively represented. Industrial representatives included large international companies (mainly EDA) and several start-ups from Europe. The conference itself featured 77 technical sessions on different aspects of electronic design: from low-level HW to system design, run-time support and programmability issues.

According to the statistics published on the DATE website a few days after the conference was over, the number of exhibition attendees increased by 40% this year compared to the previous year. Was it a matter of the central position of the exhibition area or of growing interest in industrial projects? Whatever the reason may be, a similar trend was also noticed at the HiPEAC booth.

For a widespread NoE like HiPEAC, it is usually difficult to meet people at European conferences who are not involved in its activities. This time we observed growing interest in HiPEAC from non-EU visitors. They pointed out the indisputable significance of such an association for steering collaborations between researchers. Many of them expressed their interest in participating in the major HiPEAC events: the conference or the summer school. Though our main goal was to introduce HiPEAC, we cordially welcomed the network members, enquiring about their recent achievements both research-wise and on the networking side. Overall the booth ran very successfully this year, with more than 50 people visiting us in three days. As a result, several of them have already been elected as HiPEAC members. You can find information on some of them already in this newsletter issue!

Anastasia Stulova & Sergey Yakoushkin, RWTH Aachen University

HiPEAC News

Diego Caballero, Winner of the PUMPS 2010 “Nvidia Best Achievement Award”

The Barcelona Computing Week 2010 co-directors, Mateo Valero (Barcelona Supercomputing Center, BSC) and Wen-mei Hwu (University of Illinois), are proud to announce the winner of the PUMPS 2010 “NVIDIA Best Achievement Award” Contest. PUMPS 2010 was sponsored by the Barcelona Supercomputing Center, HiPEAC, NVIDIA, University of Illinois and Universitat Politècnica de Catalunya. A jury composed of researchers from BSC/UPC, University of Illinois and NVIDIA decided after a very close vote that the winner of the 2010 contest is Diego Caballero, Master student at UPC and Research Student at Barcelona Supercomputing Center, in the Computer Sciences Department. Diego chose the “Coulombic Potential” challenge, took the initial CUDA code given to the participants, and optimized it to obtain an impressive 350x speed-up. Since last summer, BSC has been a 2010 CUDA Research Center, the first one in Spain. Mateo is the PI and Nacho Navarro is the managing director of the center.

The prize for the contest winner was a Fermi C2050 GPU kindly donated by NVIDIA. Diego received it at a ceremony from the hands of BSC’s director Mateo Valero, together with a certificate and the recognition of his peers. Soon after winning the contest, Diego joined the GPU research team at the Barcelona Supercomputing Center, and he is now actively doing research in accelerators and programming models for HPC.

Diego Caballero says that the PUMPS Summer School offered him the opportunity to greatly improve his skills and knowledge in this field. He also adds that he “found the course indispensable if you are really interested in GPGPUs because the lectures that Wen-mei Hwu and David Kirk offered us were of that kind of thing that you can not learn from books. I want to thank NVIDIA and Barcelona Supercomputing Center, especially to Nacho Navarro, for organizing an event of this quality in my country. I hope to attend the next one and it is certainly something to recommend for everybody”.

More information regarding the Barcelona Computing Week at http://bcw.ac.upc.edu

Diego Caballero receives the award from Mateo Valero

HiPEAC Activity

CGO Conference Report

This year’s CGO conference was for the first time located outside of North America, in the heart of the French Alps in Chamonix. 175 attendees from all over the world participated, setting a new attendance record for CGO. This may in fact not be too surprising given the idyllic location, the high-quality program and the co-location with the HiPEAC Computing Systems Week. The program committee, chaired by Carol Eidt (Microsoft) and Michael O’Boyle (University of Edinburgh), selected a total of 28 papers out of a record 105 submissions (27% acceptance rate).

The main conference had two exciting keynote talks. On the first day, Erik Altman (IBM) gave his view on why high-level languages have failed to reach the performance of low-level languages such as C, despite the additional information provided to the compiler. He presented two IBM projects that attempt to address these issues. The WAIT Performance Tool tries to identify performance bottlenecks in Java applications, while Liquid Metal is a new programming language and runtime that seeks to address the problem of programming heterogeneous platforms efficiently.

Xavier Leroy (INRIA) delivered the second keynote on formal verification for compilers. He presented CompCert, a state-of-the-art formally verified compiler, and talked about the challenge of verifying optimisations. Interestingly, despite decades of research in compilers, they are still not trusted for critical applications (e.g. aircraft control, medical equipment or nuclear power plants).

Among the numerous conference papers, the joint work of Ben Hardekopf (University of California, Santa Barbara) and Calvin Lin (University of Texas, Austin), “Flow-Sensitive Pointer Analysis for Millions of Lines of Code”, won the best paper award. The excellent talk of Joseph L. Greathouse (University of Michigan, Ann Arbor) won the best student presentation award with his work entitled “Highly Scalable Distributed Dataflow Analysis”. In addition to the main sessions, the conference also had a student poster session. A large crowd was attracted to the 15 posters (and the simultaneously held cocktail reception), making it one of the most popular events of the conference. Alexandra Jimborean (University of Strasbourg) won the best poster award for her work on “VMAD: a Virtual Machine for Advanced Dynamic Analysis of Programs”.

Finally, we went to Montenvers for the social event. After an approximately 20-minute cogwheel train ride, zigzagging through pine trees and tunnels carved in the rock, we arrived at 2000m and were welcomed with some nice fresh wine. After wandering around for an hour, admiring the Mer de Glace (a 7km long and 1km wide glacier) and the breathtaking Aiguille Verte (4122m high!), we all had dinner at the 100-year-old Grand Hotel. Tartiflette, a typical cheese dish of the region, was served and wine was flowing again. From that moment on, the night had a life of its own and I vaguely remember taking the last train down to Chamonix with all the others.

Christophe Dubach, The University of Edinburgh

During the poster session at CGO11


HiPEAC Activity

Self-Aware Reconfigurable Computing Systems for Energy Saving and Performance Enhancement

Imagine a revolutionary computing system that can observe its own execution and optimise its behaviour with respect to the external environment and to the demands of the user and the applications. Provide users with the possibility to specify their desired goals along with constraints in terms of energy budget, time, and computational accuracy. Design a computing chip that, according to a set of goals expressed by the user, performs better the longer it runs an application.

The full range of computing infrastructures, from embedded devices to personal computers to servers to supercomputers, would benefit from the adoption of this technology.

This research focuses on building a Self-Aware Reconfigurable Computing System to fulfil the vision above. This system is given a goal and a budget. It then finds the best way to accomplish the goal despite changes in both the available resources and the environment. A self-aware system has cognitive mechanisms in its trusted components to both observe and affect the execution. Since it is impossible to pre-configure all possible scenarios, these systems also implement learning and decision making engines, as described in one of our papers at ICAC2011, in a judicious combination of hardware and software (FPL2010, ICAC2010, AHS2010, AHS2011) to determine the appropriate actions based on given observations.

To achieve the vision just described, a Self-Aware Reconfigurable computing system is no longer viewed as a static bunch of hardware components with a passive set of applications running on top of an operating system, which properly coordinates the underlying architecture. It becomes an active system where the hardware, the applications and the operating system are seen as a single entity. This entity needs to adapt itself to guarantee that its goal(s) are achieved. Figure 1 presents the general overview of the hardware and software architecture components of our self-aware computing system.

This research is a continuation of the collaboration started in 2009 between MIT and Politecnico di Milano. Prof. Marco D. Santambrogio spent a postdoctoral period at MIT, funded also by a HiPEAC Collaboration Grant. Considering the results obtained during this first collaboration, the two groups decided to work together, strengthening their common research on self-aware reconfigurable computing systems. HiPEAC played a key role in supporting this collaboration. Thanks to the collaboration grant, several people of the two groups had the chance to spend from 2 weeks up to 3 months working with the other group. New results have been published and presented at FPL, AHS, ICAC and CHA’N’GE.

The CHA’N’GE (Computing in Heterogeneous, Autonomous ‘N’ Goal-oriented Environments) workshop deserves a special mention. It has been established thanks to HiPEAC and its “Self-Aware Reconfigurable Computing Systems for Energy Saving and Performance Enhancement” collaboration grant. Our goal was to build a community of researchers in systems and self-aware and autonomic techniques. Our hope is that establishing self-aware techniques for hardware, compilers, operating systems, and system software will help to cope with the skyrocketing complexity of modern computing platforms. Starting from the joint research of Politecnico di Milano and MIT, people at UIC and Harvard joined the community, broadening the group’s expertise. More information about the research, results, documents, and tools is available at the CHANGE group website: http://www.change-grp.org/

Marco D. Santambrogio, Politecnico di Milano

Figure 1: Overview of the proposed self-aware computing system.
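The behaviour described in this article, where the system is given a goal and keeps adjusting itself while it runs, can be pictured as an observe-decide-act loop. The following is a minimal sketch under simplifying assumptions, not the architecture of the project: `adapt`, `levels`, `observe` and `rate_model` are hypothetical names, the only resource is a core count, and the performance model is a made-up linear one. No learning engine is modelled.

```python
def adapt(goal, levels, observe, steps=10):
    """Observe-decide-act loop: return the resource level held after `steps` rounds.

    goal    -- target progress rate the application should reach
    levels  -- resource levels sorted from cheapest to most expensive
    observe -- function mapping a resource level to a measured rate
    """
    i = 0  # start at the cheapest level
    for _ in range(steps):
        rate = observe(levels[i])            # observe: progress at this level
        if rate < goal and i + 1 < len(levels):
            i += 1                           # act: goal missed, add resources
        elif i > 0 and observe(levels[i - 1]) >= goal:
            i -= 1                           # act: a cheaper level also meets the goal
    return levels[i]


def rate_model(cores):
    """Hypothetical performance model: 10 units of progress per core."""
    return 10 * cores


chosen = adapt(goal=35, levels=[1, 2, 4, 8], observe=rate_model)
print(chosen)
```

The loop settles on the cheapest level that meets the goal under the assumed model, which is the energy-saving side of the trade-off the article describes.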


HiPEAC News

Mateo Valero Honored at the Computer Society Awards Ceremony

The IEEE Computer Society honored the accomplishments of 15 distinguished technologists who have made major contributions to their fields during an awards dinner on Wednesday 25th May in Albuquerque, New Mexico. Mateo Valero, Director of the Barcelona Supercomputing Center (BSC) and full professor at the Computer Architecture Department at the Technical University of Catalonia (UPC), was one of those honored at this ceremony, as the recipient of the 2009 Harry H. Goode Award. The Goode Award recognizes achievement in the information processing field: either a single contribution of theory, design, or technique of outstanding significance, or the accumulation of important contributions on theory or practice over an extended period. Mateo Valero was recognized for his “seminal contributions to vector, out-of-order, multithreaded, and VLIW processor architectures”, according to the award organization.

Professor Valero, whose research interests are focused on high-performance architectures, is a former winner of the Eckert-Mauchly Award, which recognizes contributions to digital systems and computer architecture. He is also the recipient of three national Spanish awards: the King Jaime I award, Spain’s most prestigious award in recognition of outstanding basic research; the Julio Rey Pastor Award, which recognizes research on IT technologies; and the Leonardo Torres Quevedo Award, which recognizes research in engineering. He has been awarded honorary doctorates by the University of Chalmers, the University of Belgrade, the University of Las Palmas de Gran Canaria and the University of Zaragoza, both in Spain, and the University of Veracruz in Mexico. Valero became a founding member of the Royal Spanish Academy of Engineering in 1994, an academic correspondent for the Spanish Royal Academy of Science in 2005, and a member of the Royal Spanish Academy of Doctors and Academy of Europe the following year. He is a Fellow of the IEEE and ACM and an Intel Distinguished Research Fellow. In 1998, his hometown, Alfamén (Zaragoza), named him a Favourite Son and also named its public college after him.

A Paper of the CAPS Research Group from the University of Murcia Obtains a Best Paper Award at IPDPS 2011

Sorel Reisman, 2011 IEEE Computer Society President, left, congratulates Walid A. Najjar,

accepting the Harry H. Goode Memorial Award on behalf of his long-time colleague

Mateo Valero.

The paper entitled “GLocks: Efficient Support for Highly-Contended Locks in Many-Core CMPs”, authored by José L. Abellán, Juan Fernández and Manuel E. Acacio, has been granted the Best Paper Award in the Architectures Track by the committee of the 25th International Parallel and Distributed Processing Symposium (IPDPS 2011), held in Anchorage (AK), USA, from May 16 to May 20. The authors belong to the CAPS Research Group from the University of Murcia, Spain.

In this paper, they deal with the synchronization problem that arises when exploiting thread-level parallelism on many-core CMPs. In these architectures, lock acquisition through busy waiting on shared variables generates additional coherence activity, which interferes with applications. In addition, lock contention causes serialization, which results in performance degradation. The paper proposes and evaluates GLocks, a hardware-supported implementation for highly-contended locks in the context of many-core CMPs. GLocks use a token-based message-passing protocol over a dedicated network built on state-of-the-art technology. This approach skips the memory hierarchy to provide a non-intrusive, extremely efficient and fair lock implementation with negligible impact on energy consumption or die area. A comprehensive comparison against the most efficient shared-memory-based lock implementation, for a set of microbenchmarks and real applications, quantifies the benefits of GLocks.
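To give a feel for why a token-based scheme can be fair, the following is a toy software model only; it is not the GLocks protocol or its dedicated network, whose details are in the paper. It assumes a single token that circulates over the cores in a fixed ring order, and a core may enter its critical section only while it holds the token. The function name `token_ring_grants` and the request counts are made up for illustration.

```python
def token_ring_grants(num_cores, requests):
    """Toy model of token-based lock arbitration (an assumption, not GLocks itself).

    A single token visits the cores in ring order. A core with a pending
    request acquires the lock when the token arrives, runs its critical
    section, releases, and passes the token on.
    `requests` maps core id -> number of lock acquisitions it wants.
    Returns the order in which cores were granted the lock.
    """
    pending = dict(requests)
    grants = []
    holder = 0
    while any(pending.values()):
        if pending.get(holder, 0) > 0:
            grants.append(holder)          # acquire, critical section, release
            pending[holder] -= 1
        holder = (holder + 1) % num_cores  # pass the token to the next core
    return grants


# Cores 0 and 3 each want the lock twice, core 1 once, core 2 never.
order = token_ring_grants(4, {0: 2, 1: 1, 3: 2})
print(order)
```

In this model no core is granted the lock a second time while another core is still waiting for its first turn, and no shared variable is polled, which are the two properties the article attributes to the token-based approach.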


High Performance Computing Wales

In July 2010, the Welsh Assembly Government announced the funding of High Performance Computing Wales (HPC Wales), a £40m major infrastructure project to provide an advanced supercomputing facility in Wales. The project is funded from the following sources:

• £19m from ERDF and ESF European funds
• £10m from the UK’s Department for Business, Innovation and Skills (BIS)
• £4m from collaborating institutions
• £5m from the Welsh Government
• £2m private sector and research income

The £40m investment will cover infrastructure development, equipment, software research, management and operational costs over the first five years to 2015, after which HPC Wales will become self-supporting and sustainable.

HPC Wales consists of three main elements:

• World-class HPC capacity: the purchase of large-scale supercomputing technology to complement existing HPC facilities in Cardiff and Swansea, with high-speed links to satellite spokes in the five major research universities in Wales. The network will link to business innovation centres and research centres in Wales and globally.
• HPC Institute: this will deliver advanced research, focused on strategic partnerships in both the academic and private sectors, with priority given to research with direct economic impacts and benefits.
• HPC Academy: the sustainability of the Research Institute will depend upon the ability to develop technical research skills and a pipeline of talent, i.e. capability. The Academy will develop HPC skills training and will be open to researchers in Welsh SMEs and researchers in universities working collaboratively with businesses.

The main hubs for HPC Wales will be based in Cardiff and Swansea, linked to spokes at Aberystwyth, Bangor, Glamorgan, University of Wales Alliance Universities and the Technium business innovation centres around Wales. Fujitsu was named the successful bidder for HPC Wales in March 2011. The aim is for the facility to be fully operational before the end of 2011, eventually delivering 190 teraflops of performance.

HPC Wales is currently working closely with Fujitsu, as well as recruiting staff at the partner institutions. We would also warmly welcome future collaborative research activities with HiPEAC members and institutions. For further information, please contact Dr Tom Crick ([email protected]) or see: http://www.hpcwales.co.uk/.

The Formic Board from Crete, for Building FPGA Prototypes of Manycore Architectures

Field-programmable gate arrays (FPGAs) have long been used to prototype new and exciting architectures. However, commercially available FPGA boards provide very limited communication connectivity: just a few high-speed serial links at a price of around a thousand Euros. To get a few tens of links, one has to pay several thousand Euros and use bulky boards and bulky coaxial connectors.

To address this problem, we at FORTH-ICS in Crete, Greece, designed our own low-cost, high-connectivity FPGA board, which we call “Formic”. Alpha-stage testing, performed in May, showed that the first version of the Formic design is fully functional and correct.

As shown in the photograph, Formic is a small board, just 10 cm on each side. It has eight high-speed serial links available for external connections; each of them delivers 2.5 Gbits/s per direction, through convenient and inexpensive SATA connectors and cables. At the center of the board, under a passive cooler, is a large but low-cost Xilinx Spartan-6 LX150T FPGA. Around it, there are one DRAM chip (128 MBytes, DDR2, 400 MHz) and three SRAM chips (1 MByte, ZBT, 167 MHz each). The board also contains power converters/regulators, crystal oscillators, a Xilinx Flash memory for FPGA configuration, a JTAG chain, and an RS-232 port for debugging. At peak memory and serial link activity, the board consumes 8 Watts. The PCB has 10 layers of half-ounce copper, separated by FR-4; the smallest holes are 0.3 mm in diameter, and the smallest tracks are 5 mils (0.12 mm) wide. It was manufactured by Pan Technical and assembled by Prisma, two Greek companies. The cost is below one thousand Euros per board.

We configure the FPGA on each board to contain eight processors with their private L1 caches. The tags of the private L2 caches are also in the FPGA, while their data are in the SRAM chips. The FPGA also contains per-cache DMA/prefetch engines, the 5-channel DRAM controller, and the 8 link interfaces, all connected via a 22-port crossbar. The hardware provides no coherence among the caches; the runtime software will provide it at task boundaries.

Our next step, in the Computer Architecture and VLSI Systems (CARV) Laboratory of FORTH-ICS, will be to have 64 Formic boards built for us, to interconnect them in a 3D mesh using their serial links, and to connect them to the eight ARM Cortex-A9 processors in two ARM chips. The resulting heterogeneous many-core prototype will contain 520 processors: 512 “worker” cores, which will be executing application tasks, and 8 “control & scheduler” cores, executing the runtime system.
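The core and link counts above can be checked with a few lines of arithmetic. The sketch below assumes the 64 boards are arranged as a 4 × 4 × 4 mesh; the article only says "3D mesh", so the dimensions in `MESH_DIMS` are an illustrative assumption, as are all identifier names.

```python
from itertools import product

# Figures taken from the article.
BOARDS = 64
WORKER_CORES_PER_BOARD = 8
CONTROL_CORES = 8          # eight Cortex-A9 cores in two ARM chips
LINKS_PER_BOARD = 8        # SATA serial links per Formic board
LINK_GBITS = 2.5           # per link, per direction

# Assumption: the article does not give the mesh shape; 4 x 4 x 4 = 64 boards.
MESH_DIMS = (4, 4, 4)


def neighbors(node, dims):
    """Return the mesh neighbours of `node` (no wrap-around links)."""
    result = []
    for axis in range(len(dims)):
        for step in (-1, 1):
            coord = list(node)
            coord[axis] += step
            if 0 <= coord[axis] < dims[axis]:
                result.append(tuple(coord))
    return result


nodes = list(product(*(range(d) for d in MESH_DIMS)))
worker_cores = BOARDS * WORKER_CORES_PER_BOARD
total_cores = worker_cores + CONTROL_CORES
max_mesh_links = max(len(neighbors(n, MESH_DIMS)) for n in nodes)
mesh_cables = sum(len(neighbors(n, MESH_DIMS)) for n in nodes) // 2
board_bandwidth = LINKS_PER_BOARD * LINK_GBITS  # Gbit/s per direction

print(total_cores, max_mesh_links, mesh_cables, board_bandwidth)
```

Under this assumed shape, an interior board uses six of its eight links for the mesh, so the eight-link budget is sufficient for a 3D mesh without wrap-around.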

The first use of this 520-processor prototype will be to execute applications written in the OMP-SS programming model, where the programmer identifies tasks and all their I/O arguments, while the runtime software undertakes their parallelization. This work is performed in the context of the ENCORE project (http://www.encore-project.eu), funded by the European Union (FP7 STREP). In addition to this funding, ARM has provided the Cortex-A9 development systems, and Xilinx has donated the 64 FPGA chips. For more information about our research in Scalable Multicore Systems, please visit http://archvlsi.ics.forth.gr/ipc

Spyros Lyberis, FORTH-ICS, Heraklion

In the Spotlight

FP7 S(o)OS Project

The S(o)OS European Project (Service-oriented Operating Systems) aims to design a novel software stack that is suitable for future and emerging massively parallel, distributed and heterogeneous systems. The attention is strongly focused on the architecture of the Operating System and its kernel, and on the interactions with the applications through suitable novel programming paradigms. These will allow the majority of programmers to take advantage of the computing power available on new-generation hardware.

The Operating System needs to do a better job of mediating the interaction between applications and the hardware, by providing a set of abstractions that account for the new, massively parallel availability of computing power and resources. Furthermore, the OS, and particularly its kernel, needs to get rid of the scalability problems that arise when a traditional OS tries to manage too many resources. Indeed, one widely recognized issue is that the Operating Systems mainly used for general-purpose computing nowadays have serious bottlenecks when trying to scale to platforms with a high number of cores. This is already evident with servers readily available today with 16, 32 and up to 48 cores.

New kernels for future machines need to be highly modular and heterogeneous, and they need to provide non-centralised, heavily distributed services to applications. To this end, their own internal architecture cannot contain centralised elements that risk becoming serious bottlenecks when scaling to thousands of cores; it needs to be heavily distributed as well. One more problem is the heterogeneity of the hardware platforms. These cannot be easily exploited by the upper layers of the software stack unless a big effort is made to rewrite major parts of the OS, middleware and applications.

The approach being followed in the S(o)OS project comprises a set of elements which can be summarised as follows:

• proper hardware description languages and simulation tools are being developed in order to experiment with a variety of possible hardware platforms;

• investigations are being done on the type of communication models and protocols that may effectively be used for the data distribution and communications among the unprecedented number of computing elements;

• proper code and data distribution and scheduling mechanisms are being designed, so as to achieve an efficient utilisation of the available resources, while meeting timing/QoS requirements of the applications, when present;

• novel programming paradigms are being designed so as to achieve scalability and efficiency of the software, and take advantage of the novel OS design.

Consortium:

• High-Performance Computing Center of Stuttgart, DE (Coordinator)
• Scuola Superiore Sant’Anna, IT
• Ecole Polytechnique Fédérale de Lausanne, CH
• Instituto de Telecomunicações, PT
• University of Twente, NL

Website: http://www.soos-project.eu/
Duration: February 2010 – January 2013


New Member

IMC Trading B.V.

Lodewijk Bonebakker is Head of Technology R&D at IMC Trading B.V., located in Amsterdam, the Netherlands. He obtained his master’s degree in Astronomy at Leiden University in 1994, and his doctorate in Systems Engineering at Delft University in 2007.

Between 1995 and 2009, Lodewijk worked for Sun Microsystems Inc. After having worked in technical sales and consultancy, he joined the Computer Architecture and Performance group at Sun Microsystems Laboratories in 2002, where he worked on workload characterization and analysis of current and future computer systems. In June 2009 he joined IMC Financial Markets as Head of Technology R&D. Lodewijk’s main topic of interest is the interaction between information technology and business processes. He is specifically interested in researching how HiPEAC’s rich portfolio of technologies can impact IMC’s business, and how he can stimulate HiPEAC research to help solve IMC’s issues.

IMC Financial Markets

IMC Financial Markets is one of the world’s leading proprietary trading firms and a key market maker in various products listed on exchanges throughout the world. All of IMC’s trading strategies are proprietary, have a short-term focus and are considered ‘low latency trading’. IMC’s continued success is based on a unique combination of talent and technology, with interaction between specialists such as traders, quants, IT specialists, and business analysts. This enables IMC to continuously improve existing strategies, infrastructure, algorithms and software, and to create the new trading strategies and platforms necessary to stay ahead of the competition. All of the most important trading systems and software are developed in-house. Staying ahead of the curve in the development and adoption of technology is not only an IMC mission, but is also very much ingrained in the company culture.

Contact information:
Lodewijk [email protected]

New Member

SYSGO

SYSGO is an international company providing operating system technology, middleware, and software services for the real-time and embedded market. SYSGO has facilities in Germany, France, the Czech Republic and North America, and offers a global distribution and support network, including Europe and the Pacific Rim.

A long track record of expertise

SYSGO specialises in the design, implementation and configuration of device software for embedded systems. Since its foundation in 1991, the company has focused on real-time operating systems for use in embedded devices. It has ported operating systems to various architectures and platforms, provided professional services and consultancy for these systems, and acted as a distributor of operating systems with a strong technical background. The company has successfully completed a number of projects to certify operating system software for use in safety-critical applications in civil aircraft according to standard DO-178B, Level A.

During its first ten years the company also developed expertise in embedded Linux, which resulted in the introduction in 1999 of the first embedded Linux distribution especially designed and packaged for industrial needs. This distribution, called ELinOS, was the first product of its kind not only in Europe but in the world.

In parallel, in 1998, SYSGO began to develop its own operating system approach. After having evaluated the concept based on the L4 microkernel, SYSGO realized that the concept couldn’t support the highest levels of safety and security requirements that SYSGO’s customers were asking for. The initial implementation has therefore gradually evolved over several years of practical application to the real-time, embedded space. The result of this evolution is the PikeOS microkernel, which today is part of SYSGO’s product portfolio. Its target markets are aerospace/defence, industrial automation, automotive, transportation, consumer electronics and network infrastructure. PikeOS enables multiple operating system interfaces to work on separate sets of resources within a single machine. Because of the resource separation enforced by the PikeOS microkernel, multiple applications with different safety and security requirements are able to co-exist in a single machine. Thus, PikeOS can be regarded as a MILS separation kernel. That’s why another important market SYSGO is addressing is the security market. Currently, PikeOS can host about ten different operating system APIs. Among them are Linux (ELinOS is a natural choice), POSIX, certified POSIX, OSEK, two different Java virtual machines, Ada and several popular RTOSes such as ANDROID, RTEMS or iTRON. PikeOS is certifiable to safety standards like DO-178B, IEC 61508, EN 50128, or ISO 26262, is MILS compliant, and is currently involved in various security standard CC EAL certification projects.

Research to sustain innovation

SYSGO is currently participating in the FP7 European project SCARLETT, which gathers 40 major players in the aerospace industry and works on defining the next generation of IMA (Integrated Modular Avionics). In addition to SCARLETT, SYSGO has been or still is participating in three other FP7 European funded projects (INTERESTED for safety-critical development tools integration, TECOM for trusted embedded computing, and JEOPARD for multi-core support) and one project, Verisoft XT, funded by the German Federal Ministry of Education and Research (BMBF). All of them aim to provide an embedded and real-time virtualization technology that satisfies the most stringent industry requirements in terms of safety, security and multi-core support.

More European projects are currently starting, among them two ARTEMIS projects that will require an AUTOSAR personality: ACROSS and RECOMP.

Contact information:
Dr. Sergey [email protected]

New Member

Dependable Embedded Systems Research at Tallinn University of Technology

Tallinn University of Technology (TTÜ), founded in 1918, is the oldest and biggest university in Tallinn and the second largest in enrolment in Estonia. TTÜ has over 14,000 students and more than 1200 academic staff members, and offers a wide variety of educational degree programmes, which include both traditional and new fields of modern technology, economics and business. Embedded systems research is concentrated in the Department of Computer Engineering (DCE), which has more than 30 employees, including five professors and 10 senior staff members. DCE is one of the strongest public sector entities in microelectronics and computer systems related RTD in Estonia. The department has a very well established international network of co-operation partners, which includes both major European research institutes and industrial enterprises. During the last 15 years, the department has participated in 11 EU-funded research projects. Among other projects, DCE is currently coordinator of the FP7 projects REGPOT-2009-229813 CREDES - Centre of Research Excellence in Dependable Embedded Systems and FP7-2009-IST-4-248613 DIAMOND - Diagnosis, Error Modelling and Correction for Reliable Systems Design. DCE is also the co-ordinating partner of the national Centre of Research Excellence CEBE (Centre for Integrated Electronic Systems and Biomedical Engineering). On a national level, DCE scientists have also attracted several Estonian Science Foundation grants, Estonian Research Council projects and industry-oriented projects.

The main areas of DCE’s research are:

• Mission critical embedded systems (dependability, fault tolerance, test, diagnosis)
• Networks-on-Chip (NoC) modelling, analysis and test
• Systems-on-Chip and MPSoCs
• CAD software
• Applications towards biomedical engineering (sensor networks, data analysis and interpretation, monitoring, diagnosis)
• Complex electronic boards: test strategy development, Boundary Scan test development, self-test programs

During the last 5 years, strong emphasis has been placed on the development of new CAD methods for the design and test of network-on-chip based systems. Recent developments include a system-level design environment for hard real-time NoC-based systems, novel test generation and diagnosis methods for complex digital circuits, and novel architectures for processing biomedical signals.

More details about the department are available at http://ati.ttu.ee/english

Gert Jervan

Dr. Gert Jervan is a senior research fellow at the DCE. He joined TTÜ in 2005 after receiving his Ph.D. degree from Linköping University, Sweden. His main research interests include fault tolerance, reliability and dependability, system-level design, and design optimization issues for novel submicron architectures.

He belongs to the management committee of the Estonian centre of excellence in research CEBE (Centre for Integrated Electronic Systems and Biomedical Engineering) and is a coordinator of the EU FP7 REGPOT project CREDES (Centre of Research Excellence in Dependable Embedded Systems). Since 2006 he has been a member of the Estonian Science Foundation’s Expert Commission for Physical Sciences and Engineering.

He was a special sessions chair and one of the main organizers of the 2010 Diagnostic Services in Network-on-Chips workshop (co-located with DAC 2010). He recently served as a vice program chair of Norchip 2008 and general chair of the 19th EAEEIE Annual Conference, and was one of the organizers of the DATE 2008 Friday Workshop Impact of Process Variability on Design and Test. In 2011 he is the organizer of two special sessions: Power and Thermal Issues in 3D ICs at the IEEE International Workshop on Impact of Low Power Design on Test and Reliability (Trondheim, Norway) and Dependability in Multiprocessor and Reconfigurable SoCs at the 6th International Workshop on Reconfigurable Communication Centric Systems-on-Chip (Montpellier, France).

He belongs to the program committees of several international conferences, has served as a guest editor of two journals, and has published over 60 peer-reviewed papers and three book chapters.

Contact information:
E-mail: [email protected]: http://www.ati.ttu.ee/~gerje

Thomas Hollstein

Prof. Dr.-Ing. Thomas Hollstein is Professor (Chair of Dependable Embedded Systems) in the Department of Computer Engineering at Tallinn University of Technology (TTÜ). He graduated from Darmstadt University of Technology in Electrical Engineering / Computer Engineering in 1991. In 1992 he joined the research group of the “Microelectronic Systems Lab” at Darmstadt University of Technology, where he worked in several research projects on neural and fuzzy computing and industrial VHDL-based design. From 1995 he focused his research on hardware/software codesign, and in 2000 he received his Ph.D. on “Design and interactive Hardware/Software Partitioning of complex heterogeneous Systems” at Darmstadt University of Technology.

From 2000 to 2010 he worked as a senior researcher, leading a research group focusing on System-on-Chip communication architectures and integrated SoC test and debug methodologies. In 2005 he initiated a new research initiative in the field of printed electronics at TU Darmstadt, and he has led a research group for printed electronics and RFIDs since then. This initiative led to a joint university/industry research lab, the TUD/Merck lab.

Since 2005 he has been responsible for the ITG/VDE competence initiative “Fokusprojekt RFID”, and he is the initiator and main person responsible for the European Workshop “RFID Systems and Technologies” (RFID-SysTech).

In Darmstadt he gave a large number of lectures on VLSI design and CAD methods. Based on this teaching experience, and together with a colleague of the TU Darmstadt’s “Institute of Printing Science and Technology”, he set up a new lecture, “Printed Electronics”, which received the “2010 FLEXI Award for Leadership in Education” of the FlexTech Alliance.

From 2001 until 2010 he was a member of the leadership team initiating and establishing a new international master programme in “Information & Communication Engineering” at Darmstadt University of Technology.

Since September 2010 Thomas Hollstein has been a full professor at Tallinn University of Technology (Estonia) in the field of “Dependable Embedded Systems”. His established skills and research interests, which form the basis for his contributions to HiPEAC, are in the following fields:

• Dependable Embedded Systems Design Techniques
• System-on-Chip Design
• Networks-on-Chip
• MPSoCs (Programming Models, APIs)
• Reconfigurable Systems
• In-System Dependability Management, Debugging and Test
• Test of Circuits with parameter variations (Printed Electronics, Nanoelectronics)

Thomas Hollstein has published over 60 peer-reviewed papers and is on the programme committees of several international conferences and workshops.

Contact information:
E-Mail: [email protected]: http://www.pld.ttu.ee/~thomas/


HiPEAC Students

Collaboration Grant Report - Onur Derin

I am a PhD student at ALaRI, University of Lugano. I work within the EU FP7 project called MADNESS: Methods for predictAble Design of heterogeneous Embedded Systems with adaptivity and reliability Support (http://www.madnessproject.org). Thanks to the HiPEAC collaboration grant that I received last winter, I worked with Emanuele Cannella and Todor Stefanov at LIACS, University of Leiden, on the implementation of a task-aware middleware for fault-tolerant and adaptive KPN applications on an FPGA-based NoC platform.

The complexity of multiprocessor systems on chip (MPSoCs) is rapidly increasing, driven by technology improvements and the adoption of more and more complex applications in consumer electronics. Programming such complex systems at a low level of abstraction is extremely difficult and error-prone. A promising way to raise the level of abstraction is to use models of computation (MoCs) to specify applications. Among these MoCs, Kahn Process Networks (KPNs) have been widely studied and used for streaming/multimedia applications. A favorable feature of the KPN MoC is that its simple operational semantics allows for the easy adoption of system adaptivity mechanisms such as run-time resource management. System adaptivity is becoming increasingly important in the MPSoC domain for several reasons, such as dynamic variation of quality of service requirements, fault tolerance, or power efficiency.

Networks-on-Chip (NoCs) are emerging communication infrastructures for MPSoCs that, among many other advantages, allow for system adaptivity. However, there is a mismatch between the generic structure of the NoCs and the semantics of the KPN MoC. Therefore, we investigated and proposed several approaches to overcome this mismatch. All of the proposed approaches consider system adaptivity as a driving objective and do not require specific hardware support from the platform.

We proposed and implemented three middleware approaches, namely virtual connector, virtual connector with variable rate, and request-based, to execute Kahn Process Networks on Network-on-Chip architectures. The key differences between these approaches are the rate of acknowledgement of empty slots in the remote queues and the buffering requirement on the sender side. Experimental results on two applications (Sobel and MJPEG encoder) with very different computation and communication characteristics showed that the virtual connector approach outperforms the others in terms of overhead in the total execution time when implementing communication-dominant applications. However, especially for this kind of application, the price we pay for system adaptivity and generality is large in terms of performance, compared to customized point-to-point systems. In contrast, when the computation/communication ratio of an application is higher, as in the second case study, the overhead introduced by execution on the NoC is much lower with all the proposed middlewares. The overhead in terms of the amount of data transferred in the network is lowest for the request-based approach for both applications.

Onur Derin, ALaRI, University of Lugano
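The report names the acknowledgement rate for empty slots in the remote queue as one of the key differences between the middleware approaches. The sketch below is a generic credit-based flow-control model of a single producer-to-consumer KPN channel, written only to illustrate that trade-off; it is not the virtual connector or request-based middleware itself, and the function name `run_channel` and its parameters are hypothetical.

```python
from collections import deque


def run_channel(tokens, queue_size, ack_every):
    """Simulate one KPN channel with a bounded remote FIFO (illustrative model).

    The sender holds one credit per free slot in the remote FIFO and may only
    send while it has credits (a blocking write). The receiver returns credits
    in batches of `ack_every` freed slots. Returns the tokens in arrival order
    and the number of acknowledgement messages sent back.
    """
    assert 1 <= ack_every <= queue_size
    to_send = deque(tokens)
    fifo = deque()            # remote queue at the consumer
    credits = queue_size      # sender's view of free remote slots
    freed = 0                 # slots emptied but not yet acknowledged
    acks = 0
    received = []
    while to_send or fifo:
        # Producer: send as long as credits remain.
        while to_send and credits > 0:
            fifo.append(to_send.popleft())
            credits -= 1
        # Consumer: blocking read of one token, in FIFO order.
        if fifo:
            received.append(fifo.popleft())
            freed += 1
        # Acknowledge freed slots once a full batch has accumulated.
        if freed >= ack_every:
            credits += freed
            freed = 0
            acks += 1
    return received, acks


for ack_every in (1, 2, 4):
    received, acks = run_channel(range(12), queue_size=4, ack_every=ack_every)
    print(ack_every, acks, received == list(range(12)))
```

In this model, acknowledging less often reduces the number of control messages on the network, while the sender stalls for longer before it learns about free slots. Token order is preserved in every case, as the KPN semantics require.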

PhD News

Synthesis and Exploration of Loop Accelerators for Systems-on-a-Chip

By Hritam Dutta ([email protected])Advisor: Prof. Jürgen TeichUniversity of Erlangen-Nuremberg, GermanyMarch 2011

Compelling next generation stream-ing applications, containing several computationally intensive nested loop programs, are the driving force for System-on-chip (SoC) architectures. SoC platforms are augmented with accelerators, which implement com-putationally intensive loop programs

of a given application with higher per-formance, lower power, and reduced cost due to specialized execution. The effort for programming and synthe-sizing loop accelerators for stream-ing applications is still enormous due to its low level nature. Therefore, a design methodology is presented in this thesis, which enables automated synthesis and exploration of hardware accelerators, described in a high-lev-el language. Central to our design flow is a sophisticated transformation, called hierarchical tiling, which assists

the designer in matching or specify-ing the degree of parallelism (number of PEs) and local memory, as well as requisite communication bandwidth of the accelerator architecture. The back end of our design flow automatically generates an RTL description of the accelerator PE datapath, the controller, and a memory interface, corresponding to the selected tiling and scheduling strategy. Communication synthesis for the data transfer and the synchroniza-tion between loop accelerators has been a major challenge. The complex-

15

Page 16: info27 · PDF file4419-6910-1 n Book on Low Power Networks-on-Chip HiPEAC Announce Cristina Silvano, Marcello Lajolo, Gianluca Palermo 3. info27 HiPEAC offers two Computer Systems

info27

By Jianjiang Ceng ([email protected])Advisor: Prof. Rainer LeupersRWTH Aachen University, GermanyApril 2011

Over the past few decades, our life-styles have been greatly changed by the various new electronic devices, which evolve at an incredible speed and continuously appear in the mar-

ket. For example, the first generation mobile phones emerged in the late 1970s were big and bulky; people could use them to just make calls. Today, their latest offspring, smart-

A Methodology for Efficient Multiprocessor System-on-Chip Software Development

Dependability-Aware System-Level Design for Embedded Systems

Universal Processor Architecture for Biomedical Implants: The SiMS Project

By Michael Glaß ([email protected])Advisor: Prof. Jürgen TeichUniversity of Erlangen-Nuremberg, GermanyMarch 2011

Embedded systems have become an integral part of our everyday life. Their constantly growing complexity requires for automatic design method-ologies that assist the system engineer

during the design process. In recent years, it could be observed that the ever-shrinking device structures are one of the main reasons for a grow-ing inherent unreliability of embedded system components. To cope with this problem, this work proposes a novel all-encompassing dependability-aware system-level design approach, tailored to the design of embedded systems. The main idea proposed here

features the introduction of depend-ability as a principal design objective into system-level design space explo-ration. At this juncture, this work pro-poses novel concepts for modeling, analysis, and optimization to enable the optimization of dependability together with several other design objectives to obtain dependable high-quality system implementations.

By Christos Strydis ([email protected])Advisors: Georgi Gaydadjiev, Stamatis VassiliadisTU Delft, The NetherlandsMarch 2011

Healthcare in the 21st century is changing rapidly. In Western countries, in particular, healthcare is moving from a public to a more personalized approach. However, the costs of healthcare worldwide are rising every year. Technology can and should be put to better use to get these costs under control. At the same time, implants have clearly benefitted from the astounding technology-miniaturization trends of late, boasting smaller sizes, lower power consumption and increased performance of the transistor devices. However, such advances do not come for free. Adverse effects are being witnessed in current implant designs, such as increasing power consumption, absence of design for reliability and a highly application-specific nature. Operating under the assumption that implants will constitute an important means towards improved, personal healthcare and, in view of the aforementioned design phenomena, we believe that a new paradigm in implant design is required. This dissertation establishes the concept of Smart implantable Medical Systems (SiMS). SiMS is a systematic approach (a framework) for providing biomedical researchers and, hopefully, industry with a toolbox of ready-to-use, highly reliable implant sub-systems and models in order to rapidly construct optimal implants for various medical applications. The SiMS framework has to guarantee essential attributes, such as high dependability, modular design, ultra-low power consumption and miniature size. Having defined the SiMS framework, this dissertation then explores the optimal microarchitectural details of a crucial SiMS component: the SiMS processor. Contrary to the current state of the art, this processor aspires to be a new universal, low-power and low-cost processor, capable of efficiently serving a wide range of diverse (current and future) implant applications.

ity of the problem arises from the fact that an optimal memory mapping and address generation in a communication subsystem for parallel data access and out-of-order communication depends on the dependencies between loops and allocation/scheduling choices. An architecture template, a methodology

based on the polyhedral model, is proposed for generation of the communication primitive. The selection of an optimal architecture can be daunting due to a plethora of architecture and compiler design decisions. Therefore, a method using modern search heuristics based on evolutionary algorithms and estimation of objectives is proposed, to identify Pareto-optimal designs in terms of area cost, power consumption, and performance. Several benchmarks from the Berkeley dwarfs demonstrate the validity and promise of the design methodology for accelerator generation based on the polyhedral model.
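The Pareto-optimality criterion mentioned above can be illustrated with a small filter. This is a minimal sketch and not the thesis's search heuristic: it assumes every design is a tuple of three objectives that are all minimized, with performance expressed as latency, and the names `Design`, `dominates` and `pareto_front` are hypothetical.

```python
from typing import List, Tuple

# Assumption: (area, power, latency), all to be minimized.
Design = Tuple[float, float, float]


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep only the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


if __name__ == "__main__":
    candidates = [(10.0, 5.0, 3.0), (12.0, 4.0, 3.0), (11.0, 6.0, 4.0), (9.0, 7.0, 2.5)]
    # (11.0, 6.0, 4.0) is dominated by (10.0, 5.0, 3.0) and drops out.
    print(pareto_front(candidates))
```

An evolutionary search would typically apply a filter of this kind to each generation of candidates; the quadratic comparison is adequate for small populations.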


phones, are not only small and light but also feature-rich. Functionalities supported by smartphones include, but are not limited to, telephony, text messaging, video playback, gaming, Internet surfing, and navigation. Inside smartphones and many similar entertainment/infotainment devices, Multiprocessor Systems-on-Chip (MPSoCs) are often found. By integrating multiple processing elements, each optimized for a specific application or application domain, MPSoCs strike a good balance between performance, energy efficiency and manufacturing cost. Therefore, the MPSoC has become the standard solution for designing devices that demand high computational power and low energy consumption. However, despite the success of MPSoC hardware, MPSoC software developers have a hard time programming it. Because they use multiple, often heterogeneous processors, MPSoCs pose many more challenges for their programmers than a traditional single processor does.

MAPS (MPSoC Application Programming Studio) is a software development environment that provides developers with an efficient MPSoC programming solution. The major contribution of MAPS is its support for MPSoC programming with a systematic development process. To achieve this, MAPS features a set of tools, each of which takes care of part of the whole process. This thesis focuses on the design and implementation of the MAPS profiling tool, the MAPS partitioning tool and the MAPS high-level simulation tool. The MAPS profiling tool helps the MPSoC programmer to better understand the target application and gathers runtime information for further analysis; the MAPS partitioning tool helps the developer to split sequential applications into tasks for parallel execution; and the MAPS high-level simulation tool provides an early test environment in which the developer can perform functional simulation of MPSoC applications and estimate their performance trend without the need for a hardware prototype or an instruction-set-simulation based virtual platform. The efficiency of the MAPS framework is shown through its usage on multiple MPSoC platforms.

Energy-aware Scheduling for Multiprocessor Real-time Systems

By Muhammad Khurram Bhatti ([email protected])
Advisors: Prof. Cécile Belleudy, Prof. Michel Auguin
University of Nice-Sophia Antipolis-CNRS, Nice, France
April 2011

Nowadays, real-time applications have become sophisticated and complex in their behavior. At the same time, the emergence of multi-core technology has brought a paradigm shift for the research on real-time embedded systems. As the computational demands of real-time applications continue to grow, energy optimization techniques will become increasingly important. Thus, among the many challenges faced by the real-time research community, power and energy management has become a first-class design consideration. Our dissertation addresses this problem in multiprocessor (soft and hard) real-time systems from the scheduling perspective. Our first contribution, called the Assertive DPM (AsDPM) technique, serves as an admission control technique for real-time tasks that decides when exactly a ready task shall execute, thereby minimizing the number of active processors at every scheduling event. AsDPM accumulates idle time intervals on certain processors within the system, which are then transitioned into sleep states to reduce energy consumption. AsDPM does not make predictions on the length of idle intervals. Our second contribution, called the Hybrid Power Management (HyPowMan) technique, is based on an online and adaptive interplay of DPM and DVFS techniques. Earlier research reports that either of the DPM and DVFS techniques can outperform the other whenever there is a change in target application, architecture configuration, or scheduling algorithm. Thus, no single policy fits perfectly in most operating conditions. Instead of designing new energy management policies to target specific operating conditions, HyPowMan takes a set of well-known existing policies, each of which performs well for a given set of conditions, and proposes a machine-learning mechanism to adapt at runtime to the best-performing policy for a given workload. Our simulations with an H.264 video decoder application show that AsDPM can achieve energy savings within 16% of the ideal DPM solution and that HyPowMan always converges to the best-performing policy within a policy set composed of four policies, thereby achieving energy savings between 18% and 47%.
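The idea of adapting at runtime to the best-performing policy in a fixed set can be sketched with a standard multiplicative-weights (Hedge-style) selector. The abstract does not name the learning algorithm that HyPowMan uses, so this technique is swapped in purely for illustration; the class `PolicySelector`, the policy names and the normalized loss values are all hypothetical.

```python
import math
from typing import Dict, Iterable


class PolicySelector:
    """Multiplicative-weights selector over named policies (illustrative only)."""

    def __init__(self, policies: Iterable[str], learning_rate: float = 0.5) -> None:
        self.weights: Dict[str, float] = {name: 1.0 for name in policies}
        self.learning_rate = learning_rate

    def update(self, losses: Dict[str, float]) -> None:
        """Penalize each policy by its loss for the last interval.

        Assumption: losses[name] is a normalized energy cost in [0, 1].
        """
        for name, loss in losses.items():
            self.weights[name] *= math.exp(-self.learning_rate * loss)

    def best(self) -> str:
        """Return the policy that currently holds the largest weight."""
        return max(self.weights, key=self.weights.get)


if __name__ == "__main__":
    selector = PolicySelector(["dpm_a", "dpm_b", "dvfs_a", "dvfs_b"])
    for _ in range(20):
        # Hypothetical per-interval losses; dvfs_a is consistently cheapest.
        selector.update({"dpm_a": 0.6, "dpm_b": 0.8, "dvfs_a": 0.3, "dvfs_b": 0.5})
    print(selector.best())
```

Under a stationary workload the weight of the cheapest policy dominates after a few intervals; if the workload changes, the accumulated losses shift and a different policy can take over.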

Leakage Reduction Alternatives For Low Power D-Nuca Caches

By Manuel Comparetti ([email protected])
Advisors: Prof. G. Alia, Prof. C.A. Prete, Prof. P. Foglia
Università di Pisa, Italy
May 2011

Large last level caches (LLCs) are a common design choice for today’s high performance microprocessors, but ever shrinking feature sizes and high clock frequencies exacerbate the wire delay problem: wires don’t scale as transistors do, and tend to dominate the overall cache latency. Further, leakage power is becoming the main power issue for large LLCs in deep sub-micron processes.


Custom Architecture for Immersive-Audio Applications

By Dimitris Theodoropoulos ([email protected])
Advisors: Henk Sips, Georgi K. Kuzmanov, Georgi Gaydadjiev
TU Delft, The Netherlands
May 2011

In this dissertation, we propose a new approach for the rapid development of multi-core immersive-audio systems. We study two popular immersive-audio techniques, namely Beamforming and Wave Field Synthesis (WFS). Beamforming utilizes microphone arrays to extract acoustic sources recorded in a noisy environment. WFS employs large loudspeaker arrays to render moving audio sources, thus providing outstanding audio perception and localization. A survey of the literature reveals that the majority of such experimental and commercial audio systems are based on standard PCs, due to their high-level programming support and potential for rapid system development. However, these approaches introduce performance bottlenecks, excessive power consumption and increased overall cost. Systems based on DSPs consume very little power, but their performance is still limited. Custom-hardware solutions alleviate the aforementioned drawbacks, but designers primarily focus on performance optimization without providing a high-level interface for system control and test. To address these problems, we propose a custom platform-independent architecture that supports immersive-audio technologies for high-quality sound acquisition and rendering. An important feature of the architecture is that it is based on a multi-core processing paradigm. This allows the design of scalable and reconfigurable micro-architectures, with respect to the available hardware resources, and customizable implementations targeting multi-core platforms. To evaluate our proposal, we conducted two case studies. First, we implemented our architecture as a heterogeneous multi-core reconfigurable processor mapped onto FPGAs. Second, we applied our architecture to a wide range of contemporary GPUs. Our approach combines the software flexibility of GPPs with the computational power of multi-core platforms. Results suggest that employing GPUs and FPGAs for building immersive-audio systems leads to solutions that can achieve up to an order of magnitude better performance and reduced power consumption, while also decreasing the overall system cost, compared to GPP-based approaches.
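How a microphone array can pull a source out of a recording is easiest to see in the classic delay-and-sum beamformer. The abstract does not say which beamforming algorithm the dissertation uses, so the sketch below is only a generic illustration; it assumes integer sample delays that are already known, and the function name `delay_and_sum` is hypothetical.

```python
from typing import List, Sequence


def delay_and_sum(channels: Sequence[Sequence[float]], delays: Sequence[int]) -> List[float]:
    """Align each microphone channel by its sample delay and average the result.

    Assumption: delays[i] is the number of samples by which the source arrives
    later at microphone i than at the reference microphone.
    """
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    count = len(channels)
    return [
        sum(ch[d + n] for ch, d in zip(channels, delays)) / count
        for n in range(length)
    ]


if __name__ == "__main__":
    source = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
    mic0 = source            # reference microphone
    mic1 = [0.0] + source    # same signal, arriving one sample later
    print(delay_and_sum([mic0, mic1], [0, 1]))
```

Signals arriving from the steered direction add coherently after alignment, while uncorrelated noise is averaged down. Each output sample is independent of the others, which is one reason such kernels map well onto multi-core platforms.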

Modelling Embedded Applications for On-Chip Multiprocessing Platforms

By Sanna Määttä ([email protected])
Advisor: Prof. Jari Nurmi
TU Tampere, Finland
June 2011

The complexity of state-of-the-art embedded systems requires designers to focus on abstraction levels much higher than Register Transfer Level (RTL). As designers are used to RTL, system design often starts at levels of abstraction that are too close to implementation. Higher levels of abstraction substantially reduce the amount of detail designers need

NUCA (Non Uniform Cache Architecture) caches limit the impact of wire delay on performance by aggressively partitioning the cache into independently accessible small and fast sub-banks, interconnected by a scalable network-on-chip. D-NUCA caches, in particular, implement a migration mechanism for frequently accessed data, dynamically moving them into the banks closer to the controller, where they are accessed faster. By leveraging their modularity and the non-uniform distribution of data, it is possible to apply to D-NUCA caches a straightforward leakage reduction technique known as Way-Adaptable D-NUCA. This purely microarchitectural technique dynamically adapts the cache size to the needs of the running workloads by applying power gating with bank granularity, hence with very low design overhead. An entire group (a way) of banks is powered off or on by a simple reconfiguration algorithm in the course of the application.

In this thesis the effectiveness of this technique in a multi-programmed environment is investigated, showing its usefulness for a CMP system. It is also shown that, interestingly, the migration mechanism permits an efficient utilization of the fastest ways of the NUCA cache by both processors. The Way-Adaptable technique is then compared to other leakage reduction techniques applied to D-NUCA L2 caches, such as Drowsy Caching and Decay Line Caching. By gating entire groups of banks, the Way-Adaptable D-NUCA technique addresses the leakage power consumption of network routers and bank peripheral circuitry, unlike other leakage reduction techniques, which focus on SRAM cells. Further, for deep sub-micron processes, the effectiveness of Drowsy Caching is limited by statistical process variation. We show how it is possible to combine the Way-Adaptable technique with Drowsy Caching in order to cope with process-dependent limitations of leakage reduction techniques for deep sub-micron D-NUCA caches.
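The way-level power gating described above can be pictured with a small reconfiguration step. The abstract does not give the actual algorithm, so this is a sketch under stated assumptions: the decision is driven by the share of hits landing in the farthest powered way, and the function `reconfigure` and its thresholds `low` and `high` are hypothetical.

```python
from typing import List


def reconfigure(active_ways: int, hits_per_way: List[int], total_ways: int,
                low: float = 0.02, high: float = 0.10) -> int:
    """Return how many ways to keep powered during the next interval.

    hits_per_way[i] counts hits in way i over the last interval, with way 0
    closest to the controller. The thresholds are hypothetical tuning knobs.
    """
    total_hits = sum(hits_per_way[:active_ways])
    if total_hits == 0:
        return max(1, active_ways - 1)  # idle cache: shrink, but keep one way on
    farthest_share = hits_per_way[active_ways - 1] / total_hits
    if farthest_share < low and active_ways > 1:
        return active_ways - 1  # farthest way is barely used: gate it off
    if farthest_share > high and active_ways < total_ways:
        return active_ways + 1  # demand reaches the edge: power a way back on
    return active_ways


if __name__ == "__main__":
    print(reconfigure(4, [900, 80, 15, 5, 0, 0, 0, 0], total_ways=8))
    print(reconfigure(3, [500, 300, 200, 0, 0, 0, 0, 0], total_ways=8))
```

Because migration pulls frequently accessed data towards the controller, few hits in the farthest powered way suggest that the working set fits in the nearer ways, which is the intuition this sketch relies on.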


systems, better understand the system under development, visualise a system, specify the structure and behaviour of the system, validate the system behaviour, and document the design decisions. Moreover, modelling reduces development time and costs. This thesis describes a model-based approach for embedded

Hardware Acceleration of Bioinformatics Sequence Alignment Applications

By Laiq Hasan ([email protected])
Advisors: Prof. H. J. Sips, Dr. Z. Al-Ars
TU Delft, The Netherlands
June 2011

Biological sequence alignment is an important and challenging task in bioinformatics. Alignment may be defined as an arrangement of two or more DNA or protein sequences to highlight the regions of their similarity. Sequence alignment is used to infer the evolutionary relationship between a set of protein or DNA sequences. An accurate alignment can provide valuable information for experimentation on newly found sequences. It is indispensable in basic research as well as in practical applications such as pharmaceutical development, drug discovery, disease prevention and criminal forensics.

Many algorithms and methods, such as dot plot, Needleman-Wunsch, Smith-Waterman, FASTA, BLAST, HMMER and ClustalW, have been proposed to perform and accelerate sequence alignment activities. However, with the ever increasing volume of data in bioinformatics databases, the time needed for biological sequence alignment keeps increasing. The main aim of the research presented in this thesis is to explore and analyze the existing sequence alignment methods and come up with better and optimized solutions.
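Of the methods listed above, Smith-Waterman is compact enough to show in full. The following is a minimal software sketch of its scoring recurrence with a linear gap penalty; the scoring values (`match`, `mismatch`, `gap`) are illustrative defaults chosen here, and the code is not the accelerated design developed in the thesis.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Return the best local-alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] is the best score of a local alignment ending at a[i-1], b[j-1].
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap inserted in b
            left = score[i][j - 1] + gap    # gap inserted in a
            score[i][j] = max(0, diag, up, left)
            best = max(best, score[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman("GGACGT", "ACG"))
```

Each cell depends only on its upper, left and upper-left neighbours, so all cells on the same anti-diagonal can be computed in parallel. That property is what hardware implementations of this recurrence commonly exploit.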

to consider, enabling complex system design in a shorter time. Modelling and simulation are essential methods in state-of-the-art embedded system design. In model-based design, a system model is the key element of the design process from the specification to the implementation. Modelling helps designers to manage complex

application modelling and validation together with an on-chip multiprocessing platform. The aim of the work was to facilitate the programming of multiprocessing systems as well as to enable early system validation, design space exploration, and performance evaluation.

Efficient Execution of Video Applications on Heterogeneous Multi- and Many-Core Processors

By Arnaldo Pereira de Azevedo Filho ([email protected])
Advisor: Prof. Ben Juurlink
TU Delft, The Netherlands
June 2011

This thesis presents methodologies and evaluations aimed at increasing the efficiency of video coding applications for heterogeneous many-core processors composed of SIMD-only, scratchpad memory based cores. First, we present the 3D-Wave parallelization strategy for video decoding, which scales to many-core processors. It is based on the observation that dependencies between frames are related to the motion compensation kernel and that motion vectors are usually within a small range. The 3D-Wave strategy combines macroblock-level parallelism with frame- and slice-level parallelism by overlapping the decoding of frames while dynamically managing macroblock dependencies. The 3D-Wave was implemented and evaluated on a simulated many-core embedded processor consisting of 64 cores. Policies for reducing memory footprint and latency are presented. The effects of memory latency, cache size, and synchronization latency are studied.

Next, we assess SIMD-only cores in light of the increasing complexity of current multimedia kernels. We evaluate the suitability of SIMD-only cores for the increasingly divergent branching in video processing algorithms. The H.264 Deblocking Filter is used as a test case. Also, the overhead imposed by the lack of a scalar processing unit in SIMD-only cores is measured using two methodologies. Low area overhead solutions are proposed to add scalar support to SIMD-only cores.

Finally, we focus on the memory hierarchy and propose a new software cache organization to increase the efficiency and efficacy of scratchpad memories for unpredictable and indirect memory accesses. The proposed Multidimensional Software Cache (MDSC) reduces software cache overhead by allowing the programmer to exploit known access behavior in order to reduce the number of accesses to the software cache and by grouping memory requests. An instruction to accelerate MDSC lookup is also presented and analyzed.
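The macroblock-level parallelism that the 3D-Wave strategy builds on can be sketched as a wavefront schedule within a single frame. The dependency pattern used here (left, upper-left, upper and upper-right neighbours) is an assumption based on the usual H.264 intra-frame dependencies; it is not stated in the abstract, and the inter-frame overlap that makes the strategy three-dimensional is left out. The function name `wavefront_schedule` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Block = Tuple[int, int]  # (x, y) macroblock coordinates


def wavefront_schedule(width: int, height: int) -> Dict[int, List[Block]]:
    """Group macroblocks by the earliest step at which each can be decoded.

    Assumption: a macroblock waits for its left, upper-left, upper and
    upper-right neighbours, where they exist.
    """
    slot: Dict[Block, int] = {}
    for y in range(height):
        for x in range(width):
            deps = [(x - 1, y), (x - 1, y - 1), (x, y - 1), (x + 1, y - 1)]
            ready = [slot[d] for d in deps if d in slot]
            slot[(x, y)] = max(ready) + 1 if ready else 0
    steps: Dict[int, List[Block]] = defaultdict(list)
    for block, step in slot.items():
        steps[step].append(block)
    return dict(steps)


if __name__ == "__main__":
    for step, blocks in sorted(wavefront_schedule(4, 3).items()):
        print(step, blocks)
```

All macroblocks that share a step are mutually independent and could be decoded on different cores. In a small frame only a few blocks share a step, which illustrates why adding frame-level overlap helps to keep many cores busy.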


HiPEAC Info is a quarterly newsletter published by the HiPEAC Network of Excellence, funded by the 7th European Framework Programme (FP7) under contract no. IST-217068.
Website: http://www.HiPEAC.net
Subscriptions: http://www.HiPEAC.net/newsletter

The 7th International Conference on High Performance and Embedded Architectures and Compilers (HiPEAC 2012), 23-25 January 2012, Paris, France, http://www.hipeac.net/conference

Contributions
If you are a HiPEAC member and would like to contribute to future HiPEAC newsletters, please contact Rainer Leupers at [email protected]

The 20th International Conference on Parallel Architectures and Compilation Techniques (PACT), 10 October, 2011, Galveston Island, Texas, USA, http://www.pactconf.org/

The 29th IEEE International Conference on Computer Design (ICCD 2011), 9-12 October, 2011, Amherst, USA, http://iccd-conference.com

The 9th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2011), 24-26 October, 2011, Melbourne, Australia, http://anss.org.au/euc2011

International Symposium on System-on-Chip (SoC 2011), 1-2 November 2011, Tampere, Finland, http://soc.cs.tut.fi/

Barcelona Multi-Core Workshop (BMW 2011), 2 November 2011, Barcelona, Spain, http://www.bscmsrc.eu/media/events/barcelona-multicore-workshop-201

Conference on Design and Architectures for Signal and Image Processing (DASIP 2011), 2-4 November 2011, Tampere, Finland, http://www.ecsi.org/dasip/

The 2nd International Congress on Computer Applications and Computational Science (CACS 2011), 15-17 November 2011, Bali, Indonesia, http://irast.net/conferences/CACS/2011

The 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-44), 3-7 December, 2011, Porto Alegre, Brazil, http://www.microarch.org/micro44

The 17th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2012), 3-7 March, 2012, London, UK, http://research.microsoft.com/asplos_2012

The 15th Design, Automation and Test in Europe Conference (DATE 2012), 12-16 March 2012, Dresden, Germany, http://www.date-conference.com/

The International Conference on Compiler Construction (CC 2012), 24 March - 1 April 2012, Tallinn, Estonia, http://conferences.inf.ed.ac.uk/cc2012

The 9th European Dependable Computing Conference (EDCC-2012), 8-11 May 2012, Sibiu, Romania, http://edcc.dependability.org/


Upcoming Events