LCU14-303: Toolchain Collaboration


DESCRIPTION

★ Session Summary ★

This session will be working through the planned open source contributions from Linaro, ARM, and other members who want to share their open source contribution plans for the next year. Projects to be included are: gcc, llvm, glibc, gold, gdb, binutils.

★ Resources ★

Zerista: http://lcu14.zerista.com/event/member/137749
Google Event: https://plus.google.com/u/0/events/c3knobs1t2fd2vi9f9mhejehts0
Video: https://www.youtube.com/watch?v=b-mtKxOm0m8&list=UUIVqQKxCyQLJS6xvSmfndLA
Etherpad: http://pad.linaro.org/p/lcu14-303

★ Event Details ★

Linaro Connect USA - #LCU14
September 15-19th, 2014
Hyatt Regency San Francisco Airport

http://www.linaro.org
http://connect.linaro.org


LCU14 BURLINGAME

Ryan Arnold, LCU14

LCU14-303: Toolchain Collaboration

● Participants
  ● Linaro
  ● ARM
  ● QuIC
  ● Cavium
  ● ST

● Topics
  ● Participant Introductions and Development Focus
  ● GNU Toolchain Roadmaps
  ● GNU Toolchain Specifics
  ● LLVM Roadmaps
  ● LLVM Specifics
  ● System Libraries, Linkers, Debuggers, and Tools

Toolchain Collaboration For The Next 6 Months

● Representation
  ● Ryan Arnold - Engineering Manager
  ● Maxim Kuvyrkov - Tech Lead
  ● Team - 6 Linaro employees and 6 member assignees
    ● Kugan Vivekenandarajah, Venkataramanan Kumar, Bernie Ogden, Omair Javaid, Will Newton, Rob Savoye, Michael Collison, Christophe Lyon, Charles Baylis, Yvan Roux, Renato Golin, Wang Deqiang

● Purpose
  ● Improve Collaboration
  ● Eliminate Roadmap Redundancy
  ● Identify gaps in eco-system

Linaro - Introduction & Purpose

● Product Validation Framework Improvements
  ● Backport, Release, and Binary Toolchain validation automation and reporting

● Toolchain Performance
  ● GCC and LLVM Performance

● Benchmark Automation
  ● Backport, Release, and Binary Toolchain benchmark automation and reporting

● Product offering expansions in 2015
  ● x86_64 hosted cross toolchains
  ● AArch32 targeted cross toolchains
  ● ARMv7 and ARMv8 hosted toolchains

Linaro - Focus from LCA14 into 2015

PUBLIC

Open Source Core Toolchains: The Next Six Months

Matthew Gretton-Dann, August 2014


▪ Tell you what ARM plans to work on, and what its current priorities are
  ▪ However, things are likely to change – so:
    ▪ We do not promise to achieve all of this in the next six months; nor
    ▪ Do we promise not to do other work

▪ If your plans include the same topics, or work in the same areas
  ▪ Come and talk to us – we should work together
  ▪ Preferably this conversation should happen in the appropriate upstream communities.

▪ If you feel that we’re doing the wrong thing
  ▪ Come and talk to us – we’re happy to work out a better way forward

▪ We are moving to tracking all our ‘public’ work in the appropriate community Bugzilla databases.
  ▪ This is the best place to have the conversation about best ways forward.

Purpose of this Presentation


▪ Support the Architecture & Cores
  ▪ Teams are involved in development of new cores and architecture extensions
  ▪ We will not discuss those here
  ▪ However, we plan to upstream functionality as soon as possible after public announcements

▪ Support the Community

▪ Improve Performance:
  ▪ Focus on Cortex-A57 performance improvement
  ▪ Focus on a range of benchmarks, including industry standard CPU benchmarks.
  ▪ We analyze benchmarks both:
    ◦ for improvements we can make to the toolchains; and
    ◦ to note any regressions and get them fixed in co-operation with the community

Overview of Goals for Year

ST - Introduction & Purpose

ST - Focus from LCU14 into 2015

QuIC - Introduction & Purpose

QuIC - Focus from LCU14 into 2015

● Supports GNU based ThunderX toolchain internally (and other Cavium products)
● Make sure that GCC performance areas are covered but not twice
● Implemented ILP32 support in the kernel, glibc, and parts of gcc and binutils
● Helped in getting some performance improvements for AArch64 already
  ○ Naveen implemented many patterns in the back-end for the instructions which were not being emitted
  ○ Andrew helped with part of conditional compares; improving ifcombine
  ○ Added issue rate to the AArch64 cost table
  ○ Added trap pattern so abort function is not used for __builtin_trap
  ○ Removed some redundant cmp’s
● Added many new testcases

Cavium - Introduction

● Finish upstreaming ILP32 support
  ○ Including gdb and glibc support
  ○ glibc patch is almost done, just finalizing the patch set
● Upstream base ThunderX support
  ○ Will not include a schedule model to begin with
● Upstreaming patches for GCC 6 stage 1
  ○ Conditional moves improvements
  ○ Improvements to conditional compares
  ○ Large system extension support in GCC
    ■ Joel posted an infrastructure change that was rejected; might need to rewrite it
  ○ LSE HWCAP support in glibc and kernel
    ■ Need to know what path is acceptable for glibc
  ○ Some tweaks to the cost tables in AArch64; needed for ThunderX support
● Looking into prefetch loop arrays

Cavium - Focus from LCU14 into 2015

GNU Toolchain Collaboration

● Continue Member Driven Optimizations; current examples in development:
  ● Zero/sign-extension elimination using value-range propagation
  ● NEON intrinsics improvements in Libvpx on ARM & AArch64
  ● STREAMS performance improvements

● Identify Linaro Toolchain product driven optimizations
  ● Benchmarking Linaro toolchain products
  ● Identifying Regressions
  ● Improving performance based on investigations

● Performance Comparisons
  ● Identify potential optimizations based on performance gains seen on other architectures.

● Future
  ● Whole System Profiling & Workload Profiling
  ● Feature exploitation
    ● LTO for AArch64

Linaro - What’s Next for GCC Performance?
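
The zero/sign-extension item on this slide relies on value-range propagation: when the compiler can prove that a value already fits the narrower type, the explicit extension instruction (for example `uxth` or `sxth` on ARM) does no work and can be dropped. A minimal Python model of that check follows. It is a sketch only; the function names and the interval representation are illustrative assumptions, not GCC internals.

```python
def type_range(bits, signed):
    """Return the (min, max) values representable in an integer type."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def add_ranges(a, b):
    """Interval addition: the range of x + y given ranges for x and y."""
    return a[0] + b[0], a[1] + b[1]


def extension_is_redundant(value_range, bits, signed):
    """True if every value in value_range already fits the narrow type,
    so re-extending it from `bits` bits cannot change the value."""
    lo, hi = value_range
    tmin, tmax = type_range(bits, signed)
    return tmin <= lo and hi <= tmax


# Two unsigned 8-bit loads, summed in a wider register.
byte = type_range(8, signed=False)
total = add_ranges(byte, byte)

# The sum always fits an unsigned 16-bit variable, so no extension is needed.
print(extension_is_redundant(total, 16, signed=False))
# It may overflow 8 bits, so an 8-bit result still needs the extension.
print(extension_is_redundant(total, 8, signed=False))
```

A real pass works on the compiler's intermediate representation and has to handle wrap-around and many more operations; the model only shows the range test that justifies removing the instruction.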

● Improve NEON testing coverage and correctness
● GCC community stewardship
  ● bug triage
  ● patch review

● Unified Driver Development
● LLVM Community Releases

Linaro - Community Involvement

● Improve validation of Linaro GCC source package backports
  ● Improve automation
  ● Add default configurations validated per backport: 8 17

● Provide expansive source release validation of existing products
  ● all default configurations
  ● all enabled secondary configurations
  ● all supported languages
  ● various tunings

● Offer new products
  ● arm and aarch64 native binary toolchains
  ● x86_64 hosted cross toolchains
  ● AArch32 targeted cross toolchains

Linaro - What’s Next for Product Offerings?
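
The release validation described on this slide (every default configuration, every supported language, various tunings) amounts to enumerating a job matrix. A small sketch of that enumeration follows. The target triplets, language list, and `-mcpu` tunings are illustrative assumptions, not Linaro's actual validation set.

```python
from itertools import product

# Hypothetical inputs: tunings are keyed by target so that only
# combinations valid for that target are generated.
TUNINGS = {
    "arm-linux-gnueabihf": ["-mcpu=cortex-a9", "-mcpu=cortex-a15"],
    "aarch64-linux-gnu": ["-mcpu=cortex-a53", "-mcpu=cortex-a57"],
}
LANGUAGES = ["c", "c++", "fortran"]


def validation_matrix(tunings_by_target, languages):
    """Return one job per (target, language, tuning) combination."""
    jobs = []
    for target, tunings in sorted(tunings_by_target.items()):
        for language, tuning in product(languages, tunings):
            jobs.append({"target": target, "language": language, "tuning": tuning})
    return jobs


jobs = validation_matrix(TUNINGS, LANGUAGES)
print(len(jobs), "validation jobs")
for job in jobs[:3]:
    print(job)
```

Keying the tunings by target avoids generating jobs that cannot run, such as an AArch64 build tuned for a 32-bit-only core.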

● Release Candidate Benchmarking
  ○ Current Release Benchmarking
    ■ Manual SPEC2K (looking for release regressions)
  ○ Future Release Benchmarking
    ■ Automated SPEC2K, SPEC2K6, EEMBC Suite
● Backport Validation Benchmarking
  ○ Current Backport Benchmarking
    ■ None
  ○ Future Backport Benchmarking
    ■ Automated Coremark in development
● Reporting - uploading permitted relative results to members only portal
● Why does Linaro do benchmarking?
  ○ Guides future development
  ○ Informs validity of patches in development
    ■ Current Development Benchmarking
      ● as-needed: Coremark, SPEC2K, SPEC2K6

Linaro - What’s next for Benchmarking?
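
The slide mentions reporting relative results and looking for release regressions. One way to picture both steps is to divide each candidate score by its baseline score and flag ratios that fall below a threshold. The sketch below assumes "higher is better" scores; the benchmark names, the numbers, and the 2% threshold are made up for illustration and do not describe Linaro's actual tooling.

```python
def relative_results(baseline, candidate):
    """Map each benchmark present in both runs to candidate/baseline.

    Scores are assumed to be "higher is better", so a ratio below 1.0
    means the candidate toolchain is slower.
    """
    return {
        name: candidate[name] / baseline[name]
        for name in baseline
        if name in candidate
    }


def regressions(relative, threshold=0.98):
    """Return the benchmarks whose ratio falls below the threshold."""
    return sorted(name for name, ratio in relative.items() if ratio < threshold)


# Made-up scores, for illustration only.
baseline = {"coremark": 100.0, "spec2k_int": 200.0}
candidate = {"coremark": 95.0, "spec2k_int": 202.0}

relative = relative_results(baseline, candidate)
print(relative)
print("regressions:", regressions(relative))
```

Only the ratios leave the function, which matches the idea of publishing relative rather than absolute numbers.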


GNU Roadmap : Cortex-A

[Roadmap chart. Rows: Mobile, Enterprise, Common. Columns: 2014, H1 2015, Future. Status legend: Released, Development, Adv. Planning, Concept. Release markers: GCC 4.9, GCC 4.10 / 5.0.]

Chart entries:

ARMv8 A32 - ISA extension
Cortex-A12 - Arch support
A64 toolchain production ready - GCC 4.9
Cortex-A12/A17 - uArch tuning, cost model
A64 performance gains
ACLE 64 - Specification
Cortex-A57 - uArch tuning, cost model
ILP32 - User space & production
ACLE 64 - Implementation
Big Endian - AArch64 auto-vectorization
Maths libraries
A64 GOLD
Big Endian – Basic AArch64 support
Performance optimization - CPU-centric performance enhancements
Toolchain features - Continuous ecosystem contribution for performance and features, NEON intrinsics
A7/A15 A32 big.LITTLE


▪ Reworked AArch64 RTX costs
▪ Improved Neon intrinsics code generation
▪ PUSH_ARGS_REVERSED improvements.
▪ GLIBC math library improvements for AArch32 and AArch64
▪ Improved code generation for copysign intrinsic
▪ Improved choice of spill size for FP registers (decreasing memory bandwidth)
▪ Restructured and improved prologue/epilogue sequences – especially with -fomit-frame-pointer.
▪ Improved addressing modes for vectors on AArch64
▪ Improve AArch32 memset inlining

What We’ve Done In the Past Three Months or So: GNU Toolchain


▪ General bug fixes and maintenance
▪ Enable shrink-wrapping for AArch64 (GNUTOOLS-2476)
▪ Investigate and initial RFC for better load store pair generation (GNUTOOLS-154)
▪ Improved bit field handling instructions (GNUTOOLS-197)
▪ Big Endian AArch64 fixes (Focused on SIMD and vectorisation correctness)
▪ Improved Register move costs (GNUTOOLS-4528)
▪ Misc performance improvements based on scheduler / backend tweaks (GNUTOOLS-4317, GNUTOOLS-4508)
▪ Improved csinc / csneg generation (GNUTOOLS-4335)
▪ Conditional compares
▪ Core tuning: Cortex-A57, Cortex-A12 and Cortex-A17
▪ IVOpts improvements
▪ Memcpy for AArch64 – inlining and improved alignment

What’s Next: GCC – Things to do before Stage 1 closes (mid-October 2014)


▪ Stage 3
  ▪ Bug fixes/Regression fixes.
  ▪ Improved conformance and performance for Advanced SIMD Intrinsics.

▪ Stage 4
  ▪ Regression fixes.
  ▪ Help community get GCC 5.0 released.

What’s Next: GCC – During Stage 3 (October – December 2014) and Stage 4 (Early 2015)


▪ Maintenance
▪ Support the architectural roadmap
▪ Help community get Binutils 2.25 released.

What’s Next: Binutils & GDB

QuIC - GNU Toolchain Roadmap

QuIC - GNU Toolchain Details

ST - GNU Toolchain Roadmap

ST - GNU Toolchain Details

LLVM Collaboration

● Become the compiler of choice for all Qualcomm processor cores
  ● Today LLVM is the compiler of choice for DSP and GPU
  ● Would like to see LLVM reach that level of acceptance for CPU before the end of 2015

● Realize the full benefits of code hygiene on ARM from LLVM’s family of projects, i.e., sanitizers.

QuIC - Goals for LLVM

● Collaborated with ARM on initial AArch64 backend
● Worked with the community on the ARM64/AArch64 merge

● Cortex-A53 machine description
● Cortex-A57 machine description

● Contributed initial AArch64 ELF support to lld
● ASAN bug fixes

QuIC - What has QuIC done with LLVM

● Continue weekly collaboration with ARM on performance optimizations, particularly AArch64.

● Greedy inliner
● PGO
● Incremental use of sanitizers

QuIC - What QuIC will be working on

● Community Maintainership
  ● LLVM 3.5 and LLVM 3.6 release maintainership

● Support
  ● LLVM Kernel initiative, Android bugs, buildbots, member support

● LLVM Toolchain Stability
  ● Assembler, compiler libraries, linker, tools, libc++

● LLVM Performance
  ● Benchmarking & Profiling
  ● Comparing against GCC/x86
  ● Performance parity of 32-bit vs. 64-bit

● Sanitizers - might be covered under GCC development plan

● LLVM Linker
● LLVM Integration on Android for AArch64

Linaro - What’s Next For LLVM in Linaro?

current staff coverage line


LLVM Roadmap : Cortex-A

[Roadmap chart. Rows: Mobile, Enterprise, Common. Columns: 2014, H1 2015, Future. Status legend: Released, Development, Adv. Planning, Concept. Release markers: LLVM 3.4, LLVM 3.5, LLVM 3.6.]

Chart entries:

v8 NEON - AArch64
Big Endian - Basic
Benchmarking infrastructure – Public performance tracking buildbot
libc++ buildbot
Initial Autovectorization
AArch32 buildbot
Cortex-A53 - uArch tuning
AArch64 and Cortex-A57 - Performance tuning
ARM64 / AArch64 backend merge


▪ Completion of the ARM64 and AArch64 backend merge
▪ Performance improvements:
  ▪ Improve code generation for converting in-memory 16-bit integer to 64-bit float (LLVM-1508)
  ▪ Optimistically use ‘sqrt’ instruction where available, and only fall back to a library call in the presence of NaNs (LLVM-1509)
  ▪ Reduce spilling of Q registers (LLVM-1538)
  ▪ Improve code selection between conditional instructions and branches. (LLVM-1489)
  ▪ A57 Fused multiply tuning (LLVM-1610)
  ▪ Improve Global Value Numbering (LLVM-1612)

▪ Re-engineering of ARM Neon intrinsic support
▪ Big Endian Support - AArch32 & initial AArch64 support
▪ Stack size reduction patches – some work still to do.

What We’ve Done In the Past Three Months or So: LLVM Toolchain
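
One plausible reading of the ‘sqrt’ item (LLVM-1509): a C `sqrt` call may have to report a domain error for negative inputs, which a bare hardware instruction cannot do. The compiler can therefore emit the instruction first and call the library only when the result is NaN, which is the only case where error reporting could matter. The Python model below illustrates that control flow. `hw_sqrt`, `libm_sqrt`, and the `errors` list are stand-ins invented for this sketch, not LLVM or libm interfaces.

```python
import math


def hw_sqrt(x):
    """Model of a square-root instruction: never traps and never sets
    errno; invalid inputs simply produce NaN."""
    if math.isnan(x) or x < 0.0:
        return math.nan
    return math.sqrt(x)


def libm_sqrt(x, errors):
    """Model of the library call: same value as the instruction, but it
    also records a domain error (standing in for errno = EDOM)."""
    if x < 0.0:
        errors.append("EDOM")
    return hw_sqrt(x)


def sqrt_with_fallback(x, errors):
    """Use the instruction first; call the library only if the result is NaN."""
    result = hw_sqrt(x)
    if result != result:  # NaN is the only value not equal to itself
        return libm_sqrt(x, errors)
    return result


errors = []
print(sqrt_with_fallback(9.0, errors), errors)
print(sqrt_with_fallback(-1.0, errors), errors)
```

For ordinary non-negative inputs the library is never called, so the common path costs one instruction and one self-comparison.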


▪ Inline parameter tuning (LLVM-1500)
▪ Improve spilling heuristics (LLVM-1524, LLVM-1504, LLVM-1586)
▪ Common expression hoisting (LLVM-1247, LLVM-1490, LLVM-1550)
▪ TBNZ and CBNZ optimization (LLVM-1575)
▪ Register coalesce and rematerialization (LLVM-1582)
▪ Redundant common comparison expressions (LLVM-1491)
▪ Loop induction variable selection (LLVM-1492)
▪ Remove redundant stores
▪ Improved usage of vectorization opportunities using structs (LLVM-1501)
▪ Reduce xzr assignment on cbz target (LLVM-1583)

What’s Next: LLVM – Performance


▪ Global variable store should be hoisted (LLVM-1493)
▪ Too many MOVs on function call boundaries (LLVM-1504)
▪ Optimise LDR, LDRSW sequence into LDR, SXTW (LLVM-1581)
▪ Tune loop unrolling (LLVM-1587, LLVM-1590, LLVM-1646)

What’s Next: LLVM – Performance


▪ Buildbots & benchmarking infrastructure
  ▪ Plan to set up a public performance tracking bot on Juno-A57
  ▪ To be publicly visible, maintained, and continuously producing performance numbers
  ▪ Running the LLVM LNT test-suite as a benchmark

▪ Various bug fixes and improvements
  ▪ Focus on ARMv8-A, ARMv7, and ARMv6-M.

▪ Support for selected ACLE (non Neon) intrinsics

What’s Next: LLVM - Other

ST - LLVM Toolchain Roadmap

ST - LLVM Toolchain Details

System Libraries, Tools, Debuggers Collaboration

● System Libraries
  ● malloc benchmarking
  ● malloc improvements
  ● string and memory function optimizations for arm-linux-gnueabihf
  ● Linaro GDB and glibc source package releases with backported optimizations

● GDB
  ● Finish GDB on Android for ARMv8 support - catchpoints
  ● AArch32/AArch64 completeness - test-suite parity
  ● AArch32 mixed-mode debugging (Thumb and ARM modes)

Linaro - What’s next for system libs & dev tools?


▪ String routine improvements
▪ Maintenance activities.
▪ Help community get 2.21 released.

What’s Next: Glibc – up to 2.21 release (end of 2014)


▪ Linkers: LLD & Gold
▪ Libc++
▪ Sanitizers
▪ ILP32

What We Are Not Currently Doing But Are Interested In…

QuIC - Libraries, Linkers, Debuggers, Tools

ST - Libraries, Linkers, Debuggers, Tools

More about Linaro Connect: connect.linaro.org
Linaro members: www.linaro.org/members
More about Linaro: www.linaro.org/about/