LCU14 BURLINGAME Ryan Arnold, LCU14 LCU14-303: Toolchain Collaboration

LCU14 303- Toolchain Collaboration

DESCRIPTION

★ Session Summary ★
This session will be working through the planned open source contributions from Linaro, ARM, and other members who want to share their open source contribution plans for the next year. Projects to be included are: gcc, llvm, glibc, gold, gdb, binutils.

★ Resources ★
Zerista: http://lcu14.zerista.com/event/member/137749
Google Event: https://plus.google.com/u/0/events/c3knobs1t2fd2vi9f9mhejehts0
Video: https://www.youtube.com/watch?v=b-mtKxOm0m8&list=UUIVqQKxCyQLJS6xvSmfndLA
Etherpad: http://pad.linaro.org/p/lcu14-303

★ Event Details ★
Linaro Connect USA - #LCU14
September 15-19th, 2014
Hyatt Regency San Francisco Airport

http://www.linaro.org
http://connect.linaro.org

Page 2: LCU14 303- Toolchain Collaboration

Toolchain Collaboration For The Next 6 Months

● Participants
  ○ Linaro
  ○ ARM
  ○ QuIC
  ○ Cavium
  ○ ST
● Topics
  ○ Participant Introductions and Development Focus
  ○ GNU Toolchain Roadmaps
  ○ GNU Toolchain Specifics
  ○ LLVM Roadmaps
  ○ LLVM Specifics
  ○ System Libraries, Linkers, Debuggers, and Tools

Page 3: LCU14 303- Toolchain Collaboration

Linaro - Introduction & Purpose

● Representation
  ○ Ryan Arnold - Engineering Manager
  ○ Maxim Kuvyrkov - Tech Lead
  ○ Team - 6 Linaro employees and 6 member assignees
    ■ Kugan Vivekenandarajah, Venkataramanan Kumar, Bernie Ogden, Omair Javaid, Will Newton, Rob Savoye, Michael Collison, Christophe Lyon, Charles Baylis, Yvan Roux, Renato Golin, Wang Deqiang
● Purpose
  ○ Improve Collaboration
  ○ Eliminate Roadmap Redundancy
  ○ Identify gaps in eco-system

Page 4: LCU14 303- Toolchain Collaboration

Linaro - Focus from LCA14 into 2015

● Product Validation Framework Improvements
  ○ Backport, Release, and Binary Toolchain validation automation and reporting
● Toolchain Performance
  ○ GCC and LLVM Performance
● Benchmark Automation
  ○ Backport, Release, and Binary Toolchain benchmark automation and reporting
● Product offering expansions in 2015
  ○ x86_64 hosted cross toolchains
  ○ AArch32 targeted cross toolchains
  ○ ARMv7 and ARMv8 hosted toolchains

Page 5: LCU14 303- Toolchain Collaboration

PUBLIC

Open Source Core Toolchains: The Next Six Months

Matthew Gretton-Dann, August 2014

Page 6: LCU14 303- Toolchain Collaboration

Purpose of this Presentation

▪ Tell you what ARM plans to work on, and what its current priorities are
▪ However, things are likely to change – so:
  ▪ We do not promise to achieve all of this in the next six months; nor
  ▪ Do we promise not to do other work
▪ If your plans include the same topics, or work in the same areas
  ▪ Come and talk to us – we should work together
  ▪ Preferably this conversation should happen in the appropriate upstream communities.
▪ If you feel that we’re doing the wrong thing
  ▪ Come and talk to us – we’re happy to work out a better way forward
▪ We are moving to tracking all our ‘public’ work in the appropriate community Bugzilla databases.
  ▪ This is the best place to have the conversation about best ways forward.

Page 7: LCU14 303- Toolchain Collaboration

Overview of Goals for Year

▪ Support the Architecture & Cores
  ▪ Teams are involved in development of new cores and architecture extensions
  ▪ We will not discuss those here
  ▪ However, we plan to upstream functionality as soon as possible after public announcements
▪ Support the Community
▪ Improve Performance:
  ▪ Focus on Cortex-A57 performance improvement
  ▪ Focus on a range of benchmarks, including industry standard CPU benchmarks.
  ▪ We analyze benchmarks both:
    ◦ for improvements we can make to the toolchains; and
    ◦ to note any regressions and get them fixed in co-operation with the community

Page 8: LCU14 303- Toolchain Collaboration

ST - Introduction & Purpose

Page 9: LCU14 303- Toolchain Collaboration

ST - Focus from LCU14 into 2015

Page 10: LCU14 303- Toolchain Collaboration

QuIC - Introduction & Purpose

Page 11: LCU14 303- Toolchain Collaboration

QuIC - Focus from LCU14 into 2015

Page 12: LCU14 303- Toolchain Collaboration

Cavium - Introduction

● Supports GNU based ThunderX toolchain internally (and other Cavium products)
● Make sure that GCC performance areas are covered but not twice
● Implemented ILP32 support in the kernel, glibc and parts of gcc and binutils support
● Helped in getting some performance improvements for AARCH64 already
  ○ Naveen implemented many patterns in the back-end for the instructions which were not being emitted
  ○ Andrew helped with part of conditional compares; improving ifcombine
  ○ Added issue rate to the AARCH64 cost table
  ○ Added trap pattern so abort function is not used for __builtin_trap
  ○ Removed some redundant cmp’s
● Added many new testcases
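
The trap-pattern item on this slide concerns how GCC expands `__builtin_trap()`. GCC's documentation says the builtin is implemented either by a target-dependent mechanism (such as an illegal instruction) or by calling `abort`. A minimal sketch of source code that exercises it, assuming a GCC-compatible compiler; `checked_div` is a hypothetical helper for illustration, not code from the slides:

```c
/* Hypothetical helper for illustration: divide, trapping on a zero divisor. */
int checked_div(int num, int den)
{
    if (den == 0) {
        /* GCC/Clang builtin: terminates the program abnormally.
           With a target trap pattern the compiler can emit a trap
           instruction here instead of a call to abort(). */
        __builtin_trap();
    }
    return num / den;
}
```

Compiling this with `-O2 -S` and inspecting the assembly is one way to see which form a particular compiler build emits.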

Page 13: LCU14 303- Toolchain Collaboration

Cavium - Focus from LCU14 into 2015

● Finish upstreaming ILP32 support
  ○ Including gdb and glibc support
  ○ glibc patch is almost done, just finalizing the patch set
● Upstream base ThunderX support
  ○ Will not include a schedule model to begin with
● Upstreaming patches for GCC 6 stage 1
  ○ Conditional moves improvements
  ○ Improvements to conditional compares
  ○ Large system extension support in GCC
    ■ Joel posted an infrastructure change that was rejected; might need to rewrite them
  ○ LSE HWCAP support in glibc and kernel
    ■ Need to know what path is acceptable for glibc
  ○ Some tweaks to the cost tables in AARCH64; needed for ThunderX support
● Looking into prefetch loop arrays
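
For readers unfamiliar with ILP32: it is a data model in which `int`, `long`, and pointers are all 32 bits wide, here on a 64-bit architecture. The sketch below reports which model a translation unit was built for. It relies on the `__ILP32__` and `__LP64__` macros predefined by GCC and Clang; the function names are hypothetical and chosen only for this example.

```c
/* Report the data model this translation unit was compiled for.
   __ILP32__ and __LP64__ are predefined by GCC and Clang. */
const char *data_model(void)
{
#if defined(__ILP32__)
    return "ILP32";   /* int, long and pointers are all 32 bits */
#elif defined(__LP64__)
    return "LP64";    /* long and pointers are 64 bits */
#else
    return "other";
#endif
}

/* Nonzero when long and pointers have the same width, which holds
   for both ILP32 and LP64. */
int long_matches_pointer(void)
{
    return sizeof(long) == sizeof(void *);
}
```

With an AArch64 GCC that has ILP32 support, the model is selected with `-mabi=ilp32` or `-mabi=lp64`.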

Page 14: LCU14 303- Toolchain Collaboration

GNU Toolchain Collaboration

Page 16: LCU14 303- Toolchain Collaboration

Linaro - What’s Next for GCC Performance?

● Continue Member Driven Optimizations
  current examples in development:
  ○ Zero/sign-extension elimination using value-range propagation
  ○ NEON intrinsics improvements in Libvpx on ARM & AArch64
  ○ STREAMS performance improvements
● Identify Linaro Toolchain product driven optimizations
  ○ Benchmarking Linaro toolchain products
  ○ Identifying Regressions
  ○ Improving performance based on investigations
● Performance Comparisons
  ○ Identify potential optimizations based on performance gains seen on other architectures.
● Future
  ○ Whole System Profiling & Workload Profiling
  ○ Feature exploitation
    ■ LTO for AArch64
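
The extension-elimination item can be illustrated at the source level. When value-range information proves a narrow result already fits its type, the instruction that re-narrows it (for example a zero-extension such as UXTB on ARM) does no work and can be dropped. The two functions below are hypothetical examples written for this note; whether a given GCC release removes the instruction depends on the target and version.

```c
/* The mask limits v to [0, 15], so v + 1 lies in [1, 16] and always
   fits in 8 bits: a compiler that tracks value ranges can skip the
   zero-extension normally needed to narrow the result. */
unsigned char nibble_plus_one(unsigned char c)
{
    unsigned char v = c & 0x0F;
    return (unsigned char)(v + 1);
}

/* Here c + 1 can reach 256, so narrowing to 8 bits is observable
   (255 wraps to 0) and the extension has to stay. */
unsigned char byte_plus_one(unsigned char c)
{
    return (unsigned char)(c + 1);
}
```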

Page 17: LCU14 303- Toolchain Collaboration

Linaro - Community Involvement

● Improve NEON testing coverage and correctness
● GCC community stewardship
  ○ bug triage
  ○ patch review
● Unified Driver Development
● LLVM Community Releases

Page 18: LCU14 303- Toolchain Collaboration

Linaro - What’s Next for Product Offerings?

● Improve validation of Linaro GCC source package backports
  ○ Improve automation
  ○ Add default configurations validated per backport: 8 17
● Provide expansive source release validation of existing products
  ○ all default configurations
  ○ all enabled secondary configurations
  ○ all supported languages
  ○ various tunings
● Offer new products
  ○ arm and aarch64 native binary toolchains
  ○ x86_64 hosted cross toolchains
  ○ Aarch32 targeted cross toolchains

Page 19: LCU14 303- Toolchain Collaboration

Linaro - What’s next for Benchmarking?

● Release Candidate Benchmarking
  ○ Current Release Benchmarking
    ■ Manual SPEC2K (looking for release regressions)
  ○ Future Release Benchmarking
    ■ Automated SPEC2K, SPEC2K6, EEMBC Suite
● Backport Validation Benchmarking
  ○ Current Backport Benchmarking
    ■ None
  ○ Future Backport Benchmarking
    ■ Automated Coremark in development
● Reporting - uploading permitted relative results to members only portal
● Why does Linaro do benchmarking?
  ○ Guides future development
  ○ Informs validity of patches in development
    ■ Current Development Benchmarking
      ● as-needed: Coremark, SPEC2K, SPEC2K6
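
The slide mentions looking for release regressions and reporting relative results. One simple way to do both is to divide each candidate score by its baseline score and flag ratios that fall below a tolerance. The sketch below is an illustration only: the function names, the "higher is better" scoring convention, and the tolerance are assumptions, not a description of Linaro's automation.

```c
/* Assumption for this sketch: scores are "higher is better". */

/* Candidate score relative to the baseline (1.0 means no change). */
double relative_score(double baseline, double candidate)
{
    return candidate / baseline;
}

/* Flag a regression when the relative score drops more than
   `tolerance` (e.g. 0.02 for 2%) below the baseline. */
int is_regression(double relative, double tolerance)
{
    return relative < 1.0 - tolerance;
}

/* Count regressions across n benchmarks; the arrays are parallel. */
int count_regressions(const double *baseline, const double *candidate,
                      int n, double tolerance)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        double rel = relative_score(baseline[i], candidate[i]);
        if (is_regression(rel, tolerance))
            count++;
    }
    return count;
}
```

Publishing only the ratios keeps absolute scores private, which is one reason relative results are a common reporting format.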

Page 20: LCU14 303- Toolchain Collaboration

GNU Roadmap : Cortex-A

[Roadmap chart. Rows: MOBILE, ENTERPRISE, COMMON. Columns: 2014, H1 2015, Future. Status legend: Released, Development, Adv. Planning, Concept. Release markers: GCC 4.9, GCC 4.10 / 5.0. Item placement within the chart is not recoverable from the extracted text; the items are listed below.]

▪ ARMv8 A32 - ISA extension
▪ Cortex-A12 - Arch support
▪ A64 toolchain production ready - GCC 4.9
▪ Cortex-A12/A17 - uArch tuning, cost model
▪ A64 performance gains
▪ ACLE 64 - Specification
▪ Cortex-A57 - uArch tuning, cost model
▪ ILP32 - User space & production
▪ ACLE 64 - Implementation
▪ Big Endian - AArch64 auto-vectorization
▪ Maths libraries
▪ A64 GOLD
▪ Big Endian – Basic AArch64 support
▪ Performance optimization - CPU-centric performance enhancements
▪ Toolchain features - Continuous ecosystem contribution for performance and features, NEON intrinsics
▪ A7/A15 A32 big.LITTLE

Page 21: LCU14 303- Toolchain Collaboration

What We’ve Done In the Past Three Months or So: GNU Toolchain

▪ Reworked AArch64 RTX costs
▪ Improved Neon intrinsics code generation
▪ PUSH_ARGS_REVERSED improvements.
▪ GLIBC math library improvements for AArch32 and AArch64
▪ Improved code generation for copysign intrinsic
▪ Improved choice of spill size for FP registers (decreasing memory bandwidth)
▪ Restructured and improved prologue/epilogue sequences – especially with -fomit-frame-pointer.
▪ Improved addressing modes for vectors on AArch64
▪ Improved AArch32 memset inlining

Page 22: LCU14 303- Toolchain Collaboration

What’s Next: GCC – Things to do before Stage 1 closes (mid-October 2014)

▪ General bug fixes and maintenance
▪ Enable shrink-wrapping for AArch64 (GNUTOOLS-2476)
▪ Investigation and initial RFC for better load/store pair generation (GNUTOOLS-154)
▪ Improved bit field handling instructions (GNUTOOLS-197)
▪ Big Endian AArch64 fixes (Focused on SIMD and vectorisation correctness)
▪ Improved Register move costs (GNUTOOLS-4528)
▪ Misc performance improvements based on scheduler / backend tweaks (GNUTOOLS-4317, GNUTOOLS-4508)
▪ Improved csinc / csneg generation (GNUTOOLS-4335)
▪ Conditional compares
▪ Core tuning: Cortex-A57, Cortex-A12 and Cortex-A17
▪ IVOpts improvements
▪ Memcpy for AArch64 – inlining and improved alignment
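
For context on the csinc / csneg item: the A64 instruction CSINC selects either one register or another register plus one, and CSNEG selects either one register or the negation of another. The source patterns below are the kind of code that can map onto them. The function names are hypothetical, and whether a particular compiler build emits these instructions for this code is not something the slides state.

```c
/* cond ? a : b + 1  corresponds to  CSINC Xd, Xn, Xm, cond */
long select_or_increment(int cond, long a, long b)
{
    return cond ? a : b + 1;
}

/* cond ? a : -b  corresponds to  CSNEG Xd, Xn, Xm, cond */
long select_or_negate(int cond, long a, long b)
{
    return cond ? a : -b;
}
```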

Page 23: LCU14 303- Toolchain Collaboration

What’s Next: GCC – During Stage 3 (October – December 2014) and Stage 4 (Early 2015)

▪ Stage 3
  ▪ Bug fixes/Regression fixes.
  ▪ Improved conformance and performance for Advanced SIMD Intrinsics.
▪ Stage 4
  ▪ Regression fixes.
  ▪ Help community get GCC 5.0 released.

Page 24: LCU14 303- Toolchain Collaboration

What’s Next: Binutils & GDB

▪ Maintenance
▪ Support the architectural roadmap
▪ Help community get Binutils 2.25 released.

Page 25: LCU14 303- Toolchain Collaboration

QuIC - GNU Toolchain Roadmap

Page 26: LCU14 303- Toolchain Collaboration

QuIC - GNU Toolchain Details

Page 27: LCU14 303- Toolchain Collaboration

ST - GNU Toolchain Roadmap

Page 28: LCU14 303- Toolchain Collaboration

ST - GNU Toolchain Details

Page 29: LCU14 303- Toolchain Collaboration

LLVM Collaboration

Page 31: LCU14 303- Toolchain Collaboration

QuIC - Goals for LLVM

● Become the compiler of choice for all Qualcomm processor cores
  ○ Today LLVM is the compiler of choice for DSP and GPU
  ○ Would like to see LLVM reach that level of acceptance for CPU before the end of 2015
● Realize the full benefits of code hygiene on ARM from LLVM’s family of projects, i.e., sanitizers.

Page 32: LCU14 303- Toolchain Collaboration

QuIC - What has QuIC done with LLVM

● Collaborated with ARM on initial AArch64 backend
● Worked with the community on the ARM64/AArch64 merge
● CortexA53 machine description
● CortexA57 machine description
● Contributed initial AArch64 ELF support to lld
● ASAN bug fixes

Page 33: LCU14 303- Toolchain Collaboration

QuIC - What QuIC will be working on

● Continue weekly collaboration with ARM on performance optimizations, particularly AArch64.
● Greedy inliner
● PGO
● Incremental use of sanitizers

Page 34: LCU14 303- Toolchain Collaboration

Linaro - What’s Next For LLVM in Linaro?

● Community Maintainership
  ○ LLVM 3.5 and LLVM 3.6 release maintainership
● Support
  ○ LLVM Kernel initiative, Android bugs, buildbots, member support
● LLVM Toolchain Stability
  ○ Assembler, compiler libraries, linker, tools, libc++
● LLVM Performance
  ○ Benchmarking & Profiling
  ○ Comparing against GCC/x86
  ○ Performance parity of 32-bit vs. 64-bit
● Sanitizers - might be covered under GCC development plan
● LLVM Linker
● LLVM Integration on Android for AArch64

current staff coverage line

Page 35: LCU14 303- Toolchain Collaboration

LLVM Roadmap : Cortex-A

[Roadmap chart. Rows: MOBILE, ENTERPRISE, COMMON. Columns: 2014, H1 2015, Future. Status legend: Released, Development, Adv. Planning, Concept. Release markers: LLVM 3.4, LLVM 3.5, LLVM 3.6. Item placement within the chart is not recoverable from the extracted text; the items are listed below.]

▪ v8 NEON - AArch64
▪ Big Endian - Basic
▪ Benchmarking infrastructure – Public performance tracking buildbot
▪ libc++ buildbot
▪ Initial Autovectorization
▪ AArch32 buildbot
▪ Cortex-A53 - uArch tuning
▪ AArch64 and Cortex-A57 - Performance tuning
▪ ARM64 / AArch64 backend merge

Page 36: LCU14 303- Toolchain Collaboration

What We’ve Done In the Past Three Months or So: LLVM Toolchain

▪ Completion of the ARM64 and AArch64 backend merge
▪ Performance improvements:
  ▪ Improve code generation for converting in-memory 16-bit integer to 64-bit float (LLVM-1508)
  ▪ Optimistically use ‘sqrt’ instruction where available, and only fall back to a library call in the presence of NaNs (LLVM-1509)
  ▪ Reduce spilling of Q registers (LLVM-1538)
  ▪ Improve code selection between conditional instructions and branches. (LLVM-1489)
  ▪ A57 Fused multiply tuning (LLVM-1610)
  ▪ Improve Global Value Numbering (LLVM-1612)
▪ Re-engineering of ARM Neon intrinsic support
▪ Big Endian Support - AArch32 & initial AArch64 support
▪ Stack size reduction patches – some work still to do.
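
As a concrete instance of the 16-bit integer to 64-bit float item (LLVM-1508), the source pattern involved is a load of an `int16_t` from memory followed by a conversion to `double`. The sketch below shows that pattern in a loop; `samples_to_double` is a hypothetical name, and the slide does not say which instruction sequence LLVM selected before or after the change.

```c
#include <stddef.h>
#include <stdint.h>

/* Load each 16-bit sample from memory and widen it to a 64-bit
   double. The load, sign-extend, and int-to-float steps are what the
   code generator has to lower efficiently. */
void samples_to_double(const int16_t *in, double *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (double)in[i];
}
```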

Page 37: LCU14 303- Toolchain Collaboration

What’s Next: LLVM – Performance

▪ Inline parameter tuning (LLVM-1500)
▪ Improve spilling heuristics (LLVM-1524, LLVM-1504, LLVM-1586)
▪ Common expression hoisting (LLVM-1247, LLVM-1490, LLVM-1550)
▪ TBNZ and CBNZ optimization (LLVM-1575)
▪ Register coalesce and rematerialization (LLVM-1582)
▪ Redundant common comparison expressions (LLVM-1491)
▪ Loop induction variable selection (LLVM-1492)
▪ Remove redundant stores
▪ Improved usage of vectorization opportunities using structs (LLVM-1501)
▪ Reduce xzr assignment on cbz target (LLVM-1583)

Page 38: LCU14 303- Toolchain Collaboration

What’s Next: LLVM – Performance

▪ Global variable store should be hoisted (LLVM-1493)
▪ Too many MOVs on function call boundaries (LLVM-1504)
▪ Optimise LDR, LDRSW sequence into LDR, SXTW (LLVM-1581)
▪ Tune loop unrolling (LLVM-1587, LLVM-1590, LLVM-1646)

Page 39: LCU14 303- Toolchain Collaboration

What’s Next: LLVM - Other

▪ Buildbots & benchmarking infrastructure
  ▪ Plan to set up a public performance tracking bot on Juno-A57
  ▪ To be publicly visible, maintained, and continuously producing performance numbers
  ▪ Running the LLVM LNT test-suite as a benchmark
▪ Various bug fixes and improvements
  ▪ Focus on ARMv8-A, ARMv7, and ARMv6-M.
▪ Support for selected ACLE (non Neon) intrinsics

Page 40: LCU14 303- Toolchain Collaboration

ST - LLVM Toolchain Roadmap

Page 41: LCU14 303- Toolchain Collaboration

ST - LLVM Toolchain Details

Page 42: LCU14 303- Toolchain Collaboration

System Libraries, Tools, Debuggers Collaboration

Page 43: LCU14 303- Toolchain Collaboration

Linaro - What’s next for system libs & dev tools?

● System Libraries
  ○ malloc benchmarking
  ○ malloc improvements
  ○ string and memory function optimizations for arm-linux-gnueabihf
  ○ Linaro GDB and glibc source package releases with backported optimizations
● GDB
  ○ Finish GDB on Android for ARMv8 support - catchpoints
  ○ AArch32/AArch64 completeness - test-suite parity
  ○ AArch32 mixed-mode debugging (thumb and arm modes)
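
To give a sense of what malloc benchmarking involves at its simplest, the sketch below times a run of fixed-size `malloc`/`free` pairs with the standard `clock()` function. It is a hypothetical micro-benchmark written for this note (the name `time_malloc_free` and its parameters are assumptions), not the benchmark Linaro planned; a realistic one would also vary sizes, lifetimes, and thread counts.

```c
#include <stdlib.h>
#include <time.h>

/* Time `iterations` malloc/free pairs of `size` bytes.
   Returns elapsed CPU seconds, or -1.0 if an allocation or the
   clock fails. */
double time_malloc_free(size_t iterations, size_t size)
{
    clock_t start = clock();
    if (start == (clock_t)-1)
        return -1.0;

    for (size_t i = 0; i < iterations; i++) {
        /* volatile keeps the compiler from optimizing the pair away */
        void *volatile p = malloc(size);
        if (p == NULL)
            return -1.0;
        free(p);
    }

    clock_t end = clock();
    if (end == (clock_t)-1)
        return -1.0;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A call such as `time_malloc_free(1000000, 64)` measures one allocation size; pass a nonzero size, since `malloc(0)` may legitimately return a null pointer.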

Page 44: LCU14 303- Toolchain Collaboration

What’s Next: Glibc – up to 2.21 release (end of 2014)

▪ String routine improvements
▪ Maintenance activities.
▪ Help community get 2.21 released.

Page 45: LCU14 303- Toolchain Collaboration

What We Are Not Currently Doing But Are Interested In…

▪ Linkers: LLD & Gold
▪ Libc++
▪ Sanitizers
▪ ILP32

Page 46: LCU14 303- Toolchain Collaboration

QuIC - Libraries, Linkers, Debuggers, Tools

Page 47: LCU14 303- Toolchain Collaboration

ST - Libraries, Linkers, Debuggers, Tools

Page 48: LCU14 303- Toolchain Collaboration

More about Linaro Connect: connect.linaro.org
Linaro members: www.linaro.org/members
More about Linaro: www.linaro.org/about/