
Artifact Evaluation Experience

CGO and PPoPP 2015

Bruce Childers, University of Pittsburgh, USA

Grigori Fursin, cTuning foundation, France


Outline

• What is Artifact Evaluation (AE)?

• Joint AE process for CGO’15 and PPoPP’15

• Two Prizes for highest-ranked artifacts from CGO and PPoPP

• Challenges

• Suggestions for future AE

Sponsors


Some issues

[Diagram: a published article depends on much related material needed to reproduce it: tools, scripts, hardware, simulators, benchmarks, data sets, libraries, OS, compilers, VMs, experimental results, databases.]


Some issues

Rising number of articles: where is the related material?

Why bother?

• Time consuming (often seen as a waste of time)
• Not needed for promotion
• Limited life span (MS/PhD/project)
• Can cause competition



Some issues

Rising number of articles:

• Difficult or even impossible to reproduce results from publications

• Demotivating to redevelop past techniques

• Little trust from industry

• Computer engineering is often seen as hacking, which makes it difficult to attract students

Possible solution:

• Make it sexy to share code and data (at least to reproduce results)

• Engage with the community

• Governmental funding agencies’ data mandates




What is Artifact Evaluation (AE)?

Authors of accepted articles have the option to submit related material to be evaluated by an AE committee.

PC members nominate one or two senior students/engineers for the AE committee.

Submission:
• Abstract
• Packed artifact (or remote access)
• ReadMe (how to validate results)

~2 weeks for evaluation, at least 2 reviews per artifact, 4 days for rebuttal. Each review covers:
• Summary and contributions of the paper
• Artifact packaging and reproducibility
• Artifact implementation and usability
• Overall assessment
• On what platform/how the artifact was evaluated

Ranking:
1. Significantly exceeded expectations
2. Exceeded expectations
3. Met expectations
4. Fell below expectations
5. Significantly fell below expectations
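For illustration, a minimal sketch (in Python; the script name, result file, and numbers are all invented) of the kind of validation helper a ReadMe might point reviewers to: it reruns the artifact's experiment and checks the paper's headline result within a tolerance.

```python
#!/usr/bin/env python3
"""Hypothetical ReadMe companion: rerun the experiment and check the
paper's headline number. All file names and values are invented."""
import json
import subprocess
import sys

EXPECTED_SPEEDUP = 2.4   # hypothetical number reported in the paper
TOLERANCE = 0.10         # accept +/-10% variation across machines

def main():
    # Rerun the artifact's experiment; assume it writes results.json.
    subprocess.run(["./run_experiment.sh"], check=True)
    with open("results.json") as f:
        measured = json.load(f)["speedup"]
    rel_err = abs(measured - EXPECTED_SPEEDUP) / EXPECTED_SPEEDUP
    print(f"measured speedup: {measured:.2f} (paper: {EXPECTED_SPEEDUP})")
    if rel_err > TOLERANCE:
        print("MISMATCH: result differs from the paper beyond tolerance")
        sys.exit(1)
    print("OK: result matches the paper within tolerance")

if __name__ == "__main__":
    main()
```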


Joint AE process for CGO’15 and PPoPP’15

CGO/PPoPP’15 organizers: Aaron Smith, Kunle Olukotun, Robert Hundt, Jason Mars, Chris Fensch, Albert Cohen, David Grove, Calin Cascaval

Acknowledgments:

Reviewers: David Boehme, Santiago Bock, Lingda Li, Lin Ma, Yiannis Nikolakopulos, Jeeva Paudel, Paul Thomson, Peter Libic, Dave Wilkinson, Weiwei Chen, Riyadh Baghdadi, Na Meng, Arun Raman, Bapi Chatterjee, Martin Maas, Vojtech Horky, Vasileios Trigonakis, Mahdi Eslamimehr, Yuhao Zhu, Melanie Kambadur, Michael Laurenzano

Related AE: Shriram Krishnamurthi

Authors: 8 submitted artifacts for CGO and 10 for PPoPP


Accepted artifacts

CGO’15 (cTuning.org/event/ae-cgo2015):

• Locality-Centric Thread Scheduling for Bulk-synchronous Programming Models on CPU Architectures
  Hee-Seok Kim, Izzat El Hajj, John Stratton, Steven Lumetta and Wen-Mei Hwu
• MemorySanitizer: fast detector of uninitialized memory use in C++
  Evgeniy Stepanov and Konstantin Serebryany
• A Parallel Abstract Interpreter for JavaScript
  Kyle Dewey, Vineeth Kashyap and Ben Hardekopf
• A Graph-Based Higher-Order Intermediate Representation
  Roland Leißa, Marcel Köster and Sebastian Hack
• Optimizing the flash-RAM energy trade-off in deeply embedded systems
  James Pallister, Kerstin Eder and Simon J. Hollis
• Scalable Conditional Induction Variable (CIV) Analysis
  Cosmin E. Oancea and Lawrence Rauchwerger

PPoPP’15 (cTuning.org/event/ae-cgo2015):

• NUMA-aware Graph-structured Analytics
  Kaiyuan Zhang, Rong Chen and Haibo Chen
• Predicate RCU: An RCU for Scalable Concurrent Updates
  Maya Arbel and Adam Morrison
• Scalable and Efficient Implementation of 3D Unstructured Meshes Computation: A Case Study on Matrix Assembly
  Loïc Thébault, Eric Petit and Quang Dinh
• VirtCL: A Framework for OpenCL Device Abstraction and Management
  Yi-Ping You, Hen-Jung Wu, Yeh-Ning Tsai and Yen-Ting Chao
• Dynamic deadlock verification for general barrier synchronisation
  Tiago Cogumbreiro, Raymond Hu, Francisco Martins and Nobuko Yoshida
• Low-Overhead Software Transactional Memory with Progress Guarantees and Strong Semantics
  Minjia Zhang, Jipeng Huang, Man Cao and Michael Bond
• The SprayList: A Scalable Relaxed Priority Queue
  Justin Kopinsky, Dan Alistarh, Jerry Li and Nir Shavit
• Performance Implications of Dynamic Memory Allocators on Transactional Memory Systems
  Alexandro Baldassin, Edson Borin and Guido Araujo
• More than You Ever Wanted to Know about Synchronization
  Vincent Gramoli
• Cache-Oblivious Wavefront: Improving Parallelism of Recursive DP Algorithms without Losing Cache-efficiency
  Yuan Tang, Ronghui You, Haibin Kan, Jesmin Tithi, Pramod Ganapathi and Rezaul Chowdhury



Highest-ranked artifacts from CGO and PPoPP

1st place: NVIDIA Quadro K6000 (will be shipped directly)
“The SprayList: A scalable relaxed priority queue”
Justin Kopinsky, Dan Alistarh, Jerry Li and Nir Shavit

2nd place: Acer C720P
“A graph-based higher-order intermediate representation”
Roland Leißa, Marcel Köster and Sebastian Hack


Challenges

We need your feedback to improve AE!

• Should we replicate or reproduce results?

• Should we allow reviewers to communicate with authors (while keeping them anonymous)?

• Can we slightly change experimental setups?

• What if new results invalidate paper claims?

• Do we need to be able to reinstall tools from scratch?

• Is the artifact consistent with the paper? Is it well documented? Easy to use?

Need to provide better guidelines for authors and reviewers!


Challenges

Different SW/HW

[Word cloud of the software, hardware and experimental dimensions seen in artifacts: GCC 4.1.x/4.3.x/4.4.x/4.5.x/4.6.x/4.9.x/5.x, ICC 11.0/11.1/12.0, LLVM 2.6/2.8/3.x, MVS 2013, XLC, Open64, Jikes, HMPP, CUDA 5.x, OpenMP, MPI, OpenCL, perf, PAPI, TAU, Scalasca, ATLAS, MKL, SimpleScalar, ARM v8, SSE4, LTO, function-level hardware counters, pass reordering, polyhedral transformations, genetic algorithms, KNN, predictive scheduling, frequency, bandwidth, memory size, cache size, threads, execution time, algorithm precision.]

Artifact packaging across the 18 submissions:
• 6 VirtualBox images (2 × 2 GB, 1 × 20 GB); please do not include unrelated SW such as OpenOffice, GNOME, …
• 2 VMware images (proprietary)
• 2 CDE packages
• 1 Docker image
• 3 with access to a remote machine with preinstalled software
• 4 compressed tarballs

VMs are not good for performance evaluation!

We should have a large pool of qualified reviewers: they should be able to install tools and know some basic script debugging.


Challenges

• Accessing proprietary/paid/large benchmarks (SPEC2006, EEMBC, etc.). Authors should include some benchmarks/data sets to test their code; if those are proprietary, they should provide a couple of synthetic or public ones instead (see the sketch after this list).

• Installing proprietary/paid/large tools such as Intel compilers and performance analysis tools

• Reinstalling large software tools with many dependencies

• Accessing non-public tools (such as large, unreleased academic compilers), say just to validate one pass

• Getting access to very rare and/or powerful hardware (e.g., clusters, supercomputers, or hardware with specific counters such as for measuring consumed energy)

• Getting anonymous access to the authors’ machines

• Requiring sole and long access to the authors’ busy machines (say, for performance or energy tuning)
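The first bullet suggests shipping substitutes for proprietary benchmarks; as a purely illustrative sketch (in Python; the file format, sizes, and names are invented), a synthetic input generator of the kind authors could include so reviewers can exercise the code without SPEC or EEMBC licenses:

```python
#!/usr/bin/env python3
"""Hypothetical generator for a synthetic stand-in data set, shipped so
reviewers can test the artifact without proprietary benchmarks."""
import random

def generate_input(path: str, n_rows: int = 100_000, seed: int = 42) -> None:
    """Write n_rows of reproducible random records in the (invented)
    whitespace-separated format the artifact's tool expects."""
    rng = random.Random(seed)  # fixed seed: identical input on every machine
    with open(path, "w") as f:
        for i in range(n_rows):
            f.write(f"{i} {rng.randint(0, 1_000_000)} {rng.random():.6f}\n")

if __name__ == "__main__":
    generate_input("synthetic_input.txt")
    print("wrote synthetic_input.txt (stand-in for the proprietary data set)")
```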


Some ideas

• Arrange an AE server with the most commonly used software pre-installed and with access to some hardware:
  • FPGAs
  • Microcontrollers
  • ARM/Qualcomm/Intel development boards

• Arrange access to various distributed machines at the authors’ sites with pre-installed tools

• Arrange access to the most commonly used clusters (registration will be done by the AE chairs to preserve the anonymity of the reviewers):
  • XSEDE, PRACE, GRID5000, CINES, opensciencegrid.org

• Build a pool of good artifact evaluators: we need at least 3 reviews per artifact (2 is not enough)

• Develop common experimental infrastructure (workflows, meta-information)? See the sketch below.

http://github.com/ctuning/ck http://cknowledge.org/repo

Discussions with ACM about formalization / meta-description / stamp.
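To make the meta-information idea concrete, a hypothetical sketch (in Python) of the kind of machine-readable artifact description a common workflow format could carry. The schema is invented for illustration; it is not the actual Collective Knowledge (CK) format.

```python
#!/usr/bin/env python3
"""Invented artifact meta-description schema: sketches the machine-readable
info a common AE workflow could carry (NOT the actual CK format)."""
import json

artifact_meta = {
    "paper": "Example Paper Title",          # illustrative placeholder
    "artifact_version": "1.0",
    "dependencies": {                        # what evaluators must provide
        "compilers": ["llvm>=3.5"],
        "hardware": {"arch": "x86_64", "min_ram_gb": 8},
    },
    "workflow": {                            # how to rebuild and rerun
        "build": "make all",
        "run": "./run_experiment.sh",
        "validate": "python3 check_results.py",
    },
    "expected_results": {"speedup": 2.4, "tolerance": 0.1},
}

if __name__ == "__main__":
    # Emit the description so both humans and tools can consume it.
    print(json.dumps(artifact_meta, indent=2))
```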


Keep in touch

AE for CGO/PPoPP:

• Grigori Fursin, [email protected]

• Bruce Childers, [email protected]

AE for PLDI/OOPSLA:

• Shriram Krishnamurthi

• Jan Vitek

• Eric Eide

Our projects:

• http://cknowledge.org/reproducibility

• http://www.occamportal.org

• http://github.com/ctuning/ck

Sponsors are welcome!