Home Contact CV Research / Teaching Hobbies Social
Current Research Teaching Software Publications Links
[Back to Research topics]


Iterative optimizations (iterative compilation) for different constraints (performance, code size, power consumption)

Tackling the complexity of future computing systems using machine learning:


Description     Collaborations     Software     Our related publications     Links

Description:

 

I am moving all my research and software development activities to the UNIDAPT Group. This page will give you some information about my previous activities. Also, my personal research page can give you some more information about my current activities.


    

Current innovations in science and industry demand ever-increasing computing resources while placing strict requirements on system performance, power consumption, size, response, reliability, portability and design time. Both embedded and large-scale systems tend to evolve toward complex heterogeneous multi-chip systems with dramatically increased design, test and optimization time. Optimizing compilers play a key role in producing executable codes quickly and automatically while satisfying all the above requirements for a broad range of programs and architectures. However, for many years, state-of-the-art compilers fail to deliver satisfactory levels of performance on new systems due to necessarily simplistic hardware models, fixed and black-box optimization heuristics, inability to tune application at fine-grain level, highly dynamic behavior of the system and inability to adapt to varying program and system behavior at run time with low overhead. This suggests that current system design and program optimization technologies are reaching their limits and should be revisited to keep pace with rapidly evolving hardware.

During my Ph.D. at the University of Edinburgh (1999-2004), I had been working with Prof. Michael O'Boyle to introduce iterative compilation at a fine-grain level (function, loop or instruction) to automatically find best optimization settings for large applications (rather than kernels) on rapidly evolving architectures that beat state-of-the art optimizing compilers. I had also been working with Prof. Olivier Temam to develop a fast and accurate technique to determine lower bound of the execution time of memory intensive applications by replacing all array accesses with scalars to have a stopping criterion for iterative compilation. Later, we developed a new concept to combine static and dynamic optimizations by creating self-tuning static binaries based on static code multiversioning and run-time hardware counters monitoring routines to learn run-time program behavior continuously, detect changes due to different program phases, inputs, environments (OS, architecture, virtual layers) and to dynamically react to those changes by selecting appropriately optimized code versions to improve performance, power, fault-tolerance, etc. We used this method to speed up iterative compilation by 3 orders of magnitude using a run-time low-overhead program phase detection scheme and function versioning. At the same time, I started developing an Interactive Compilation Interface (ICI) for Open64/PathScale compilers and GCC to enable continuous run-time adaptation and optimization of statically compiled programs at fine-grain level. We also developed a novel technique to characterize programs or architectures using program reaction to optimizations (transformations). I had been collaborating with my colleagues from the University of Edinburgh, UK to introduce this technique as well as statistical search and machine learning to enable optimization knowledge reuse among different programs and architectures using static and dynamic program and architecture features.

I currently work as a tenured research scientist at INRIA Saclay, France and use my research results in several EU-funded projects (HiPEAC, MilePost, SARC, GGCC and others) to move towards my long-term goal to develop and generalize automatic continuous program optimization and parallelization techniques and architecture design exploration using innovative search methods, adaptation, machine learning and knowledge reuse. This should enable realistic intelligent self-tuning systems, particularly in the presence of rapidly evolving multi-core heterogeneous architectures. I am developing publicly available software tools for GCC and Open64 compilers. I collaborate with my colleagues from the University of Edinburgh, INRIA, Ghent University, ICT, IBM, ARC, CAPS Enterprise, NXP, STMicro and others, and open to new contacts, collaborations and proposals to realize these goals.

I founded UNIDAPT Group to pioneer and promote radically new practical techniques to overcome the complexity of current and future architectures, compilers, operating systems and programming environments. We are working on creating self-tuning intelligent adaptive systems, automating and improving architecture and compiler design (particularly for future heterogeneous multi-core systems), optimizing and parallelizing applications to improve performance, power consumption, size and fault-tolerance, reduce cost and time to market based on machine learning, artificial intelligence, statistical methods and biologically inspired techniques. This is critical to be able to continue innovation in science and industry (bioinformatics, medicine, physics, chemistry, finances, gaming, etc) that demands ever-increasing computing resources while placing strict requirements on systems.

Together with the MILEPOST colleagues we are currently developing MILEPOST GCC - the first machine learning based self-tuning research compiler described more in our GCC Summit'08 paper (many thanks to Cupertino Miranda for extending Interactive Compilation Interface, Edwin Bonilla for extending machine learning techniques, Mircea Namolaru for integrating static program feature extractor to MILEPOST GCC and John Thomson for performance evaluation). We plan to release MILEPOST GCC v1.0 by the end of 2008. In the mean time, you can register your interest here to receive information about (pre)releases of the MILEPOST GCC. You can also subscribe for UNIDAPT announcements or contact me if you would like to use it for your research and collaborate on some projects.

Tuning hardwired compiler optimizations for rapidly evolving hardware makes porting an optimizing compiler for each new platform extremely challenging. Our radical approach is to develop a modular, extensible, self-tuning intelligent compiler that automatically learns the best optimization heuristics based on combining feedback-directed iterative compilation and machine learning. MILEPOST GCC is a machine learning based compiler that automatically adjusts its optimization heuristics to improve execution time, code size, or compilation time of specific programs on different architectures. Currently, we use several iterative search strategies within CCC framework to find combinations of good optimization flags to substitute GCC default optimization levels for a particular architecture (such as -O0,-O1,-O2,-O3,-Os which we will not need in the future adaptive compilers) or tune optimization passes on a function-level for a particular program. Our preliminary experimental results show that it is possible to considerably reduce execution time of various benchmarks (MiBench, MediaBench, EEMBC, SPEC) on a range of platforms (x86, x8664, IA64, ARC, Longson/Godson, etc) entirely automatically. MILEPOST GCC can be used interactively in research on adaptive computing through the Interactive Compilation Interface (we currently support optimization pass selection and reordering, plugins and event mechanism to invoke any passes on demand (such as program feature extraction pass) or tune cost-models of specific transformations within passes).

Software: Collaborations:


More Information about my software developements



 

Our related publications:


  • Grigori Fursin and Olivier Temam. Collective optimization. To appear at the International Conference on High Performance Embedded Architectures & Compilers (HiPEAC 2009), Paphos, Cyprus, January 2009

  • Grigori Fursin, Cupertino Miranda, Olivier Temam, Mircea Namolaru, Elad Yom-Tov, Ayal Zaks, Bilha Mendelson, Phil Barnard, Elton Ashton, Eric Courtois, Francois Bodin, Edwin Bonilla, John Thomson, Hugh Leather, Chris Williams, Michael O'Boyle. MILEPOST GCC: machine learning based research compiler. Proceedings of the GCC Developers' Summit, Ottawa, Canada, June 2008

    [bib]     [pdf]
  • Veerle Desmet, Grigori Fursin, Sylvain Girbal and Olivier Temam. Leveraging Modular Simulation for Systematic Design Space Exploration. 4th HiPEAC Industrial Workshop on Compilers and Architectures organized by ARM Ltd., Cambridge, UK, November 2007

    [bib]
  • Grigori Fursin, Cupertino Miranda, Sebastian Pop, Albert Cohen and Olivier Temam. Practical Run-time Adaptation with Procedure Cloning to Enable Continuous Collective Compilation. Proceedings of the GCC Developers’ Summit, Ottawa, Canada, July 2007

    [bib]     [pdf]
  • Grigori Fursin and Albert Cohen. Building a Practical Iterative Interactive Compiler. 1st International Workshop on Statistical and Machine Learning Approaches Applied to Architectures and Compilation (SMART'07), Ghent, Belgium, January 2007

[bib]    [pdf]

  • John Cavazos, Grigori Fursin, Felix Agakov, Edwin Bonilla, Michael F.P.O’Boyle and Olivier Temam. Rapidly Selecting Good Compiler Optimizations using Performance Counters. Proceedings of the 5th  Annual International Symposium on Code Generation and Optimization (CGO), San Jose, USA, March 2007

    [bib]    [pdf]
  • Grigori Fursin, John Cavazos, Michael O’Boyle and Olivier Temam. MiDataSets: Creating The Conditions For A More Realistic Evaluation of Iterative Optimization. Proceedings of the International Conference on High Performance Embedded Architectures & Compilers (HiPEAC 2007), Ghent, Belgium, January 2007

[bib]    [pdf]

  • Grigori Fursin, Albert Cohen, Michael O'Boyle and Oliver Temam. Quick and practical run-time evaluation of multiple program optimizations. Transactions on High-Performance Embedded Architectures and Compilers, 1(1), pages 13-31, 2006

[bib]    [pdf]   [pdf  backup]

  • Shun Long and Grigori Fursin. Systematic search within an optimisation space based on Unified Transformation Framework. Accepted for publication in the special issue of the International Journal of Computational Science and Engineering (IJCSE)

[bib]    [pdf]

  • F. Agakov, E. Bonilla, J. Cavazos, B. Franke, G. Fursin, M.F.P. O'Boyle, J. Thomson, M. Toussaint and C.K.I. Williams. Using Machine Learning to Focus Iterative Optimization. Proceedings of the 4th  Annual International Symposium on Code Generation and Optimization (CGO), New York, NY, USA, March 2006
    (best presentation award)
[bib]    [pdf]
  • Grigori Fursin, Albert Cohen, Michael O'Boyle and Oliver Temam. A Practical Method For Quickly Evaluating Program Optimizations. Proceedings of the 1st International Conference on High Performance Embedded Architectures & Compilers (HiPEAC 2005), number 3793 in LNCS, pages 29-46, Barcelona, Spain, November 2005
    (highest ranked paper, acceptance rate=18%)

[bib]    [pdf]   [pdf  backup]

  • B. Franke, M. O'Boyle, J. Thomson and G. Fursin. Probabilistic Source-Level Optimisation of Embedded Systems Software. Proceedings of the Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES’05), pages 78-86, Chicago, IL, USA, June 2005
[bib]    [pdf]
  • Shun Long and Grigori Fursin. A heuristic search algorithm based on Unified Transformation Framework. Proceedings of the 7th International Workshop on High Performance Scientific and Engineering Computing (HPSEC-05), pages 137-144, Oslo, Norway, June 2005

[bib]    [pdf]

  • Grigori Fursin. Iterative Compilation and Performance Prediction for Numerical Applications. Ph.D. thesis, University of Edinburgh, Edinburgh, UK, January 2004
[bib]    [pdf]   [pdf  backup]
  • G.G.Fursin, M.F.P.O’Boyle, and P.M.W. Knijnenburg. Evaluating Iterative Compilation. Proceedings of the 15th Workshop on Languages and Compilers for Parallel Computing (LCPC’02), College Park, MD, USA, pages 305-315, 2002
[bib]    [pdf]
  • Abella, J., S. A. Ali Touati, A. Anderson, C. Ciuraneta, J. M. Codina, Min Dai,    C. Eisenbeis, G. Fursin, A. Gonzalez, J. Llosa, M. O'Boyle, A. Randrianatoavina,   J. Sanchez, O. Temam, X. Vera, and G. Watts. MHAOTEU Tools for Memory Hierarchy Management. IMACS’2000, 16th IMACS World Congress on Scientific Computation, Applied Mathematics and Simulation, Lausanne, Switzerland, August 2000

[bib]    [pdf]


All my publications      My bibliography collection       My links to on-line journals and conference


Links:

  • PathOpt and PathOpt2 (tool for automated application tuning) - free tool within commercial open-source compiler from QLogic (PathScale) to iteratively search for the best compiler flags. We switched to this tool in 2003 and currently use it for our research since it is easily configurable, works with any languages, does not require project modifications, has many basic search strategies (random, one by one, all but one), easily extendable. We easily coupled it with the PAPI library to obtain hardware counters and retargeted it to work with GCC and Intel Compilers
  • ESTO ( Expert System for Tuning Applications)- iterative optimization tool from IBM to find best compiler flags with genetic algorithms
  • Acovea (Analysis of Compiler Options via Evolutionary Algorithm) - iterative optimization tool to find best compiler flags for C and C++ programs with genetic algorithms
  • PAPI library - performance application programming interface
  • OProfile - a system profiler for Linux
  • Papiex - command line PAPI measurement tool

If I forgot some group, please don't hesitate to contact me.



Content & Design (C)opyright 1999-2008 by Grigori Fursin
(unless otherwise stated)