Grigori Fursin, PhD

My CV.pdf    My Collective Knowledge playground    Reproducibility initiatives    GitHub    Vision    LinkedIn    Google Scholar    Medium    Email

I am a British national living permanently in the Greater Paris area. As Head of the R&D Lab at FlexAI, I use AI to co-design more efficient and cost-effective AI systems. I am also a Founder and Architect of cKnowledge.org and cTuning.org; former Founder and Architect of a collaborative AI benchmarking and optimization platform acquired by OctoAI (now Nvidia); former VP of MLOps at OctoAI; former Co-Director of the Intel Exascale Lab; former Senior Tenured Scientist at INRIA; and former Adjunct Professor at the University of Paris-Saclay. I hold a PhD in Self-Optimizing Compilers and Systems from the University of Edinburgh.


Brief biography:

I am a forward-thinking and agile computer scientist, inventor, full AI/ML stack engineer, strategic advisor to startups, investors, and executive teams, educator, mentor, long-time open-source contributor, and passionate advocate for open science. I hold a PhD in self-optimizing compilers and systems from the University of Edinburgh. My interdisciplinary background spans computer engineering (with expertise in co-designing the full hardware-software stack from the cloud to the edge), machine learning, AI systems, data analytics, workflow automation, and knowledge management, as well as physics and electronics. I am passionate about building innovative solutions to real-world problems and about unifying and automating R&D processes to enhance efficiency and reduce costs.

Fascinated by the prospects of AI and robotics, I began my R&D career in the mid-1990s as an undergraduate, taking on a technical leadership role to develop Hopfield-based analog semiconductor neural networks from scratch. This included complete development and automation of software, hardware, models, and datasets for training, inference, electronic simulation, and prototyping—since nothing existed at the time.

This project took much longer than I originally expected and revealed numerous issues and inefficiencies in the R&D methodologies and tools used in computer engineering. As a result, I decided to switch to computer science and pursue PhD research to address these challenges. This interdisciplinary foundation and experience enabled me to pioneer and champion visionary uses of machine learning, AI, crowd-tuning, and crowd-learning to co-design more efficient, cost-effective, and scalable computer systems, including compilers, runtimes, software, and hardware, during my PhD at the University of Edinburgh and my postdoctoral research at INRIA.

I initiated and led R&D efforts that addressed the growing complexity of modern systems and served as a precursor to self-optimizing and agentic systems, AutoML, workflow automation, agent-based optimization, federated learning, reproducible experimentation, and universal, efficient, technology-agnostic compute. This work also enabled me to initiate and support open science and reproducibility initiatives starting in 2008, when I launched cTuning.org (followed in 2014 by cKnowledge.org with my Collective Knowledge technology, aka CK) and released all my research code, data, models, and experiments for our ML-based self-optimizing compiler, considered the first of its kind (ACM TechTalk'21). I was honored to receive the ACM CGO Test of Time Award, multiple Best Paper Awards, the INRIA Award for Scientific Excellence, and the EU HiPEAC Technology Transfer Award for this research and these open-source tools.

After serving as a senior tenured research scientist at INRIA, an adjunct professor at the University of Paris-Saclay, and co-director of the Intel Exascale Lab, I transitioned my research and open-source tools into industry. I first established the non-profit cTuning foundation and co-founded a successful engineering company to automatically benchmark and optimize deep learning across diverse software and hardware stacks, with a focus on mobile phones and edge devices. I helped bootstrap it as CTO and Chief Architect, quickly growing it to $1M+ in revenue with just four people, thanks to my CK automation technology. I then joined Entrepreneur First, a highly selective company-building program for scientists and technologists, where I learned to build lean startups and avoid common pitfalls. As a result, I founded and bootstrapped two startups in the fields of performance optimization, MLOps automation, and knowledge management; the second of these was acquired by OctoAI (now part of NVIDIA).

During that time, I invented the Collective Mind automation language (CM/CMX), which was adopted by MLCommons—a consortium of over 100 AI and systems companies—to test and benchmark a wide range of AI models and datasets across diverse hardware and software platforms, from cloud to edge. My CM technology has automated thousands of MLPerf submissions and enabled the discovery of some of the most performance- and cost-efficient AI solutions using commodity servers, outperforming high-end systems from Nvidia. I am now developing the next generation of this automation.

At the same time, I remained actively involved in community service and open-source initiatives. I helped establish MLCommons and launch reproducibility efforts at ACM and IEEE conferences (cTuning.org/ae). I also introduced a unified artifact appendix, which has since been adopted by major conferences such as ASPLOS, CGO, PPoPP, Supercomputing, and MICRO. Finally, I co-organized several successful quantum hackathons, including one at Ecole 42 in Paris, where we used my CK workflow automation and platform for collaborative benchmarking and optimization of quantum workloads (Hackathon page and a list of my events).

Throughout my career, I’ve been honored to collaborate with and learn from brilliant minds across leading universities, non-profits, startups, and companies — including Google, Amazon, Meta, Arm, AMD, Intel, IBM, Qualcomm, NVIDIA, Raspberry Pi, OpenAI, Tesla, OctoAI, Neural Magic, Red Hat, Dell, HPE, Lenovo, Apple, INRIA, ACM, IEEE, HiPEAC, MLCommons, and the Linux Foundation: Acknowledgments (1), Acknowledgments (2), and Acknowledgments (3).

My passion lies in applying my knowledge, experience, and tools to accelerate the journey from deep tech research to real-world production—while building intelligent, self-optimizing systems. I regularly support startups, enterprises, universities, non-profits, researchers, students, and investors in rapidly prototyping novel ideas, launching innovative deep-tech projects, reducing time to market, and delivering tangible impact through collaborative, reproducible, interdisciplinary, quantifiable, and automated R&D methodologies.

While I actively prototype full-stack projects and contribute hands-on, I bring the most value in roles such as strategic advisor, technical manager, R&D lab director, educator, research scientist, or senior individual contributor. I focus on bridging research, engineering, and product teams—helping them navigate complex, rapidly evolving technology landscapes, manage project complexity, avoid common pitfalls, and achieve meaningful outcomes efficiently, even with very limited resources and time.

In 2024, I began prototyping the next generation of my Collective Knowledge platform, aimed at taming the complexity of AI systems. My goal is to develop a universal compute engine that makes it simple to run models on any hardware with any software in the most efficient and cost-effective way, significantly reducing the associated costs. This initiative builds on my existing work, including Collective Mind, virtualized MLOps, MLPerf, the Collective Knowledge Playground, and reproducible optimization tournaments. For more details, see my white paper, and feel free to reach out if you would like to learn more. I also joined FlexAI as Head of their R&D Lab, where I am developing FlexBench to track state-of-the-art models such as DeepSeek and to benchmark and optimize them across diverse software and hardware stacks. This work is based on the MLPerf methodology and my MLCommons CM workflow automation framework.

In my spare time, I enjoy spending time with my two children, reading, learning new skills, playing soccer (having competed semi-professionally), hiking, traveling, teaching, developing agentic automations and platforms for collaborative and reproducible R&D, and brainstorming future projects.

My key open-source software developments:

My key presentations and publications to help you gain insight into my projects and long-term vision:


Brief summary of my current activities:
  • Founder, President, and Chief Scientist of cTuning.org, a non-profit educational organization and founding member of MLCommons that has been developing open-source tools and methodologies to support reproducibility initiatives, artifact evaluation, and open science in collaboration with ACM, IEEE, and MLCommons since 2008. Please see the Artifact Evaluation page for more details.
  • Founder and Architect of the Collective Knowledge Playground, an educational platform for learning how to co-design software and hardware to run AI, ML, and other emerging workloads efficiently and cost-effectively across diverse models, datasets, software, and hardware (trading off performance, power consumption, accuracy, cost, and other characteristics). The CK playground leverages the MLCommons CMX workflow automation framework with virtual MLOps, developed in collaboration with MLCommons, cTuning.org, and other organizations. Please see the arXiv white paper and an online catalog of reusable and virtual automation recipes for MLOps and DevOps.
  • Head of the R&D Lab at FlexAI, coordinating efforts to leverage AI for co-designing more efficient and cost-effective AI systems.
    Core technologies used: HuggingFace models and datasets, vLLM, PyTorch, Triton, TensorRT, Nsight, MLPerf, OpenSearch, MLCommons CMX, FastAPI, Docker, Bayesian search, reinforcement learning and LLMs, Nvidia and AMD GPUs.
  • Organizer of reproducibility initiatives and artifact evaluation for AI, ML, and systems conferences and MLPerf benchmarks in collaboration with ACM, IEEE, and MLCommons since 2013. I am leading the development of a common interface and automation language to make it easier to rerun and reuse code, data, and experiments from published papers; see my ACM TechTalk'21, ACM REP'23 keynote, and white paper'24 for more details.
  • Member of the Program Committee of the ACM Conference on Reproducibility and Replicability 2025.
Brief summary of my past activities:
  • founder and co-chair of the MLCommons Task Force on Automation and Reproducibility to modularize and automate MLPerf benchmarks using my CM framework (white paper);
  • author and tech lead of the Collective Mind workflow automation framework (CM), adopted by MLCommons and the Autonomous Vehicle Computing Consortium (AVCC) to modularize MLPerf benchmarks and make it easier to run them across diverse models, datasets, software, and hardware from different vendors using portable, reusable, and technology-agnostic automation recipes (see the online catalog of MLOps and MLPerf scripts and the online docs to run MLPerf inference benchmarks; a minimal usage sketch appears after this list). I donated this open-source technology to MLCommons to benefit everyone and continue developing it as a community effort. You can learn more about this project in this white paper. In 2025, we split CM development into an extended version of CM (CMX) and a simplified version of CM for MLPerf. I thank our great contributors for their feedback and support.
  • vice president of MLOps at OctoML, where I prototyped the first version of CM and CM4MLOps together with the cTuning foundation before donating them to MLCommons to benefit everyone;
  • founder and chief architect of the virtual MLOps platform (cKnowledge.io) acquired by OctoML (now Nvidia);
  • author of the Collective Knowledge technology (CK) powering cKnowledge.io;
  • author of the Artifact Evaluation and Reproducibility checklist (Unified Artifact Appendix) for ACM/IEEE conferences (see an example of my artifact appendix at the end of this ASPLOS'24 paper "PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation");
  • co-founder of the CodeReef platform for universal MLOps (with Nicolas Essayan);
  • co-director of the Intel Exascale Lab and tech lead for performance analysis, optimization, and co-design of high-performance and cost-effective computer systems;
  • senior tenured scientist at INRIA, developing the foundations to co-design more efficient and cost-effective computer systems using auto-tuning, machine learning, and run-time adaptation;
  • research associate at the University of Edinburgh;
  • holder of a PhD in computer science from the University of Edinburgh with the Overseas Research Student Award (self-optimizing compilers, run-time systems, and software/hardware co-design);
  • recipient of the European Technology Transfer Award, the ACM CGO Test of Time Award, and the INRIA Award for Scientific Excellence for my original research on using AI, ML, federated learning, and collective tuning (cTuning) to automate the development of high-performance and cost-effective computer systems and reduce R&D costs and time to market by an order of magnitude.
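
To illustrate what the portable CM automation recipes mentioned above look like in practice, here is a minimal sketch of calling a CM recipe from Python. This is only a sketch under stated assumptions: it assumes the open-source cmind package (installed via pip install cmind) and its cmind.access API; the repository name and the tags shown here are illustrative examples, not a prescribed MLPerf workflow.

    # Minimal sketch of invoking CM automation recipes from Python.
    # Assumptions: the `cmind` package is installed (pip install cmind);
    # the repository name and tags below are illustrative.
    import cmind

    # Pull a repository of reusable automation recipes (illustrative repository name).
    r = cmind.access({'action': 'pull',
                      'automation': 'repo',
                      'artifact': 'mlcommons@cm4mlops'})
    if r['return'] > 0:
        raise RuntimeError(r.get('error', 'failed to pull the CM repository'))

    # Run a portable script recipe selected by tags (illustrative tags);
    # CM resolves dependencies and adapts the recipe to the local environment.
    r = cmind.access({'action': 'run',
                      'automation': 'script',
                      'tags': 'detect,os',
                      'out': 'con'})
    if r['return'] > 0:
        raise RuntimeError(r.get('error', 'the CM script failed'))

The same recipes can also be driven from the command line through the cm front-end, which is how MLPerf benchmark runs are typically automated.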

Timeline: