<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>gem5</title>
    <description>The official repository for the gem5 website.</description>
    <link>https://www.gem5.org//</link>
    <atom:link href="https://www.gem5.org//feed.xml" rel="self" type="application/rss+xml" />
    <pubDate>Fri, 05 Jun 2026 09:06:10 +0000</pubDate>
    <lastBuildDate>Fri, 05 Jun 2026 09:06:10 +0000</lastBuildDate>
    <generator>Jekyll v3.10.0</generator>
    
      <item>
        <title>ISCA 2025: Toward Full-System Heterogeneous Simulation: Merging gem5-SALAM with Mainline gem5</title>
        <description>&lt;h1 id=&quot;towards-full-system-heterogeneous-simulation-in-gem5&quot;&gt;Towards Full-System Heterogeneous Simulation in gem5&lt;/h1&gt;

&lt;p&gt;As SoC architectures grow increasingly heterogeneous, they now integrate not only CPUs and GPUs but also tightly coupled programmable accelerators tailored for specific workloads. These accelerators are critical for emerging domains such as mobile inference, AR/VR, real-time vision, and edge analytics. Unlike traditional CPU-GPU systems, modern heterogeneous platforms demand fine-grained coordination among diverse compute engines, shared memory subsystems, and software-managed execution models. Capturing these interactions requires a cycle-level, full-system simulator.&lt;/p&gt;

&lt;p&gt;While gem5 has long supported detailed CPU simulation and, more recently, full-system GPU modeling, support for programmable accelerators remained external via tools like gem5-SALAM—built on gem5 v21.1. Although SALAM added accelerator-specific capabilities such as cycle-level datapath modeling, memory-mapped scratchpads, and hardware synthesis integration, it was isolated from the mainline. As a result, it could not leverage recent ISA, memory system, or configuration infrastructure updates, nor benefit from upstream validation.&lt;/p&gt;

&lt;p&gt;To close this gap, we integrated SALAM’s accelerator infrastructure into gem5 mainline (develop branch v25). This unification elevates accelerators to first-class components alongside CPUs and GPUs, enabling full-system heterogeneous simulation under a single software stack. The result is a unified framework for modeling heterogeneous SoCs with realistic OS support, shared resource contention, and software-controlled task orchestration.&lt;/p&gt;

&lt;h2 id=&quot;integration-at-a-glance&quot;&gt;Integration at a Glance&lt;/h2&gt;

&lt;p&gt;We integrated SALAM’s accelerator modeling infrastructure into gem5-develop through a series of architectural, interface, and validation updates.&lt;/p&gt;

&lt;p&gt;We began by integrating key accelerator modeling components from SALAM into gem5. These include the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LLVMInterface&lt;/code&gt;, which executes LLVM IR kernels using a cycle-accurate datapath; the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CommInterface&lt;/code&gt;, which provides software-visible control and interrupt signaling; and a suite of configurable memory components such as scratchpads, DMA engines, and stream buffers. Together, these elements enable detailed and flexible modeling of a wide range of accelerator microarchitectures and memory hierarchies. To support realistic SoC integration, accelerators and local memories can be grouped into an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AccCluster&lt;/code&gt;, reflecting the modular structure of accelerator subsystems commonly found in commercial SoCs. For rapid prototyping, we also integrated and automated SALAM’s hardware profile generator, which converts user-defined timing specifications into executable datapath models – eliminating the need for manual microarchitectural implementation. Finally, we refactored CACTI-SALAM for compatibility with gem5’s infrastructure, enabling timing and energy estimation for scratchpad memories using CACTI’s file-based configuration methodology. These changes bring cycle-level accelerator modeling, full-system memory interaction, and scalable design space exploration into gem5 mainline.&lt;/p&gt;

&lt;p&gt;We then updated SALAM’s accelerator infrastructure to match gem5’s latest design conventions. This included refactoring classes to use modern SimObject patterns, replacing unsafe pointer casts in LLVM instruction handling with type-safe 32-bit variables, and switching to gem5’s standardized random number generator for latency modeling. We fixed off-by-one errors in address range definitions to follow gem5’s inclusive-exclusive semantics, aligned environment and ISA configuration with gem5’s current setup, and added dynamic LLVM detection using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;llvm-config&lt;/code&gt; to simplify SCons-based compilation for datapath simulation.&lt;/p&gt;
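
&lt;p&gt;To make the inclusive-exclusive convention concrete, here is a minimal standalone sketch. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HalfOpenRange&lt;/code&gt; class, the base address, and the size are hypothetical illustrations and not gem5 or SALAM code. The point is only that a range of N bytes starting at a base address ends at base + N (exclusive), so its last valid address is base + N - 1.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HalfOpenRange:
    """Hypothetical address range [start, end): start inclusive, end exclusive."""

    start: int
    end: int

    def size(self):
        # With an exclusive end, the size is a plain difference (no +1).
        return self.end - self.start

    def contains(self, addr):
        # range() is itself half-open, so it mirrors the convention directly.
        return addr in range(self.start, self.end)


# Hypothetical 4 KiB scratchpad; the base address is made up for illustration.
SPM_BASE = 0x2F100000
SPM_SIZE = 0x1000
spm = HalfOpenRange(SPM_BASE, SPM_BASE + SPM_SIZE)

# A second range can start exactly where the first one ends without overlap.
next_spm = HalfOpenRange(spm.end, spm.end + SPM_SIZE)

print(hex(spm.size()))              # size in bytes
print(spm.contains(spm.end - 1))    # last valid byte of the scratchpad
print(spm.contains(spm.end))        # first byte past the scratchpad
```

&lt;p&gt;An off-by-one error of the kind described above would typically come from treating the end address as inclusive, which makes two adjacent ranges appear to overlap by one byte.&lt;/p&gt;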

&lt;p&gt;Finally, we validated the integrated framework by ensuring it passed gem5’s pre-commit checks and full regression test suite. Additionally, we adapted SALAM’s original system validation tests to run within the unified environment and cross-validated the outputs against the original SALAM baseline to confirm functional equivalence. We plan to upstream these accelerator tests to the gem5-resources repository to support broader validation of the integrated SALAM components within gem5.&lt;/p&gt;

&lt;h2 id=&quot;what-this-enables&quot;&gt;What This Enables&lt;/h2&gt;

&lt;h3 id=&quot;broader-heterogeneity-studies&quot;&gt;Broader Heterogeneity Studies&lt;/h3&gt;

&lt;p&gt;With accelerators now fully integrated into gem5 mainline, researchers can simulate complete heterogeneous systems comprising CPUs, GPUs, and custom accelerators—co-existing under a single OS kernel and sharing interconnects and memory. This allows detailed studies of performance interference, resource arbitration, and synchronization mechanisms across diverse compute engines, grounded in full-system behavior rather than simplified models.&lt;/p&gt;

&lt;h3 id=&quot;system-level-exploration&quot;&gt;System-Level Exploration&lt;/h3&gt;

&lt;p&gt;The framework supports rich exploration of architectural tradeoffs at the system level. Users can evaluate different memory organizations—such as private scratchpads, shared LLCs, or DMA-managed SPMs—and compare strategies for offloading, synchronization, and kernel placement. Static vs. dynamic scheduling, locality-aware memory partitioning, and software-managed DMA schemes can all be studied in realistic OS-driven settings.&lt;/p&gt;

&lt;h3 id=&quot;domain-specific-workload-support&quot;&gt;Domain-Specific Workload Support&lt;/h3&gt;

&lt;p&gt;This infrastructure also enables architectural research targeting emerging domains like real-time vision, mobile inference, AR/VR, and edge computing. These applications demand predictable latency, software-accelerator coordination, and careful memory management. The integrated framework allows researchers to model and study these workloads using real software stacks and bootable Linux images, with accelerator behavior simulated at cycle-level fidelity.&lt;/p&gt;

&lt;h3 id=&quot;exploratory-studies-in-non-traditional-regimes&quot;&gt;Exploratory Studies in Non-Traditional Regimes&lt;/h3&gt;

&lt;p&gt;Finally, the toolchain enables exploration of accelerator operation under emerging regimes such as transient overclocking and advanced cooling. In our workshop paper, we use this framework to study one such case of a non-traditional operating regime: multi-GHz frequency scaling in accelerators, enabled by advanced cooling techniques such as immersion and cryogenic systems. We present a preliminary analysis of performance and power upper bounds across this range. The results show how system bottlenecks shift with increasing frequency, highlighting the importance of evaluating accelerator behavior in the context of host latency and memory interactions. Full details of the experimental setup and findings are included in our ISCA ’25 workshop paper.&lt;/p&gt;

&lt;p&gt;Users can apply this framework to the use cases discussed above using built-in accelerator models and benchmarks, or extend it further by modeling their own custom accelerators.&lt;/p&gt;

&lt;h2 id=&quot;modeling-your-own-accelerator&quot;&gt;Modeling Your Own Accelerator&lt;/h2&gt;

&lt;p&gt;Creating a new accelerator model in the integrated gem5 framework is straightforward. You begin by writing the desired accelerator algorithm in C/C++ and compiling it to LLVM IR. A YAML-based hardware profile specifies instruction timing, functional unit latencies, and memory ports. This profile is processed by the hardware-profile generator to produce a cycle-level timing model.&lt;/p&gt;

&lt;p&gt;The user then places the accelerator inside an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AccCluster&lt;/code&gt;, attaches scratchpads or DMAs as needed, and configures the system topology using gem5’s Python interface. A host-side program running in the simulated OS coordinates with the accelerator via memory-mapped control registers and interrupts. The complete system is simulated using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_system.sh&lt;/code&gt;, producing statistics, optional power reports, and host-side console output.&lt;/p&gt;

&lt;h2 id=&quot;getting-started&quot;&gt;Getting Started&lt;/h2&gt;

&lt;p&gt;To get started, set the following environment variables to your gem5 and benchmark root directories:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;export &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;M5_PATH&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;/path/to/gem5
&lt;span class=&quot;nb&quot;&gt;export &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;ACC_BENCH_PATH&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;/path/to/benchmarks
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Clone and build gem5:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone https://github.com/akanksha-sc/gem5
&lt;span class=&quot;nb&quot;&gt;cd &lt;/span&gt;gem5
scons build/ARM/gem5.opt &lt;span class=&quot;nt&quot;&gt;-j&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;nproc&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To generate a custom hardware profile (optional):&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nv&quot;&gt;$M5_PATH&lt;/span&gt;/tools/hw_generator/HWProfileGenerator.py &lt;span class=&quot;nt&quot;&gt;-b&lt;/span&gt; &amp;lt;benchmark_name&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To run CACTI-SALAM (optional energy/area estimation):&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;cd&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$M5_PATH&lt;/span&gt;/tools/cacti-SALAM
./run_cacti_salam.py &lt;span class=&quot;nt&quot;&gt;--bench-list&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$ACC_BENCH_PATH&lt;/span&gt;/benchmarks.list
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Run a benchmark (custom or built-in like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bfs&lt;/code&gt;):&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nv&quot;&gt;$M5_PATH&lt;/span&gt;/tools/run_system.sh &lt;span class=&quot;nt&quot;&gt;--bench&lt;/span&gt; &amp;lt;benchmark_name&amp;gt; &lt;span class=&quot;nt&quot;&gt;--bench-path&lt;/span&gt; &amp;lt;benchmark_path&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This boots Linux, launches a user-space driver, and simulates the accelerator. Outputs include &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stats.txt&lt;/code&gt; (performance counters), &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;system.terminal&lt;/code&gt; (host console output), and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SALAM_power.csv&lt;/code&gt; (power/area estimates, if CACTI-SALAM is used). Additional examples and documentation are included in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/hwacc/docs&lt;/code&gt;.&lt;/p&gt;
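
&lt;p&gt;Because &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stats.txt&lt;/code&gt; is plain text, it can be post-processed with a short script. The sketch below assumes the usual gem5 layout of one statistic per line (a name, a value, then a description after a # character). The sample text and its values are made up for illustration, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;parse_stats&lt;/code&gt; is a hypothetical helper, not part of gem5.&lt;/p&gt;

```python
def parse_stats(text):
    """Return {stat_name: float_value} from gem5-style stats text.

    Assumes one "name value ... # description" entry per line. If the text
    holds several stat dumps, later values overwrite earlier ones.
    """
    stats = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop the trailing description
        if not line or line.startswith("-"):
            continue  # skip blank lines and the Begin/End banner lines
        fields = line.split()
        if len(fields) == 1:
            continue  # a name without a value
        try:
            stats[fields[0]] = float(fields[1])
        except ValueError:
            continue  # non-numeric entry; ignored in this sketch
    return stats


# Made-up sample in the layout described above (values are not real results).
SAMPLE = """
---------- Begin Simulation Statistics ----------
simSeconds                                   0.000500                       # Number of seconds simulated (Second)
simTicks                                    500000000                       # Number of ticks simulated (Tick)
---------- End Simulation Statistics   ----------
"""

stats = parse_stats(SAMPLE)
print(stats["simSeconds"], stats["simTicks"])
```

&lt;p&gt;For a real run, the same function can be given the contents of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stats.txt&lt;/code&gt; file from the simulation output directory.&lt;/p&gt;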

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;This integration positions gem5 as a unified, full-system simulator for heterogeneous SoCs—combining CPUs, GPUs, and programmable accelerators under one framework with realistic timing, software, and architectural detail. It opens the door to studies ranging from co-scheduling and memory-system tuning to high-frequency accelerator and advanced-cooling analyses. Next steps include merging the support into gem5 mainline, expanding the benchmark suite with domain-specific workloads, and extending full-system accelerator support to additional ISAs. We hope this foundation accelerates heterogeneous-system research across the community.&lt;/p&gt;

&lt;h2 id=&quot;acknowledgments&quot;&gt;Acknowledgments&lt;/h2&gt;

&lt;p&gt;This work is supported in part by the Semiconductor Research Corporation and by the DOE’s Office of Science, Office of Advanced Scientific Computing Research through EXPRESS: 2023 Exploratory Research for Extreme Scale Science.&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;A. Chaudhari and M. D. Sinclair. “Toward Full-System Heterogeneous Simulation: Merging gem5-SALAM with Mainline gem5.” 6th gem5 Users’ Workshop, June 2025.&lt;/li&gt;
  &lt;li&gt;S. Rogers, J. Slycord, M. Baharani and H. Tabkhi, “gem5-SALAM: A System Architecture for LLVM-based Accelerator Modeling,” 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Athens, Greece, 2020, pp. 471-482, doi: 10.1109/MICRO50266.2020.00047.&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Wed, 30 Jul 2025 00:00:00 +0000</pubDate>
        <link>https://www.gem5.org//2025/07/30/gem5AccHetSimBlog.html</link>
        <guid isPermaLink="true">https://www.gem5.org//2025/07/30/gem5AccHetSimBlog.html</guid>
        
        
      </item>
    
      <item>
        <title>Running Bao Hypervisor on gem5</title>
        <description>&lt;p&gt;&lt;sup&gt;1&lt;/sup&gt; Ritsumeikan University &lt;br /&gt;&lt;/p&gt;

&lt;p&gt;This article presents a methodology for executing the Bao Hypervisor [1] on gem5 via the VExpress_GEM5_Foundation platform, enabling two unmodified Linux virtual machines to run simultaneously.&lt;/p&gt;

&lt;p&gt;The procedure was tested on “x86_64 Ubuntu 22.04”.&lt;/p&gt;

&lt;h2 id=&quot;obtain-and-build-software&quot;&gt;Obtain and Build Software&lt;/h2&gt;

&lt;p&gt;Install the tools required to run Bao Hypervisor on gem5.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt update
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt &lt;span class=&quot;nb&quot;&gt;install &lt;/span&gt;git tree scons gcc g++ python3-dev &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
make automake device-tree-compiler &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
bison flex libssl-dev
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Create a directory to store build outputs:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;mkdir&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-p&lt;/span&gt; resources/binaries resources/semihosting
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;download-and-setup-the-toolchain&quot;&gt;Download and setup the toolchain&lt;/h3&gt;

&lt;p&gt;Download the latest bare-metal cross-compile toolchain “&lt;strong&gt;aarch64-none-elf&lt;/strong&gt;” from the &lt;a href=&quot;https://developer.arm.com/downloads/-/arm-gnu-toolchain-downloads&quot;&gt;Arm Developer&lt;/a&gt; website.
This article was tested with “arm-gnu-toolchain-13.3.rel1-x86_64-aarch64-none-elf”.&lt;/p&gt;

&lt;p&gt;After downloading the aarch64 cross compiler, extract the archive.&lt;br /&gt;
Then set the &lt;strong&gt;CROSS_COMPILE&lt;/strong&gt; environment variable to the cross-compiler prefix.&lt;/p&gt;

&lt;p&gt;The following example demonstrates the installation of “arm-gnu-toolchain-13.3.rel1-x86_64-aarch64-none-elf.tar.xz”.&lt;br /&gt;
When setting the environment variable, do not forget the trailing “&lt;strong&gt;-&lt;/strong&gt;” at the end of the prefix.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# With &quot;arm-gnu-toolchain-13.3.rel1-x86_64-aarch64-none-elf.tar.xz&quot; downloaded&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# to the current directory, run the following commands.&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;tar &lt;/span&gt;xvf arm-gnu-toolchain-13.3.rel1-x86_64-aarch64-none-elf.tar.xz
&lt;span class=&quot;nb&quot;&gt;export &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;CROSS_COMPILE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;pwd&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;/arm-gnu-toolchain-13.3.rel1-x86_64-aarch64-none-elf/bin/aarch64-none-elf-
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
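
&lt;p&gt;The trailing “-” matters because Make-based builds conventionally form each tool name by appending &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gcc&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ld&lt;/code&gt;, and so on directly to the prefix. The sketch below shows this concatenation and checks whether the resulting compiler exists. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tool_path&lt;/code&gt; and the fallback prefix are hypothetical names used only for illustration.&lt;/p&gt;

```python
import os
import shutil


def tool_path(prefix, tool):
    """Join a CROSS_COMPILE-style prefix and a tool name by plain concatenation."""
    return prefix + tool


# Hypothetical install location, used only when CROSS_COMPILE is not set.
DEFAULT_PREFIX = "/opt/toolchain/bin/aarch64-none-elf-"
prefix = os.environ.get("CROSS_COMPILE", DEFAULT_PREFIX)

gcc = tool_path(prefix, "gcc")
# shutil.which returns None when the file is missing or not executable.
status = "found" if shutil.which(gcc) else "not found"
print(gcc, status)

# Without the trailing "-", the prefix and the tool name run together.
print(tool_path(prefix.rstrip("-"), "gcc"))
```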

&lt;h3 id=&quot;set-up-base-environment-and-build-automation&quot;&gt;Set up base environment and build automation&lt;/h3&gt;

&lt;p&gt;Clone the git repository that automates building the Bao Hypervisor and Linux, then build it with the make command.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone https://github.com/h1demasa/bao-demos.git
&lt;span class=&quot;nb&quot;&gt;cd &lt;/span&gt;bao-demos
make &lt;span class=&quot;nt&quot;&gt;-j&lt;/span&gt; &lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;nproc&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Once the build is completed, copy the generated binaries to the directories that gem5 will use.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
&lt;span class=&quot;nb&quot;&gt;cp &lt;/span&gt;wrkdir/imgs/fvp-a/fip.bin ../resources/binaries/fip.bin
&lt;span class=&quot;nb&quot;&gt;cp &lt;/span&gt;wrkdir/imgs/fvp-a/bl1.bin ../resources/binaries/bl1.bin
&lt;span class=&quot;nb&quot;&gt;cp &lt;/span&gt;wrkdir/imgs/fvp-a/linux+linux/bao.bin ../resources/semihosting/bao.bin
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Before proceeding, return to the parent directory and check that your resources directory has the following structure:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;cd&lt;/span&gt; ../
tree resources/
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;resources
    binaries
        bl1.bin
        fip.bin
    semihosting
        bao.bin
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
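
&lt;p&gt;The same check can be scripted. This is a minimal sketch that assumes it is run from the directory containing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;resources/&lt;/code&gt;. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;missing_files&lt;/code&gt; is a hypothetical helper, and the file list is taken from the tree output above.&lt;/p&gt;

```python
from pathlib import Path

# Files the copy step above is expected to have produced.
EXPECTED = (
    "binaries/bl1.bin",
    "binaries/fip.bin",
    "semihosting/bao.bin",
)


def missing_files(root):
    """Return the expected files that are absent under the given root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]


missing = missing_files("resources")
print("missing:", missing if missing else "none")
```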

&lt;h3 id=&quot;gem5-simulator&quot;&gt;gem5 (simulator)&lt;/h3&gt;

&lt;p&gt;Running the Bao Hypervisor requires building gem5 from source.&lt;/p&gt;

&lt;p&gt;As the Performance Monitors Control Register is not implemented in gem5, numerous warnings are logged while Bao is running.
These logs slow down gem5’s execution.
To suppress them, clone gem5 and apply the following patch.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone https://github.com/gem5/gem5.git &lt;span class=&quot;nt&quot;&gt;-b&lt;/span&gt; v24.0.0.1
&lt;span class=&quot;nb&quot;&gt;cd &lt;/span&gt;gem5
git apply ../bao-demos/platforms/gem5/gem5.patch
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Build gem5 with the above changes:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;scons &lt;span class=&quot;nt&quot;&gt;-j&lt;/span&gt; &lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;nproc&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt; build/ARM/gem5.opt
&lt;span class=&quot;nb&quot;&gt;cd&lt;/span&gt; -
make &lt;span class=&quot;nt&quot;&gt;-C&lt;/span&gt; gem5/util/term
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;run-and-connect-to-the-simulation&quot;&gt;Run and Connect to the Simulation&lt;/h2&gt;

&lt;p&gt;Now you can run the simulation:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;export &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;M5_PATH&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;resources/ &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
	gem5/build/ARM/gem5.opt gem5/configs/example/arm/baremetal.py &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
	   &lt;span class=&quot;nt&quot;&gt;--workload&lt;/span&gt; ArmTrustedFirmware &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
	   &lt;span class=&quot;nt&quot;&gt;--num-cores&lt;/span&gt; 4 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
	   &lt;span class=&quot;nt&quot;&gt;--mem-size&lt;/span&gt; 4GB &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
	   &lt;span class=&quot;nt&quot;&gt;--machine-type&lt;/span&gt; VExpress_GEM5_Foundation &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
	   &lt;span class=&quot;nt&quot;&gt;--semi-enable&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--semi-path&lt;/span&gt; resources/semihosting
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And connect to it:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;gem5/util/term/m5term 3456 &lt;span class=&quot;c&quot;&gt;# Trusted Firmware-A, U-boot and Bao Hypervisor&lt;/span&gt;
gem5/util/term/m5term 3457 &lt;span class=&quot;c&quot;&gt;# VM 1 (Linux)&lt;/span&gt;
gem5/util/term/m5term 3458 &lt;span class=&quot;c&quot;&gt;# VM 2 (Linux)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The bootloader stages and the Bao Hypervisor boot messages can be observed via port 3456.
Similarly, the Linux kernel boot sequence of each VM can be observed via ports 3457 and 3458.
Once the boot process is completed, a login shell is started on each of these two ports.
You can log in through either port.
Both the username and the password are “root”.&lt;/p&gt;

&lt;p&gt;In my environment, it took an hour for the Linux login shell to start up.&lt;br /&gt;
Please wait while enjoying a cup of coffee.&lt;/p&gt;

&lt;h2 id=&quot;demo&quot;&gt;Demo&lt;/h2&gt;
&lt;p&gt;This is a demo video of running Bao Hypervisor on gem5.&lt;/p&gt;
&lt;iframe width=&quot;560&quot; height=&quot;560&quot; src=&quot;https://www.youtube.com/embed/pbHlcX5WVOE&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;hr /&gt;

&lt;p&gt;[1] José Martins, Adriano Tavares, Marco Solieri, Marko Bertogna, and Sandro Pinto. “&lt;strong&gt;Bao: A Lightweight Static Partitioning Hypervisor for Modern Multi-Core Embedded Systems&lt;/strong&gt;”. In Workshop on Next Generation Real-Time Embedded Systems (NG-RES 2020). Schloss Dagstuhl-Leibniz-Zentrum für Informatik. 2020. &lt;a href=&quot;https://drops.dagstuhl.de/opus/volltexte/2020/11779/&quot;&gt;https://drops.dagstuhl.de/opus/volltexte/2020/11779/&lt;/a&gt;&lt;/p&gt;
</description>
        <pubDate>Tue, 12 Nov 2024 00:00:00 +0000</pubDate>
        <link>https://www.gem5.org//2024/11/12/bao-on-gem5.html</link>
        <guid isPermaLink="true">https://www.gem5.org//2024/11/12/bao-on-gem5.html</guid>
        
        
      </item>
    
      <item>
        <title>gem5 Version 23.1 Release: A Leap Forward in Computer Architecture Simulation</title>
        <description>&lt;p&gt;The gem5 computer architecture simulation tool has released its latest milestone - Version 23.1.
This release marks a significant step in gem5’s development as it transitions to GitHub and introduces several groundbreaking features and improvements.
In this blog post, we’ll dive into the key changes and enhancements that come with gem5 v23.1.&lt;/p&gt;

&lt;h2 id=&quot;a-shift-to-github-development&quot;&gt;A Shift to GitHub Development&lt;/h2&gt;

&lt;p&gt;One of the most notable changes in gem5 v23.1 is the transition to GitHub for development.
This shift to a popular platform for collaborative software development fosters greater transparency and community involvement.
During this release, an impressive 362 pull requests were merged, comprising 416 commits from 51 unique contributors.
This level of community engagement reflects gem5’s commitment to open-source collaboration.&lt;/p&gt;

&lt;h2 id=&quot;streamlined-configuration-with-kconfig&quot;&gt;Streamlined Configuration with Kconfig&lt;/h2&gt;

&lt;p&gt;The gem5 simulator’s build configuration has undergone a transformation with the introduction of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kconfig&lt;/code&gt;.
While most gem5 builds remain backward-compatible, specialized builds now require the use of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scons &amp;lt;kconfig command&amp;gt;&lt;/code&gt;.
This change allows users to tailor their gem5 installations to specific needs and is detailed in the &lt;a href=&quot;https://www.gem5.org/documentation/general_docs/kconfig_build_system/&quot;&gt;kconfig documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;standard-library-improvements&quot;&gt;Standard Library Improvements&lt;/h2&gt;

&lt;p&gt;The gem5 simulator’s v23.1 release introduces several standard library improvements, including the addition of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WorkloadResource&lt;/code&gt; as a specialized resource type.
The older &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Workload&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CustomWorkload&lt;/code&gt; classes have been deprecated in favor of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;obtain_resource&lt;/code&gt; function and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WorkloadResource&lt;/code&gt; class in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;resource.py&lt;/code&gt;. Users can update their code by following the provided migration instructions.&lt;/p&gt;

&lt;p&gt;Moreover, gem5 introduces “Suites”, a new category of resource.
These suites enhance the versatility and functionality of gem5 simulations, opening up new possibilities for users.&lt;/p&gt;

&lt;h2 id=&quot;api-changes-and-enhancements&quot;&gt;API Changes and Enhancements&lt;/h2&gt;

&lt;p&gt;Version 23.1 brings several changes to the API, making it more informative and flexible.
Resource objects now have their own &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;id&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;category&lt;/code&gt;, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__str__()&lt;/code&gt; function provides detailed information.
Users can utilize environment variables like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GEM5_RESOURCE_JSON&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GEM5_RESOURCE_JSON_APPEND&lt;/code&gt; to customize data sources using JSON files, offering more control over resource management.&lt;/p&gt;

&lt;h2 id=&quot;support-for-latest-tools-and-platforms&quot;&gt;Support for Latest Tools and Platforms&lt;/h2&gt;

&lt;p&gt;Version 23.1 of gem5 ensures compatibility with the latest tools and platforms, with added support for clang 15 and clang 16.
However, it’s important to note that gem5 no longer supports building on Ubuntu 18.04, and GCC 7, GCC 9, and clang 6 are no longer supported.&lt;/p&gt;

&lt;h2 id=&quot;full-system-gpu-model-enhancements&quot;&gt;Full-System GPU Model Enhancements&lt;/h2&gt;

&lt;p&gt;This release significantly enhances the full-system GPU model.
Users can now leverage gem5 to simulate the latest ROCm 5.7.1, and various updates enable PyTorch and TensorFlow simulations.
The inclusion of a new packer disk image script containing ROCm 5.4.2, PyTorch 2.0.1, and TensorFlow 2.11 opens up exciting possibilities for GPU simulation.&lt;/p&gt;

&lt;h2 id=&quot;risc-v-rvv-10-implementation&quot;&gt;RISC-V RVV 1.0 Implementation&lt;/h2&gt;

&lt;p&gt;A major achievement in gem5 v23.1 is the implementation of RISC-V RVV 1.0.
This feature, the result of collaborative efforts from numerous contributors, brings most of the instructions in the 1.0 specification to life.
RVV 1.0 is compatible with various CPU models, supports both FS and SE modes, and allows users to specify vector unit widths.
While there is room for future improvements, this implementation is a significant step forward for RISC-V support in gem5.&lt;/p&gt;

&lt;h2 id=&quot;armisa-changes-and-improvements&quot;&gt;ArmISA Changes and Improvements&lt;/h2&gt;

&lt;p&gt;Version 23.1 of gem5 introduces architectural support for several Arm extensions, enhancing SVE instruction support, addressing FEAT_SEL2 related issues, and implementing an Arm Capstone Disassembler.
These changes and improvements further solidify gem5’s compatibility with Arm-based systems.&lt;/p&gt;

&lt;h2 id=&quot;other-notable-changes&quot;&gt;Other Notable Changes&lt;/h2&gt;

&lt;p&gt;Several other notable changes and improvements have been made, including:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Coherence protocol enhancements&lt;/li&gt;
  &lt;li&gt;Far atomics in CHI&lt;/li&gt;
  &lt;li&gt;Improved support for RISC-V privilege modes&lt;/li&gt;
  &lt;li&gt;Bug fixes and optimizations&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;known-bugs-and-issues&quot;&gt;Known Bugs and Issues&lt;/h2&gt;

&lt;p&gt;While gem5 v23.1 brings significant improvements, there are some known bugs and issues that the development team is actively addressing.
The gem5 community encourages users to report any issues they encounter to help improve the tool’s reliability.&lt;/p&gt;

&lt;p&gt;In conclusion, gem5 Version 23.1 is a remarkable release that reflects the project’s continued commitment to excellence, collaboration, and innovation in the field of computer architecture simulation.
As gem5 transitions to GitHub, it opens new doors for community involvement and further advancements.
Researchers, developers, and enthusiasts alike can look forward to an enhanced simulation experience with gem5 v23.1.&lt;/p&gt;

&lt;p&gt;To get started with gem5 v23.1 and explore its new features, visit the &lt;a href=&quot;https://www.gem5.org/&quot;&gt;official gem5 website&lt;/a&gt; and consult the documentation.
Join the gem5 community, provide feedback, and contribute to the evolution of this powerful simulation tool.
Your engagement is essential in shaping the future of computer architecture research and development.&lt;/p&gt;
</description>
        <pubDate>Thu, 21 Dec 2023 00:00:00 +0000</pubDate>
        <link>https://www.gem5.org//project/2023/12/21/gem5-23-1.html</link>
        <guid isPermaLink="true">https://www.gem5.org//project/2023/12/21/gem5-23-1.html</guid>
        
        
        <category>project</category>
        
      </item>
    
      <item>
        <title>Benchmarking Linkers within gem5</title>
        <description>&lt;p&gt;&lt;strong&gt;tl;dr&lt;/strong&gt;: Use the &lt;a href=&quot;https://github.com/rui314/mold&quot;&gt;mold linker&lt;/a&gt; for the fastest linking times when building gem5&lt;/p&gt;

&lt;p&gt;People familiar with gem5 are aware of its lengthy compilation time, especially during the linking stage.
This becomes frustrating when even a minor edit necessitates re-linking previously compiled files, which can add several minutes to the process.
Keeping this in mind, we evaluated a range of currently supported linkers with gem5 to determine which one is the most efficient.&lt;/p&gt;

&lt;p&gt;To conduct these tests, we closely examined each of the linkers that gem5 currently supports, including the current default linker, “ld”.
The four additional supported linkers evaluated are “lld”, “bfd”, “gold”, and “mold”.&lt;/p&gt;

&lt;p&gt;Our method for comparing these linkers was as follows: we first built gem5 normally by executing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scons build/ALL/gem5.opt&lt;/code&gt;.
Once the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gem5.opt&lt;/code&gt; binary was compiled, we deleted it.
Thus we were left with the compiled object files but no linked binary.
Then, we compared runs of rebuilding/linking gem5 with each of the linkers using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/usr/bin/time scons build/ALL/gem5.opt -j12 --linker=[linker-option]&lt;/code&gt;, where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/usr/bin/time&lt;/code&gt; measured the duration.
We ran these tests on a system using an AMD EPYC 7402P 24-Core processor with a 3.35 GHz frequency.&lt;/p&gt;

&lt;p&gt;During these tests, we observed that using a linker other than the default “ld” in our experimental setup forced a recompilation of all of the m5 files.
However, if we deleted the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gem5.opt&lt;/code&gt; binary after this and ran the build again, only &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gem5.opt&lt;/code&gt; was rebuilt/linked, resulting in two distinct times.
To compare the times, we needed to take into account the time it took for the first run to build all the m5 files, as well as the time it took for the second run to re-link gem5.opt.
These are labeled below as “all m5” versus “last few”.
In addition to these two runs, we also compared each run on a networked file system (NFS) and a local SSD to see if the storage type of the files had any impact on the run times.
Finally, we performed one last run on the local SSD using the 48 available cores on our system to assess if it made any difference.
Below, we present the elapsed time for each of these runs.&lt;/p&gt;
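
&lt;p&gt;The procedure described above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a definitive script: the helper names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;relink_command&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;time_relink&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_all&lt;/code&gt; are hypothetical, and it assumes it is run from the root of a gem5 checkout that has already been built once, with scons and each linker installed.&lt;/p&gt;

```python
import os
import subprocess

# Target and linker names are taken from the text; everything else is illustrative.
TARGET = 'build/ALL/gem5.opt'
LINKERS = ['bfd', 'lld', 'gold', 'mold']


def relink_command(linker, jobs=12):
    # Build the timed scons invocation described in the text.
    return ['/usr/bin/time', 'scons', TARGET, f'-j{jobs}', f'--linker={linker}']


def time_relink(linker, jobs=12):
    # Delete the linked binary so that scons has to link it again.
    if os.path.exists(TARGET):
        os.remove(TARGET)
    # /usr/bin/time writes its report to stderr, so stderr is returned.
    result = subprocess.run(
        relink_command(linker, jobs),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stderr


def run_all():
    # Hypothetical driver: two timed runs per linker, as in the text.
    for linker in LINKERS:
        first = time_relink(linker)   # may also rebuild the m5 files ('all m5')
        second = time_relink(linker)  # should only re-link ('last few')
        print(linker, first, second, sep='\n')
```

&lt;p&gt;Calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_all()&lt;/code&gt; would perform the two timed runs for each linker; the captured stderr may also contain scons messages alongside the timing report.&lt;/p&gt;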

&lt;p&gt;We found that among the four additional linkers we tested, “bfd” was the slowest, and “mold” was the fastest.
Additionally, the difference between using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-j12&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-j48&lt;/code&gt; appeared to be insignificant.&lt;/p&gt;

&lt;p&gt;Based on our results, we suggest using “mold” as the linker when working with gem5.
It’s worth noting that using a particular linker had a more significant impact on the time taken than the storage location.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: center&quot;&gt; &lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;NFS + all m5&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;NFS + last few&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Local SSD + all m5&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Local SSD + last few&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Local SSD + -j48 + last few&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;ld&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;—&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3:29.19&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;—&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3:08.31&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3:00.15&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;bfd&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4:15.82&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3:32.13&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3:39.70&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3:02.15&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3:02.35&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;lld&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;2:16.22&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1:54.25&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1:52.94&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1:13.12&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1:13.16&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;gold&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;2:30.98&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1:43.59&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1:59.41&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1:19.86&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1:19.48&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;mold&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1:48.62&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1:07.08&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1:08.18&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;0:28.23&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;0:27.89&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
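
&lt;p&gt;To make the table easier to compare, the elapsed times can be converted to seconds and expressed as a speedup over the default “ld”. The snippet below is a small illustrative sketch that does this for the “Local SSD + last few” column; the function and variable names are hypothetical and only the times themselves come from the table.&lt;/p&gt;

```python
def to_seconds(elapsed):
    # Convert an m:ss.ss string, as shown in the table, to seconds.
    minutes, seconds = elapsed.split(':')
    return int(minutes) * 60 + float(seconds)


# Times copied from the 'Local SSD + last few' column of the table.
local_ssd_relink = {
    'ld': '3:08.31',
    'bfd': '3:02.15',
    'lld': '1:13.12',
    'gold': '1:19.86',
    'mold': '0:28.23',
}

baseline = to_seconds(local_ssd_relink['ld'])

# Speedup of each linker relative to the default 'ld'.
speedups = {
    linker: baseline / to_seconds(elapsed)
    for linker, elapsed in local_ssd_relink.items()
}

for linker, speedup in sorted(speedups.items(), key=lambda item: item[1]):
    print(f'{linker:5s} {speedup:.2f}x relative to ld')
```

&lt;p&gt;For this column, “mold” re-links roughly 6.7 times faster than “ld”.&lt;/p&gt;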

&lt;p&gt;In addition to just comparing the build times, we executed 100000000 ticks for each linked compilation to ensure that using these linkers wouldn’t cause any issues when actually using gem5, such as increased execution time or functional problems.&lt;/p&gt;

&lt;p&gt;We achieved this by performing an x86 Linux boot with an O3 CPU and a Ruby cache. The command to do so is provided below.&lt;/p&gt;

&lt;p&gt;Command:&lt;/p&gt;

&lt;div class=&quot;language-sh highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;/usr/bin/time build/ALL/gem5.opt &lt;span class=&quot;nt&quot;&gt;-re&lt;/span&gt; tests/gem5/configs/x86_boot_exit_run.py &lt;span class=&quot;nt&quot;&gt;--cpu&lt;/span&gt; o3 &lt;span class=&quot;nt&quot;&gt;--num-cpus&lt;/span&gt; 2 &lt;span class=&quot;nt&quot;&gt;--mem-system&lt;/span&gt; mesi_two_level &lt;span class=&quot;nt&quot;&gt;--dram-class&lt;/span&gt; DualChannelDDR4_2400 &lt;span class=&quot;nt&quot;&gt;--boot-type&lt;/span&gt; init &lt;span class=&quot;nt&quot;&gt;--resource-directory&lt;/span&gt; tests/gem5/resources &lt;span class=&quot;nt&quot;&gt;--tick-exit&lt;/span&gt; 100000000
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We found that none of the linkers had a significant impact on the runtime of any test, and all tests completed successfully.
This indicates that switching linkers should not have any detrimental effect on experiments conducted within gem5.&lt;/p&gt;

&lt;p&gt;Based on our findings, we can confidently recommend using the &lt;a href=&quot;https://github.com/rui314/mold&quot;&gt;mold linker&lt;/a&gt; to speed up linking times when building gem5.
If you’re interested in using mold, you can follow the instructions &lt;a href=&quot;https://github.com/rui314/mold#how-to-build&quot;&gt;here&lt;/a&gt; to compile it.
Once it’s properly installed, you can use it by passing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--linker=mold&lt;/code&gt; while building gem5.&lt;/p&gt;

&lt;p&gt;Here’s an example command: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scons build/ALL/gem5.opt -j12 --linker=mold&lt;/code&gt;.&lt;/p&gt;
</description>
        <pubDate>Thu, 16 Feb 2023 00:00:00 +0000</pubDate>
        <link>https://www.gem5.org//project/2023/02/16/benchmarking-linkers.html</link>
        <guid isPermaLink="true">https://www.gem5.org//project/2023/02/16/benchmarking-linkers.html</guid>
        
        
        <category>project</category>
        
      </item>
    
      <item>
        <title>“Moving to full system simulation of GPU applications”</title>
        <description>&lt;p&gt;For over a decade gem5 has supported two modes of simulation: full system (FS) mode, where the simulator uses a disk image and a kernel to boot an instance of Linux and run applications on the disk image, and syscall emulation (SE) mode, where the simulator runs applications on the host machine, intercepting their system calls and emulating them.
Up until a few years ago, binaries run in SE mode had to be statically linked.
Dynamically linked binaries can now be run assuming the dynamic libraries are available on the host machine.
For increasingly more complicated and specialized applications, such as GPU applications, the dynamic libraries may not be available on the host system or may be a different version than what is needed for the simulated application.
In these cases, using FS mode is preferred.&lt;/p&gt;

&lt;p&gt;This issue of unavailable or different version libraries also occurs with the GPU model in gem5.
The GPU model currently runs in SE mode and requires an older version of &lt;a href=&quot;https://www.amd.com/en/graphics/servers-solutions-rocm&quot;&gt;AMD’s ROCm™ stack&lt;/a&gt;.
This is problematic for a few reasons: (1) the user may not have a GPU and therefore does not need the ROCm™ stack installed locally, (2) the user may not be on a system compatible with the ROCm™ installer to install the libraries, or (3) the user has ROCm™ installed but does not have the specific version required for gem5.
These issues are currently solved by building and running gem5 using a docker image.
This is not necessary when using FS mode, making GPUFS easier to run along with regular simulation.&lt;/p&gt;

&lt;p&gt;Over the past two years, work has been done to implement a simulated GPU device with all of the necessary components to communicate with the upstream GPU driver.
With this work in place, it is now possible to use FS mode to simulate GPU applications.
With the 22.1 release of gem5, we are announcing GPU FS mode (GPUFS) as the preferred method to simulate GPU applications; it will eventually replace SE mode entirely.
Based on our most recent testing, nearly all applications which worked in SE mode will work in FS mode.
The remainder of this blog post discusses the use cases of FS mode along with additional benefits and known issues.&lt;/p&gt;

&lt;h1 id=&quot;use-cases&quot;&gt;Use cases&lt;/h1&gt;
&lt;p&gt;The use cases for GPUFS are the same as SE mode GPU simulation.
That is, we simulate a single GPU application, collect stats, and exit the simulation.
Although GPUFS in theory provides the ability to do more advanced simulations such as simulating concurrent GPU applications or simulating multiple GPU devices, these are &lt;em&gt;not&lt;/em&gt; supported in the current model.&lt;/p&gt;

&lt;h1 id=&quot;benefits-of-full-system&quot;&gt;Benefits of Full System&lt;/h1&gt;
&lt;p&gt;The primary benefit of GPUFS is avoiding issues with dynamic libraries.
Currently SE mode GPU simulations need to be run within a docker image.
This itself has many user-facing complexities, such as environments (e.g., universities) that do not allow docker to be run and potentially different build directories for GPU and non-GPU simulation, as well as many developer complexities, including testing and keeping up with download locations of older out-of-date libraries.&lt;/p&gt;

&lt;p&gt;With a simulated GPU device users will be able to fast-forward through memory copies in GPU applications.
A basic GPU application has three main GPU-related library calls: (1) copy data to the GPU, (2) launch a kernel on the GPU, and (3) copy data to the host.
On a real system, data can be copied either by a GPU kernel which reads from host memory and writes to device memory or with the help of a DMA engine.
With GPUFS, system DMA engines are implemented to copy data to/from GPU memory.
These engines can be simulated functionally within gem5 to speed up simulation.
As a result, users can copy several GBs of data to GPU memory in minutes of simulation time by avoiding the detailed simulation of copy kernels.&lt;/p&gt;

&lt;p&gt;GPUFS also more easily allows users and developers to update the ROCm™ stack to the latest version with each gem5 release.
This lets users take advantage of features in the latest ROCm™ stack, which can mean less time spent backporting applications to older ROCm™ versions.
GPUFS has so far been tested on ROCm 4.2, 4.3, 5.0, and 5.4, but any version above 4.0 should work.
This testing was done on the core ROCm package only, so 1st party libraries (rocBLAS, rocFFT, rocSPARSE, etc.) have not been thoroughly tested.&lt;/p&gt;

&lt;p&gt;Full system mode uses the full ROCm stack, including the kernel driver, rather than using the emulated driver developed for SE mode.
This means users can modify the Linux kernel driver to research areas that are difficult to do in SE mode.
Examples include virtual memory research such as utilizing flexible page sizes and exploring page fault handling, implementing new packet types for the new SDMA and PM4 processors, or using virtualization features.&lt;/p&gt;

&lt;h1 id=&quot;using-full-system&quot;&gt;Using full system&lt;/h1&gt;
&lt;p&gt;Like FS mode in general, users need a disk image and a kernel to run GPUFS.
A packer script is provided in the gem5-resources repository under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/gpu-fs/disk-image&lt;/code&gt;.
Additionally, the kernel is available for download or can be transferred out of the disk image.
This disk image and kernel consist of an operating system and kernel version compatible with the official ROCm™ release notes.
A prebuilt &lt;a href=&quot;http://dist.gem5.org/dist/v22-1/images/x86/ubuntu-18-04/x86-gpu-fs-20220512.img.gz&quot;&gt;GPUFS disk image&lt;/a&gt; and &lt;a href=&quot;http://dist.gem5.org/dist/v22-1/kernels/x86/static/vmlinux-5.4.0-105-generic&quot;&gt;GPUFS kernel&lt;/a&gt; are available for download.&lt;/p&gt;

&lt;p&gt;Scripts are provided for the user which can take a GPU application as an argument and copy it into the disk image upon simulation start to run a GPU application without needing to modify or mount the disk image.
The traditional “rcS” script approach can also be used to run applications which already exist on the disk image, applications which may need further input files, or applications the user wishes to build from source files in the disk image.
Applications can be built using a docker image provided at gcr.io/gem5-test/gpu-fs:v22-1 or by building a local docker image using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;util/dockerfiles/gpu-fs/&lt;/code&gt; in the gem5 repository.
Using this docker image allows users to build GPU applications without needing to install ROCm™ on their host machine and without wasting simulation time building source files on the disk image.
If desired, users may also install the required ROCm™ version locally, even without an AMD GPU, and build applications on their host machine.&lt;/p&gt;

&lt;p&gt;More information on how to set up GPUFS is provided in the README.md file in the gem5-resources repository at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/gpu-fs/README.md&lt;/code&gt;.&lt;/p&gt;

&lt;h1 id=&quot;known-issues&quot;&gt;Known issues&lt;/h1&gt;
&lt;p&gt;There are some known issues that are actively being addressed but will not be resolved until a future release after gem5 22.1.
These issues are listed below.
If you are using GPUFS and run into an issue that is not listed here, we encourage you to report the issue to gem5-users, JIRA, or the gem5 slack channel.
A useful bug report will include both terminal output and gem5 output preferably with the following debug flags: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--debug-flags=AMDGPUDevice,SDMAEngine,PM4PacketProcessor,HSAPacketProcessor,GPUCommandProc&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Currently KVM and X86 are required to run full system.  Atomic and Timing CPUs are not yet compatible with the disconnected Ruby network required for GPUFS; supporting them is a work in progress.&lt;/li&gt;
  &lt;li&gt;Some memory accesses generate incorrect addresses causing hard page faults leading to simulation panics.  This is currently being investigated with high priority.&lt;/li&gt;
  &lt;li&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;printf&lt;/code&gt; function does not work within GPU kernels.  As a workaround, a gem5-specific print function is being developed.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;recap&quot;&gt;Recap&lt;/h1&gt;
&lt;p&gt;Full system GPU simulation (GPUFS) is now the preferred method to run GPU applications in gem5 22.1+.
GPUFS is intended to be used for the same use cases as SE mode GPU simulation.
It has the benefits of avoiding simulation within docker, improved simulation speed by functionally simulating memory copies, and an easier update path for gem5 developers.&lt;/p&gt;

&lt;p&gt;As users move to GPUFS, we expect there will be some bug reports.
Users are encouraged to submit reports to the gem5-users mailing list, JIRA, or gem5 slack channel.&lt;/p&gt;
</description>
        <pubDate>Mon, 13 Feb 2023 00:00:00 +0000</pubDate>
        <link>https://www.gem5.org//2023/02/13/moving-to-full-system-gpu.html</link>
        <guid isPermaLink="true">https://www.gem5.org//2023/02/13/moving-to-full-system-gpu.html</guid>
        
        
      </item>
    
      <item>
        <title>gem5-22.1 Released!</title>
        <description>&lt;p&gt;The latter half of 2022 saw 500 commits submitted to gem5, from 48 unique contributors.&lt;/p&gt;

&lt;p&gt;Our top 10 contributors for the v22.1 release were:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Bobby R. Bruce (143 commits)&lt;/li&gt;
  &lt;li&gt;Giacomo Travaglini (63 commits)&lt;/li&gt;
  &lt;li&gt;Gabe Black (53 commits)&lt;/li&gt;
  &lt;li&gt;Matthew Poremba (49 commits)&lt;/li&gt;
  &lt;li&gt;Jason Lowe-Power (19 commits)&lt;/li&gt;
  &lt;li&gt;Yu-hsin Wang (18 commits)&lt;/li&gt;
  &lt;li&gt;Zhantong Qiu (11 commits)&lt;/li&gt;
  &lt;li&gt;Earl Ou (10 commits)&lt;/li&gt;
  &lt;li&gt;Tiago Mück (9 commits)&lt;/li&gt;
  &lt;li&gt;Quentin Forcioli (9 commits)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We wish to thank all of the gem5 community who made this gem5 release possible.
We look forward to your continued support for the upcoming v23.0 release.&lt;/p&gt;

&lt;p&gt;The latest version of gem5 can be obtained by pulling the latest version of the gem5 git repo &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stable&lt;/code&gt; branch:&lt;/p&gt;

&lt;div class=&quot;language-shell highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone https://gem5.googlesource.com/public/gem5

&lt;span class=&quot;c&quot;&gt;# or, within the gem5 repo&lt;/span&gt;

git switch stable
git pull
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;major-new-features&quot;&gt;Major New Features&lt;/h2&gt;

&lt;p&gt;Several new features were included in the v22.1 release.
Here we outline some key highlights and how to use them.
For a more comprehensive list of contributions, please consult the &lt;a href=&quot;https://gem5.googlesource.com/public/gem5/+/refs/tags/v22.1.0.0/RELEASE-NOTES.md&quot;&gt;Release Notes&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;multiple-isa-compilations&quot;&gt;Multiple ISA Compilations&lt;/h3&gt;

&lt;p&gt;Prior to v22.1 users of gem5 could only compile a single ISA target into a gem5 binary.
E.g., &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scons build/X86/gem5.opt&lt;/code&gt; would create a gem5 binary with the X86 ISA but no other.
In gem5 v22.1 users can compile any combination of ISAs into a gem5 binary.
The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build_opts&lt;/code&gt; directory includes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ALL&lt;/code&gt; which can be used to compile a binary containing all ISA targets with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scons build/ALL/gem5.opt&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;With multi-ISA binaries, the ISA used in simulation is specified via which core is used.
For example, using an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;X86TimingSimpleCPU&lt;/code&gt; will use the X86 ISA.
When using the standard library API, you can carry out simulations as before by specifying the ISA when setting up the system processor.&lt;/p&gt;

&lt;p&gt;Below is a simple configuration script that runs an ARM binary in SE mode:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.isas&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ISA&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.utils.requires&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;requires&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.resources.resource&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Resource&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.components.memory&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SingleChannelDDR3_1600&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.components.processors.cpu_types&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CPUTypes&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.components.boards.simple_board&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SimpleBoard&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.components.cachehierarchies.classic.no_cache&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NoCache&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.components.processors.simple_processor&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SimpleProcessor&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.simulate.simulator&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Simulator&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# This ensures that the ARM ISA is compiled into the binary.
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;requires&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;isa_required&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ISA&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ARM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;cache_hierarchy&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NoCache&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;memory&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SingleChannelDDR3_1600&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;32MB&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# Here we must specify the ISA we are using via the `isa` parameter.
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;processor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SimpleProcessor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cpu_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CPUTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TIMING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;isa&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ISA&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ARM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num_cores&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SimpleBoard&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;clk_freq&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;3GHz&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;processor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;processor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;memory&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memory&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;cache_hierarchy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cache_hierarchy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_se_binary_workload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Resource&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;arm-hello64-static&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;simulator&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Simulator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;simulator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This can be run with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build/ALL/gem5.opt&lt;/code&gt;, and the ARM ISA will automatically be used as it was specified via the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;isa&lt;/code&gt; parameter when constructing the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SimpleProcessor&lt;/code&gt;.&lt;/p&gt;

&lt;h3 id=&quot;api-to-set-tick-based-exit-events&quot;&gt;API to set tick-based Exit Events&lt;/h3&gt;

&lt;p&gt;It is common when running gem5 to want the simulation to exit at a particular simulation tick.
The typical manner in which this is achieved is by changing the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MAX_TICK&lt;/code&gt; value of the simulation, but this is limiting as it only allows for a single tick to be specified as an exit event.
Until v22.1, the API for specifying exits at other ticks was not well exposed and was difficult to use.
As such the following functions have been added to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;m5&lt;/code&gt; Python module:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;setMaxTick(tick)&lt;/code&gt; : Used to specify the maximum simulation tick.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;getMaxTick()&lt;/code&gt; : Used to obtain the maximum simulation tick value.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;getTicksUntilMax()&lt;/code&gt;: Used to get the number of ticks remaining until the maximum tick is reached.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scheduleTickExitFromCurrent(tick)&lt;/code&gt; : Used to schedule an exit event a specified number of ticks in the future.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scheduleTickExitAbsolute(tick)&lt;/code&gt; : Used to schedule an exit event at a specified tick.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;setMaxTick&lt;/code&gt; function provides a cleaner interface for setting the maximum tick the simulation is to run to.
When that tick is reached, a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MAX_TICK&lt;/code&gt; exit event is returned by the simulator.
The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scheduleTickExit&lt;/code&gt; functions allow for the scheduling of any number of tick exit events.
When reached a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SCHEDULED_TICK&lt;/code&gt; exit event is returned by the simulator.
The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scheduleTickExitFromCurrent&lt;/code&gt; function schedules the exit event N ticks in the future, with N being provided by the user.
The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scheduleTickExitAbsolute&lt;/code&gt; function allows for the scheduling at a specific simulation tick (e.g., at Tick 1,000).&lt;/p&gt;

&lt;p&gt;Below is a simple simulation showing the scheduling of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SCHEDULED_TICK&lt;/code&gt; exit events at 100, 1000, 10000, and 100000 ticks in the future.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.resources.resource&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Resource&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.isas&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ISA&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.components.memory&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SingleChannelDDR3_1600&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.components.boards.simple_board&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SimpleBoard&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.components.cachehierarchies.classic.no_cache&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NoCache&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.components.processors.simple_processor&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SimpleProcessor&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.components.processors.cpu_types&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CPUTypes&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.simulate.simulator&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Simulator&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.simulate.exit_event&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ExitEvent&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;m5&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SimpleBoard&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;clk_freq&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;3GHz&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;processor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SimpleProcessor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;cpu_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CPUTypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TIMING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;isa&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ISA&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X86&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;num_cores&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;memory&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SingleChannelDDR3_1600&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;cache_hierarchy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NoCache&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_se_binary_workload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Resource&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;x86-hello64-static&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;scheduled_tick_generator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Exiting at: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;curTick&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;yield&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;simulator&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Simulator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;on_exit_event&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ExitEvent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SCHEDULED_TICK&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;scheduled_tick_generator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()},&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;m5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;scheduleTickExitFromCurrent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;m5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;scheduleTickExitFromCurrent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;m5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;scheduleTickExitFromCurrent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;m5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;scheduleTickExitFromCurrent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;simulator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In this script the Simulator’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;on_exit_event&lt;/code&gt; parameter is utilized to handle the exit event by printing the tick number at exit, then continuing the simulation.
The output of this simulation will be:&lt;/p&gt;

&lt;div class=&quot;language-shell highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Exiting at: 100
Exiting at: 1000
Exiting at: 10000
Exiting at: 100000
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
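
&lt;p&gt;The difference between relative and absolute scheduling can be sketched with a small pure-Python model. This is an illustrative toy and not gem5’s implementation: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ToyTickScheduler&lt;/code&gt; class and its method names are hypothetical, and it only tracks the ticks at which exits would fire.&lt;/p&gt;

```python
import heapq


class ToyTickScheduler:
    """Hypothetical stand-in for a simulator's queue of tick exit events."""

    def __init__(self):
        self.cur_tick = 0
        self._exits = []  # min-heap of absolute exit ticks

    def schedule_exit_from_current(self, ticks):
        # Relative: the exit fires `ticks` after the current tick.
        heapq.heappush(self._exits, self.cur_tick + ticks)

    def schedule_exit_absolute(self, tick):
        # Absolute: the exit fires at exactly `tick`.
        heapq.heappush(self._exits, tick)

    def run(self):
        # Advance to the next scheduled exit and return its tick,
        # or None when no exits remain.
        if not self._exits:
            return None
        self.cur_tick = heapq.heappop(self._exits)
        return self.cur_tick


# All four exits are scheduled while the current tick is still 0, so the
# relative offsets double as absolute ticks.
sched = ToyTickScheduler()
for offset in (100, 1000, 10000, 100000):
    sched.schedule_exit_from_current(offset)
exit_ticks = [sched.run() for _ in range(4)]
print(exit_ticks)

# Once the simulation has advanced, the two calls diverge.
sched = ToyTickScheduler()
sched.schedule_exit_from_current(100)
first = sched.run()
sched.schedule_exit_from_current(50)  # relative to the current tick
sched.schedule_exit_absolute(120)     # independent of the current tick
second = sched.run()
third = sched.run()
print(first, second, third)
```

&lt;p&gt;In this toy, the relative call made after the first exit is offset from the tick already reached, while the absolute call is not affected by it.&lt;/p&gt;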

&lt;h3 id=&quot;the-riscvmatchedboard&quot;&gt;The RISCVMatchedBoard&lt;/h3&gt;

&lt;p&gt;For some time there has been a desire to distribute SimObjects, stdlib components, and systems with “known-good” properties.
That is, configurations capable of running simulations with reasonable fidelity to their real-world counterparts.
While still early days, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RISCVMatchedBoard&lt;/code&gt; is our first step into this endeavor.&lt;/p&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RISCVMatchedBoard&lt;/code&gt; is based on SiFive’s &lt;a href=&quot;https://www.sifive.com/boards/hifive-unmatched&quot;&gt;HiFive Unmatched board&lt;/a&gt;: a RISC-V, 64-bit Linux development platform with a SiFive Freedom U740 multi-core processor.
Researchers at UC Davis sat down with the HiFive Unmatched board and carefully benchmarked its properties, then translated this to a gem5 design.
The design has been incorporated into the gem5 Standard Library and can be found at “src/python/gem5/prebuilt/riscvmatched/riscvmatched_board.py”.&lt;/p&gt;

&lt;p&gt;Below is an example of using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RISCVMatchedBoard&lt;/code&gt; to run a Full-System simulation.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.prebuilt.riscvmatched.riscvmatched_board&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RISCVMatchedBoard&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.utils.requires&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;requires&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.isas&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ISA&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.simulate.simulator&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Simulator&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.resources.workload&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Workload&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;requires&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;isa_required&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ISA&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RISCV&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RISCVMatchedBoard&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;clk_freq&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;1.2GHz&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;l2_size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;2MB&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;is_fs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;workload&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Workload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;riscv-ubuntu-20.04-boot&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_workload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;workload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;simulator&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Simulator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;simulator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Work is still ongoing to fine-tune the parameters of this board so that it more closely emulates the real-world board’s behavior, and details will be published in the future to highlight our overall success.
We will continue to develop “known-good” systems for gem5 and incorporate them as part of future gem5 releases.&lt;/p&gt;

&lt;h3 id=&quot;simpoints&quot;&gt;SimPoints&lt;/h3&gt;

&lt;p&gt;An API for &lt;a href=&quot;https://doi.org/10.1145/885651.781076&quot;&gt;SimPoints&lt;/a&gt; has been added.
SimPointing is a technique to substantially improve simulation time.
It works by sampling only representative parts of a simulation, then extrapolating statistical data accordingly.
By doing this, the remaining parts of a simulated program can be skipped.&lt;/p&gt;
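
&lt;p&gt;The extrapolation step can be illustrated with a short worked example. This is a minimal sketch that assumes each SimPoint carries a weight (the fraction of the program its cluster represents) and a statistic measured only on that sample. The weights and CPI values are made up for illustration, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;estimate_cpi&lt;/code&gt; is a hypothetical helper, not part of the gem5 API.&lt;/p&gt;

```python
# Each entry: (weight of the cluster the SimPoint represents, measured CPI).
# The numbers are invented for illustration only.
simpoints = [
    (0.50, 1.2),
    (0.30, 2.0),
    (0.20, 0.8),
]


def estimate_cpi(points):
    """Weighted average of per-SimPoint CPI, normalized by total weight."""
    total_weight = sum(weight for weight, _ in points)
    return sum(weight * cpi for weight, cpi in points) / total_weight


estimated = estimate_cpi(simpoints)
print(f"Estimated whole-program CPI: {estimated:.2f}")
```

&lt;p&gt;Only the three sampled regions need detailed simulation here; the whole-program figure is the weighted combination of their results.&lt;/p&gt;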

&lt;p&gt;In combination with gem5 Workloads (see the &lt;a href=&quot;#workloads&quot;&gt;section below&lt;/a&gt; for more information on Workloads) we can distribute binaries with SimPoint information and gem5 Checkpoints.
These can then be executed via the SimPoint API, thus producing faster simulation runs.&lt;/p&gt;

&lt;p&gt;Examples of using SimPoints with gem5 can be found in “configs/example/gem5_library/checkpoints/simpoints-se-checkpoint.py” and “configs/example/gem5_library/checkpoints/simpoints-se-restore.py”.&lt;/p&gt;

&lt;p&gt;We are going to continue to expand and fine-tune this API.
Of note, we are going to incorporate &lt;a href=&quot;https://doi.org/10.1109/HPCA53966.2022.00051&quot;&gt;LoopPoints&lt;/a&gt; into the gem5 framework, which will allow for SimPoints to work in multi-core simulations (a current limitation of the SimPoints framework).&lt;/p&gt;

&lt;h3 id=&quot;workloads&quot;&gt;Workloads&lt;/h3&gt;

&lt;p&gt;As an expansion of the gem5-resources infrastructure, the concept of a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Workload&lt;/code&gt; has been introduced.
The gem5-resources infrastructure has, for several major releases, provided resources.
As an example:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.resources.resource&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Resource&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;image&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Resource&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;x86-npb&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;kernel&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Resource&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;x86-linux-kernel-5.4.49&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;binary&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Resource&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;x86-print-this&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_se_binary_workload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;binary&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;binary&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;arguments&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;hello&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1500&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here three resources are requested.
“x86-npb” is an X86 disk image containing the NAS Parallel Benchmark suite built atop the Ubuntu operating system, “x86-linux-kernel-5.4.49” is the v5.4.49 Linux Kernel, and “x86-print-this” is a binary which accepts two arguments: a string to be printed, and the number of times to print it.
The board is then set to run the “x86-print-this” binary with the arguments “hello” and “1500”.&lt;/p&gt;

&lt;p&gt;While powerful, obtaining resources in this manner has some limitations.
First of all, running a simulation may require multiple resources to be maintained.
Some resources will almost always require other resources to run.
For example, our “x86-npb” disk image resource is useless without a kernel.&lt;/p&gt;

&lt;p&gt;Beyond these obvious couplings, a resource may also require particular parameters to be passed to be useful.
For example, the “x86-npb” disk image contains a suite of benchmark applications, but specific command line parameters must be passed to specify which benchmark is to be run and with what input.
In an effort to simplify usage of gem5, we want users to simply specify what they want their simulated system to run.
For example, “x86-npb-FS-input-A”.&lt;/p&gt;

&lt;p&gt;The solution to these problems is Workloads.
Workloads allow for the bundling of resources and any input parameters.
Users need only specify the workload they wish to run and the gem5 Standard Library, interfacing with the gem5-resources infrastructure, will set up the simulation to run correctly.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.prebuilt.demo.x86_demo_board&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X86DemoBoard&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.resources.workload&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Workload&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.simulate.simulator&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Simulator&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X86DemoBoard&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_workload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Workload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;x86-ubuntu-18.04-boot&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;simulator&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Simulator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;simulator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;The example above uses the “x86-ubuntu-18.04-boot” workload.
This workload will use the “x86-linux-kernel-5.4.49” resource for the simulation kernel and the “x86-ubuntu-18.04-img” resource for the disk image.
Upon boot completion the simulation will exit.&lt;/p&gt;
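
&lt;p&gt;The bundling idea can be sketched in plain Python. The record layout below (a board function name plus named resources and extra parameters) is an assumption made for illustration and may not match the real gem5-resources schema. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TOY_WORKLOADS&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ToyBoard&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;apply_workload&lt;/code&gt; are hypothetical names, and resource names are passed as plain strings rather than downloaded resources.&lt;/p&gt;

```python
# Assumed, simplified record for one workload; the real schema may differ.
TOY_WORKLOADS = {
    "x86-ubuntu-18.04-boot": {
        "function": "set_kernel_disk_workload",
        "resources": {
            "kernel": "x86-linux-kernel-5.4.49",
            "disk_image": "x86-ubuntu-18.04-img",
        },
        "additional_params": {},
    },
}


class ToyBoard:
    """Hypothetical board that only records what it was asked to run."""

    def __init__(self):
        self.calls = []

    def set_kernel_disk_workload(self, kernel, disk_image, **params):
        self.calls.append(
            ("set_kernel_disk_workload", kernel, disk_image, params)
        )


def apply_workload(board, name):
    record = TOY_WORKLOADS[name]
    # Look up the board method named by the record, then call it with the
    # bundled resources and any extra parameters.
    setter = getattr(board, record["function"])
    setter(**record["resources"], **record["additional_params"])


board = ToyBoard()
apply_workload(board, "x86-ubuntu-18.04-boot")
print(board.calls)
```

&lt;p&gt;The user supplies a single name; which kernel and disk image to pair, and which board method to call, are all carried by the record.&lt;/p&gt;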

&lt;p&gt;Another example would be the “x86-print-this-15000-with-simpoints” Workload.
This specifies an SE workload running the “x86-print-this” binary resource, passing the parameters “print this” and “15000” with the “x86-print-this-1500-simpoints” simpoint resource.&lt;/p&gt;

&lt;p&gt;At the time of writing the following Workloads are available to use:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;“x86-ubuntu-18.04-boot” : Runs an X86 Ubuntu 18.04 boot.&lt;/li&gt;
  &lt;li&gt;“riscv-ubuntu-20.04-boot” : Runs a RISC-V Ubuntu 20.04 boot.&lt;/li&gt;
  &lt;li&gt;“arm64-ubuntu-20.04-boot” : Runs an ARM-64 Ubuntu 20.04 boot.&lt;/li&gt;
  &lt;li&gt;“x86-print-this-15000-with-simpoints” : Runs the “print-this” binary, printing the “print this” string to the terminal 15000 times, with SimPoints.&lt;/li&gt;
  &lt;li&gt;“x86-print-this-15000-with-simpoints-and-checkpoint” : Runs the “print-this” binary, printing the “print this” string to the terminal 15000 times, with SimPoints and checkpoints.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More Workloads will be added over time to provide the gem5 community with a rich variety of workloads they may plug into their simulations.&lt;/p&gt;

&lt;h3 id=&quot;pre-commit-checks-for-developers&quot;&gt;pre-commit checks for developers&lt;/h3&gt;

&lt;p&gt;In order to help gem5 developers adhere to the gem5 style guide, we have added the &lt;a href=&quot;https://pre-commit.com&quot;&gt;pre-commit framework&lt;/a&gt; to the gem5 repository.
The pre-commit framework manages git hooks for the repository.
We use this to run a suite of checks on code users wish to submit to code-review via a hook run when the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git commit&lt;/code&gt; command is executed.&lt;/p&gt;

&lt;p&gt;In particular, we now utilize the &lt;a href=&quot;https://github.com/psf/black&quot;&gt;black Python code formatter&lt;/a&gt; on all Python code submitted to the gem5 repository.&lt;/p&gt;

&lt;p&gt;In v22.1, when compiling gem5, the user will be asked if they wish to install the pre-commit hook if not already installed.
With consent from the user, the hook will be installed.
If a user wishes to install the hook manually they may do so by running the following in the gem5 repository:&lt;/p&gt;

&lt;div class=&quot;language-shell highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;pip &lt;span class=&quot;nb&quot;&gt;install&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-r&lt;/span&gt; requirements.txt
./util/pre-commit-install.sh
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Once installed, any staged code will be processed when &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git commit&lt;/code&gt; is executed.
Users may also run the pre-commit on any file or directory they wish by using the following command:&lt;/p&gt;

&lt;div class=&quot;language-shell highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;pre-commit run &lt;span class=&quot;nt&quot;&gt;--files&lt;/span&gt; path/to/file/or/directory
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;We strongly advise users to install the pre-commit hooks&lt;/strong&gt;.
Our code-review pre-submit CI tests have been updated with the pre-commit tests.
If they fail, users will be unable to incorporate their patches to the code-base until the code is refactored so the pre-commit tests pass.&lt;/p&gt;

&lt;h3 id=&quot;gpu-full-system-mode&quot;&gt;GPU Full-System mode&lt;/h3&gt;

&lt;p&gt;In v21.2 we introduced GPU support with Syscall-Emulation (SE) mode and in v22.0 we laid the early framework for Full-System (FS) support.
In v22.1 we’re happy to announce substantially improved support for GPU simulation in FS mode.&lt;/p&gt;

&lt;p&gt;Until v22.1, SE mode was preferred due to its relative stability.
However, GPU SE simulation needed the user to have a very specific environment on their host system, primarily due to the GPU simulator requiring a very specific ROCm software stack for dynamic linking to workloads.
A Docker container was created to provide this environment; this, naturally, required all simulations to be carried out within the container.&lt;/p&gt;

&lt;p&gt;The full incorporation of FS mode for GPU simulation removes all host requirements and all SE-mode simulations may be run as an FS mode simulation.&lt;/p&gt;

&lt;p&gt;The GPU FS mode has also improved simulation speed by functionally simulating memory copies, and provides an easier update path for gem5 developers.&lt;/p&gt;

&lt;p&gt;For the time being, we strongly advise users to run GPU FS mode on an X86 host using KVM mode to skip boot and other irrelevant simulation tasks.
GPU simulation is very resource intensive, so care should be taken not to simulate unnecessary code.&lt;/p&gt;

&lt;h2 id=&quot;community-affairs&quot;&gt;Community Affairs&lt;/h2&gt;

&lt;p&gt;In the summer of 2022 we held a gem5 Bootcamp at UC Davis.
The first of its kind, the Bootcamp was set up to give early-stage computer architecture researchers an opportunity to learn gem5 in an intensive 5-day course.
With over 80 applicants we selected 50 due to venue constraints and held the Bootcamp in July, hosting all attendees in UC Davis accommodation.&lt;/p&gt;

&lt;p&gt;The course was carefully designed to take the researchers, most of whom were 1st year PhD students, from a very basic understanding of gem5 through to complex tasks such as adding ISA instructions, developing models, learning about various differences in gem5 CPU models, accelerating simulations, and the gem5 GPU model.
Leveraging expertise at UC Davis, we were also able to give attendees tailored advice on how to use gem5 for their particular research agendas in special 1-on-1 sessions.
In a survey taken after the event, it was found that 95% of attendees “strongly agreed” that the Bootcamp was valuable to them in getting started with gem5 and 100% of attendees were “more likely” or “much more likely” to use gem5 in their research.&lt;/p&gt;

&lt;p&gt;We have made efforts to archive the events &lt;a href=&quot;https://youtube.com/playlist?list=PL_hVbFs_loVSaSDPr1RJXP5RRFWjBMqq3&quot;&gt;via our YouTube channel&lt;/a&gt; and have encapsulated teaching materials in the &lt;a href=&quot;https://gem5bootcamp.github.io/gem5-bootcamp-env/&quot;&gt;Bootcamp website&lt;/a&gt;.
We encourage users to utilize these resources for learning gem5.&lt;/p&gt;

&lt;h3 id=&quot;future-events&quot;&gt;Future events&lt;/h3&gt;

&lt;p&gt;In February 2023 we’ll be holding a &lt;a href=&quot;https://www.gem5.org/events/hpca-2023&quot;&gt;gem5 Tutorial at HPCA&lt;/a&gt;.
This tutorial will give attendees a 3-hour crash course in using gem5.
This will include emphasis on new gem5 features, such as those incorporated into the v22.1 release.
Based on feedback from previous tutorials, we will also include a short session on using the GPU FS model in gem5.&lt;/p&gt;

&lt;p&gt;In July (10th to the 14th), UC Davis will be hosting the 2nd gem5 Bootcamp.
With similar goals to the Bootcamp held in 2022, this event will give an intensive 5-day course for early-career computer architecture researchers, particularly those in the first two years of a PhD programme.
Please join the &lt;a href=&quot;https://www.gem5.org/ask-a-question/&quot;&gt;gem5 announce mailing list&lt;/a&gt; or keep an eye on our &lt;a href=&quot;https://www.gem5.org/events&quot;&gt;gem5 events page&lt;/a&gt; for upcoming information on registering interest for this event.&lt;/p&gt;
</description>
        <pubDate>Fri, 30 Dec 2022 00:00:00 +0000</pubDate>
        <link>https://www.gem5.org//project/2022/12/30/gem5-22-1.html</link>
        <guid isPermaLink="true">https://www.gem5.org//project/2022/12/30/gem5-22-1.html</guid>
        
        
        <category>project</category>
        
      </item>
    
      <item>
        <title>gem5-22.0 Released!</title>
        <description>&lt;p&gt;First of all, thank you to all of the contributors who made this another great gem5 release!
We’ll be talking about this release and many other cool things that have happened in the gem5 community over the past few years at the &lt;a href=&quot;https://www.gem5.org/events/isca-2022#the-4th-gem5-users-workshop&quot;&gt;gem5 workshop with ISCA 2022&lt;/a&gt;.
You can find a livestream and recording &lt;a href=&quot;https://www.youtube.com/channel/UCCpCGEj_835WYmbB0g96lZw&quot;&gt;on our youtube channel&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;gem5 version 22.0 has been slightly delayed, but we have a very strong release!
This release has 660 changes from 48 unique contributors.
While there are not too many big ticket features, the community has done a lot to improve the stability of gem5 and fix bugs over this release.
That said, we have a few cool new features like full system GPU support, a huge number of Arm improvements, and an improved HBM model.&lt;/p&gt;

&lt;p&gt;See below for more details!&lt;/p&gt;

&lt;h2 id=&quot;new-features&quot;&gt;New features&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://gem5.atlassian.net/browse/GEM5-1097&quot;&gt;Arm now models DVM messages for TLBIs and DSBs accurately&lt;/a&gt;. This is implemented in the CHI protocol.&lt;/li&gt;
  &lt;li&gt;EL2/EL3 support on by default in ArmSystem&lt;/li&gt;
  &lt;li&gt;HBM controller which supports pseudo channels&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://gem5.atlassian.net/browse/GEM5-920&quot;&gt;Improved Ruby’s SimpleNetwork routing&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Added x86 bare metal workload and better real mode support&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://gem5.atlassian.net/browse/GEM5-1169&quot;&gt;Added round-robin arbitration when using multiple prefetchers&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://gem5.atlassian.net/browse/GEM5-1138&quot;&gt;KVM Emulation added for ARM GICv3&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Many improvements to the CHI protocol&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;many-risc-v-instructions-added&quot;&gt;Many RISC-V instructions added&lt;/h2&gt;

&lt;p&gt;The following RISC-V instructions have been added to gem5’s RISC-V ISA:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Zba instructions: add.uw, sh1add, sh1add.uw, sh2add, sh2add.uw, sh3add, sh3add.uw, slli.uw&lt;/li&gt;
  &lt;li&gt;Zbb instructions: andn, orn, xnor, clz, clzw, ctz, ctzw, cpop, cpopw, max, maxu, min, minu, sext.b, sext.h, zext.h, rol, rolw, ror, rori, roriw, rorw, orc.b, rev8&lt;/li&gt;
  &lt;li&gt;Zbc instructions: clmul, clmulh, clmulr&lt;/li&gt;
  &lt;li&gt;Zbs instructions: bclr, bclri, bext, bexti, binv, binvi, bset, bseti&lt;/li&gt;
  &lt;li&gt;Zfh instructions: flh, fsh, fmadd.h, fmsub.h, fnmsub.h, fnmadd.h, fadd.h, fsub.h, fmul.h, fdiv.h, fsqrt.h, fsgnj.h, fsgnjn.h, fsgnjx.h, fmin.h, fmax.h, fcvt.s.h, fcvt.h.s, fcvt.d.h, fcvt.h.d, fcvt.w.h, fcvt.h.w, fcvt.wu.h, fcvt.h.wu&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;improvements-to-the-stdlib-automatic-resource-downloader&quot;&gt;Improvements to the stdlib automatic resource downloader&lt;/h3&gt;

&lt;p&gt;The gem5 standard library’s downloader has been re-engineered to more efficiently obtain the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;resources.json&lt;/code&gt; file.
It is now cached instead of retrieved on each resource retrieval.&lt;/p&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;resources.json&lt;/code&gt; file has been moved to a more permanent URL at &lt;a href=&quot;http://resources.gem5.org/resources.json&quot;&gt;http://resources.gem5.org/resources.json&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Tests have also been added to ensure the resources module continues to function correctly.&lt;/p&gt;

&lt;h3 id=&quot;gem5-in-systemc-support-revamped&quot;&gt;gem5 in SystemC support revamped&lt;/h3&gt;

&lt;p&gt;gem5 in SystemC support has been revamped to accommodate new research needs.
These changes include stability improvements and bug fixes.
The gem5 testing suite has also been expanded to include gem5 in SystemC tests.&lt;/p&gt;

&lt;h3 id=&quot;improved-gpu-support&quot;&gt;Improved GPU support&lt;/h3&gt;

&lt;p&gt;Users may now simulate an AMD GPU device in full system mode using the ROCm 4.2 compute stack.
Until v21.2, gem5 only supported GPU simulation in Syscall-Emulation mode with ROCm 4.0.
See &lt;a href=&quot;https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/gpu-fs/&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/gpu-fs/README.md&lt;/code&gt;&lt;/a&gt; in gem5-resources for details, and &lt;a href=&quot;https://gem5.googlesource.com/public/gem5/+/refs/tags/v22.0.0.0/configs/example/gpufs/&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;configs/example/gpufs/&lt;/code&gt;&lt;/a&gt; for example scripts which run GPU full system simulations.&lt;/p&gt;

&lt;p&gt;A &lt;a href=&quot;https://gem5-review.googlesource.com/c/public/gem5/+/59272&quot;&gt;GPU Ruby random tester has been added&lt;/a&gt; to help validate the correctness of the CPU and GPU Ruby coherence protocols as part of every kokoro check-in.
This helps validate the correctness of the protocols before new changes are checked in.
Currently the tester focuses on the protocols used with the GPU, but the ideas are extensible to other protocols.
The work is based on “Autonomous Data-Race-Free GPU Testing”, IISWC 2019, Tuan Ta, Xianwei Zhang, Anthony Gutierrez, and Bradford M. Beckmann.&lt;/p&gt;

&lt;h3 id=&quot;an-arm-board-has-been-added-to-the-gem5-standard-library&quot;&gt;An Arm board has been added to the gem5 Standard Library&lt;/h3&gt;

&lt;p&gt;Via &lt;a href=&quot;https://gem5-review.googlesource.com/c/public/gem5/+/58910&quot;&gt;this change&lt;/a&gt;, an ARM Board, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ArmBoard&lt;/code&gt;, has been added to the gem5 standard library.
This allows for an ARM system to be run using the gem5 stdlib components.&lt;/p&gt;

&lt;p&gt;An example gem5 configuration script using this board can be found in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;configs/example/gem5_library/arm-ubuntu-boot-exit.py&lt;/code&gt;.&lt;/p&gt;

&lt;h3 id=&quot;createaddrranges-now-supports-numa-configurations&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;createAddrRanges&lt;/code&gt; now supports NUMA configurations&lt;/h3&gt;

&lt;p&gt;When the system is configured for NUMA, it has multiple memory ranges, and each memory range is mapped to a corresponding NUMA node. To support this, the change enables &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;createAddrRanges&lt;/code&gt; to map address ranges to only the given HNFs.&lt;/p&gt;

&lt;p&gt;Jira ticket here: https://gem5.atlassian.net/browse/GEM5-1187.&lt;/p&gt;

&lt;h2 id=&quot;api-user-facing-changes&quot;&gt;API (user-facing) changes&lt;/h2&gt;

&lt;h3 id=&quot;cpu-model-types-are-no-longer-simply-the-model-name-but-they-are-specialized-for-each-isa&quot;&gt;CPU model types are no longer simply the model name, but they are specialized for each ISA&lt;/h3&gt;

&lt;p&gt;For instance, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;O3CPU&lt;/code&gt; is now the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;X86O3CPU&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ArmO3CPU&lt;/code&gt;, etc.
This requires a number of changes if you have your own CPU models.
See https://gem5-review.googlesource.com/c/public/gem5/+/52490 for details.&lt;/p&gt;

&lt;p&gt;Additionally, this requires changes in any configuration script which inherits from the old CPU types.&lt;/p&gt;

&lt;p&gt;In many cases, if only a single ISA is compiled, the old name will still work.
However, this is not guaranteed.&lt;/p&gt;

&lt;p&gt;Finally, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CPU_MODELS&lt;/code&gt; is no longer a parameter in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build_opts/&lt;/code&gt;.
Now, if you want to compile a CPU model for a particular ISA you will have to add a new file for the CPU model in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;arch/&lt;/code&gt; directory.&lt;/p&gt;
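
&lt;p&gt;As a rough sketch of what this rename means for a configuration script, the fragment below inherits from the ISA-specific out-of-order model. It assumes a gem5 22.0 binary built with the X86 ISA and is a config fragment rather than standalone Python; the class name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TunedO3CPU&lt;/code&gt; and the chosen widths are hypothetical and only illustrate the pattern.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Config fragment: run it with a gem5 22.0 binary built for X86 (assumption).
from m5.objects import X86O3CPU

# Scripts that used to inherit from the generic O3 model name
# now name the ISA-specific class explicitly.
class TunedO3CPU(X86O3CPU):  # hypothetical name, for illustration only
    fetchWidth = 4
    decodeWidth = 4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For an Arm build, the same pattern would presumably start from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ArmO3CPU&lt;/code&gt; instead.&lt;/p&gt;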

&lt;h3 id=&quot;many-changes-in-the-cpu-and-isa-apis&quot;&gt;Many changes in the CPU and ISA APIs&lt;/h3&gt;

&lt;p&gt;If you have any specialized CPU models or any ISAs which are not in the mainline, expect many changes when rebasing on this release.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;No longer use read/setIntReg (e.g., see https://gem5-review.googlesource.com/c/public/gem5/+/49766)&lt;/li&gt;
  &lt;li&gt;InvalidRegClass has changed (e.g., see https://gem5-review.googlesource.com/c/public/gem5/+/49745)&lt;/li&gt;
  &lt;li&gt;All of the register classes have changed (e.g., see https://gem5-review.googlesource.com/c/public/gem5/+/49764/)&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;initiateSpecialMemCmd&lt;/code&gt; renamed to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;initiateMemMgmtCmd&lt;/code&gt; to generalize to other commands beyond HTM (e.g., DVM/TLBI)&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OperandDesc&lt;/code&gt; class added (e.g., see https://gem5-review.googlesource.com/c/public/gem5/+/49731)&lt;/li&gt;
  &lt;li&gt;Many cases of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TheISA&lt;/code&gt; have been removed&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;bug-fixes&quot;&gt;Bug Fixes&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://gem5-review.googlesource.com/c/public/gem5/+/58209&quot;&gt;Fixed RISC-V call/ret instruction decoding&lt;/a&gt;. The fix adds &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IsReturn&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IsCall&lt;/code&gt; flags for RISC-V jump instructions by defining a new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JumpConstructor&lt;/code&gt; in “standard.isa”. Jira Ticket here: https://gem5.atlassian.net/browse/GEM5-1139.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://gem5-review.googlesource.com/c/public/gem5/+/55744&quot;&gt;Fixed x86 Read-Modify-Write behavior in multiple timing cores with classic caches&lt;/a&gt;. Jira Ticket here: https://gem5.atlassian.net/browse/GEM5-1105.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://gem5-review.googlesource.com/c/public/gem5/+/58649&quot;&gt;The circular buffer for the O3 LSQ has been fixed&lt;/a&gt;. This issue affected running the O3 CPU with large workloads. Jira Ticket here: https://gem5.atlassian.net/browse/GEM5-1203.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://gem5-review.googlesource.com/c/public/gem5/+/55663&quot;&gt;Removed “memory-leak”-like error in RISC-V lr/sc implementation&lt;/a&gt;. Jira issue here: https://gem5.atlassian.net/browse/GEM5-1170.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://gem5-review.googlesource.com/c/public/gem5/+/56811&quot;&gt;Resolved issues with Ruby’s memtest&lt;/a&gt;. In gem5 v21.2, if the size of the address range was smaller than the maximum number of outstanding requests allowed downstream, the tester would get stuck trying to find a unique address. This has been resolved.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;build-related-changes&quot;&gt;Build-related changes&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Variables in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;env&lt;/code&gt; in the SConscript files must now be accessed through &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;env[&apos;CONF&apos;]&lt;/code&gt;. Anywhere that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;env[&apos;&amp;lt;VARIABLE&amp;gt;&apos;]&lt;/code&gt; appeared should now be &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;env[&apos;CONF&apos;][&apos;&amp;lt;VARIABLE&amp;gt;&apos;]&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Internal build files are now in a per-target &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gem5.build&lt;/code&gt; directory&lt;/li&gt;
  &lt;li&gt;All build variables are per-target and there are no longer any shared variables.&lt;/li&gt;
&lt;/ul&gt;
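
&lt;p&gt;To make the first item concrete, here is a minimal sketch of an SConscript fragment before and after the change. It is evaluated by gem5&apos;s SCons build rather than run as standalone Python; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TARGET_ISA&lt;/code&gt; is used only as an example of a build variable, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;my_device.cc&lt;/code&gt; is a hypothetical source file, so check the names against your own tree.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# SConscript fragment (assumed variable and file names, see above).
Import('*')

# Before gem5 22.0 the variable was read directly from env:
#     if env['TARGET_ISA'] == 'riscv':
# From gem5 22.0 it is read through env['CONF']:
if env['CONF']['TARGET_ISA'] == 'riscv':
    Source('my_device.cc')  # hypothetical source file
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;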

&lt;h2 id=&quot;other-changes&quot;&gt;Other changes&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;New bootloader is required for Arm VExpress_GEM5_Foundation platform. See https://gem5.atlassian.net/browse/GEM5-1222 for details.&lt;/li&gt;
  &lt;li&gt;The MemCtrl interface has been updated to use more inheritance to make extending it to other memory types (e.g., HBM pseudo channels) easier.&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Sat, 18 Jun 2022 00:00:00 +0000</pubDate>
        <link>https://www.gem5.org//project/2022/06/18/gem5-22-0.html</link>
        <guid isPermaLink="true">https://www.gem5.org//project/2022/06/18/gem5-22-0.html</guid>
        
        
        <category>project</category>
        
      </item>
    
      <item>
        <title>The Case for Using Guix to Solve the gem5 Packaging Problem</title>
        <description>&lt;p&gt;&lt;sup&gt;1&lt;/sup&gt; School of Electrical and Computer Engineering, Cornell University, Ithaca, NY &lt;br /&gt;
&lt;sup&gt;2&lt;/sup&gt; The University of Tennessee Health Science Center, Memphis, TN &lt;br /&gt;
&lt;sup&gt;3&lt;/sup&gt; ElenQ Technology&lt;/p&gt;

&lt;p&gt;This post will first describe the gem5 packaging problem before making
the case for using Guix, a mature functional cross-platform package
manager, for building, distributing, installing, and managing the gem5
ecosystem.&lt;/p&gt;

&lt;h2 id=&quot;the-gem5-packaging-problem&quot;&gt;The gem5 Packaging Problem&lt;/h2&gt;

&lt;p&gt;The gem5 simulator has become the de facto standard for cycle-level
simulation, and gem5 now supports evaluating a diverse set of workloads.
Unfortunately, both the gem5 simulator and gem5 workloads still lack a
compelling software packaging solution to simplify building,
distributing, installing, and managing the gem5 ecosystem. In this
section, we describe the gem5 simulator and workload packaging problems
before sketching an ideal software packaging solution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The gem5 Simulator Packaging Problem –&lt;/strong&gt; The gem5 simulator is a
complex piece of software with numerous build- and run-time dependencies
including a modern C++ compiler, SCons, Boost, and Python. To mitigate
dependency issues, the gem5 installation instructions strongly recommend
using specific versions of Ubuntu. The gem5 simulator has numerous
compile-time options to experiment with different ISAs, coherence
protocols, and/or accelerators, and this in turn makes providing a single
precompiled binary difficult. Even so, the gem5 community does point new
researchers to a small number of precompiled Docker images. Given these
challenges, the gem5 community has chosen not to support any kind of
packaging for the gem5 simulator. Almost all researchers individually
manage dependencies and recompile the gem5 simulator from source.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The gem5 Workload Packaging Problem –&lt;/strong&gt; Building workloads to run on
the gem5 simulator using syscall emulation can be just as challenging as
building the gem5 simulator itself. These workloads must be
cross-compiled, meaning researchers must build a complete
cross-compilation toolchain for each target architecture. Researchers
might also need to build an emulator (e.g., QEMU) to test these workloads
before moving to cycle-level simulation. Researchers must ensure the
workloads only use static libraries and do not call any unsupported
syscalls. Given these challenges, the gem5 community directly included
precompiled binaries as part of the gem5 source distribution for many
years. More recently, the community has migrated to pre-compiled binaries
as part of the gem5 resources project.&lt;/p&gt;

&lt;p&gt;An ideal software packaging solution would be: &lt;em&gt;reproducible –&lt;/em&gt; easily
duplicate precisely specified development environments; &lt;em&gt;transparent –&lt;/em&gt;
understand entire development environment including exact build
configuration and version of every dependency; &lt;em&gt;composable –&lt;/em&gt; easily
integrate the gem5 simulator and workload into standard development
environments without needing cumbersome, heavyweight containers; &lt;em&gt;fast –&lt;/em&gt;
leverage precompiled packages; &lt;em&gt;distribution agnostic –&lt;/em&gt; enable
researchers to use the Linux distribution of their choice; &lt;em&gt;unified –&lt;/em&gt;
same packaging solution can be used for both the gem5 simulator and
workloads; &lt;em&gt;portable –&lt;/em&gt; easily build gem5 workloads for native execution
and/or target multiple ISAs for cycle-level simulation without manually
managing multiple cross-compilation toolchains; and &lt;em&gt;flexible –&lt;/em&gt; easily
switch between development environments, modify existing packages, add
new packages, produce reproducible workflows, and/or generate containers.&lt;/p&gt;

&lt;h2 id=&quot;using-guix-for-gem5&quot;&gt;Using Guix for gem5&lt;/h2&gt;

&lt;p&gt;Guix is a mature functional cross-platform package manager with hundreds
of committers and over 20K packages. In this section, we briefly describe
our on-going efforts to use Guix for packaging both the gem5 simulator
and gem5 workloads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Using Guix to Package gem5 Simulators –&lt;/strong&gt; We have developed a
proof-of-concept Guix package for the gem5 simulator that handles all
build- and run-time dependencies and installation&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. The package builds
gem5 for six ISAs, ensures builds are reproducible by eliminating
non-deterministic use of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__DATE__&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__TIME__&lt;/code&gt;, patches the build
environment to work with SCons, and performs a well-structured install of
the gem5 simulator binaries and example configurations. The package is:
&lt;em&gt;reproducible&lt;/em&gt; through the use of isolated and deterministic
environments for experimenting with the gem5 simulator; &lt;em&gt;transparent&lt;/em&gt;,
since the complete recursive dependency graph is precisely specified;
&lt;em&gt;composable&lt;/em&gt;, since the gem5 simulator is installed just like any
other tool without the need for a container; and &lt;em&gt;distribution
agnostic&lt;/em&gt;, since the package can be installed on Ubuntu, RHEL, SUSE, or
even Guix System which is an entire distribution based exclusively on
Guix. Derived packages could enable easily providing packages for
different compile-time configurations. When merged upstream into the main
Guix package repository, this package will be part of the Guix build farm
enabling binary package substitution for &lt;em&gt;fast&lt;/em&gt; installation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Using Guix to Package gem5 Workloads –&lt;/strong&gt; Guix already includes
packages for QEMU and cross-compilation toolchains for commercial ISAs.
We are contributing to the RISC-V port of Guix including packaging the
RISC-V cross-compilation toolchain. We identified a simple Smith-Waterman
sequence alignment Guix package as an interesting gem5 workload and
developed a derived Guix package that patches the standard build process
to produce a statically linked binary&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;. In addition to being
&lt;em&gt;reproducible&lt;/em&gt;, &lt;em&gt;transparent&lt;/em&gt;, &lt;em&gt;composable&lt;/em&gt;, and &lt;em&gt;distribution agnostic&lt;/em&gt;,
using Guix is also &lt;em&gt;portable&lt;/em&gt;, since we can easily cross-compile the
package for ARM or RISC-V, and &lt;em&gt;flexible&lt;/em&gt;, since it requires only eight
lines of Guile code to create a new derived package supporting static
compilation.&lt;/p&gt;

&lt;h2 id=&quot;case-study&quot;&gt;Case Study&lt;/h2&gt;

&lt;p&gt;The attached appendix describes step-by-step commands for a case study
that: installs QEMU, gem5, and cross-compilers for x86, ARM, and RISC-V
in an isolated environment; cross-compiles and runs Smith-Waterman for
all three ISAs; and runs this benchmark on the in-order and out-of-order
timing models. To reproduce this case study, a researcher first must
download and install Guix&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;h3 id=&quot;add-new-channel&quot;&gt;Add new channel&lt;/h3&gt;

&lt;p&gt;By default, Guix includes its own main package repository, but users can
also create their own “channels” that include third-party packages. We
need to add such a channel to get access to the gem5 simulator package
and the derived Smith-Waterman package.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;% mkdir -p $HOME/.config/guix
% cd $HOME/.config/guix
% cat &amp;gt; channels.scm \
&amp;lt;&amp;lt;&apos;END&apos;
(use-modules (guix ci))
(list
 (channel
  (name &apos;gn-bioinformatics)
  (url (string-append &quot;https://git.genenetwork.org/&quot;
         &quot;guix-bioinformatics/guix-bioinformatics.git&quot;))
  (branch &quot;master&quot;))
 (channel-with-substitutes-available
  %default-guix-channel &quot;https://ci.guix.gnu.org&quot;))
END
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;update-guix-and-install-smith-waterman&quot;&gt;Update Guix and install Smith-Waterman&lt;/h3&gt;

&lt;p&gt;We use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;guix pull&lt;/code&gt; to download all of the package descriptions from the
main package repository along with any third-party packages. We then
install the default Smith-Waterman package and run it natively. Here we
use the default “profile”, but we could also install this package in a
dedicated Guix “profile”, similar to Python’s virtual environment.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;% mkdir -p $HOME/tmp/misc/test-guix
% cd $HOME/tmp/misc/test-guix
% guix pull
% guix install smithwaterman
% smithwaterman -p TGATTGTACCAAA TGATCATGTACCA
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;install-qemu-and-gem5&quot;&gt;Install QEMU and gem5&lt;/h3&gt;

&lt;p&gt;We now install both the QEMU and gem5 packages for all architectures in
the same profile.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;% guix install qemu
% guix install gem5
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;build-and-run-smith-waterman-for-x86_64-isa&quot;&gt;Build and run Smith-Waterman for x86_64 ISA&lt;/h3&gt;

&lt;p&gt;We use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;guix build --target=x86_64-linux-gnu&lt;/code&gt; to cross-compile
most Guix packages for x86_64. Here we cross-compile the derived package
for Smith-Waterman which produces a statically linked executable that we
then run on both QEMU and gem5.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;% cd $HOME/tmp/misc/test-guix
% DIR=$(guix build \
         --target=x86_64-linux-gnu smithwaterman-static)
% ln -sf $DIR/bin/smithwaterman sw-x86_64
% qemu-x86_64 ./sw-x86_64 -p TGATTGTACCAAA TGATCATGTACCA
% gem5-x86.opt \
    $GUIX_PROFILE/share/gem5/configs/example/se.py \
    --cmd=./sw-x86_64 \
    --options=&quot;-p TGATTGTACCAAA TGATCATGTACCA&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;build-and-run-smith-waterman-for-arm-isa&quot;&gt;Build and run Smith-Waterman for ARM ISA&lt;/h3&gt;

&lt;p&gt;We use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;guix build --target=aarch64-linux-gnu&lt;/code&gt; to cross-compile
most Guix packages for ARM. Here we cross-compile the derived package for
Smith-Waterman which produces a statically linked executable that we then
run on both QEMU and gem5.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;% cd $HOME/tmp/misc/test-guix
% DIR=$(guix build \
         --target=aarch64-linux-gnu smithwaterman-static)
% ln -sf $DIR/bin/smithwaterman sw-aarch64
% qemu-aarch64 ./sw-aarch64 -p TGATTGTACCAAA TGATCATGTACCA
% gem5-arm.opt \
    $GUIX_PROFILE/share/gem5/configs/example/se.py \
    --cmd=./sw-aarch64 \
    --options=&quot;-p TGATTGTACCAAA TGATCATGTACCA&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;build-and-run-smith-waterman-for-risc-v-isa&quot;&gt;Build and run Smith-Waterman for RISC-V ISA&lt;/h3&gt;

&lt;p&gt;We use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;guix build --target=riscv64-linux-gnu&lt;/code&gt; to cross-compile
most Guix packages for RISC-V. Here we cross-compile the derived package for
Smith-Waterman which produces a statically linked executable that we then
run on both QEMU and gem5.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;% cd $HOME/tmp/misc/test-guix
% DIR=$(guix build \
         --target=riscv64-linux-gnu smithwaterman-static)
% ln -sf $DIR/bin/smithwaterman sw-riscv64
% qemu-riscv64 ./sw-riscv64 -p TGATTGTACCAAA TGATCATGTACCA
% gem5-riscv.opt \
    $GUIX_PROFILE/share/gem5/configs/example/se.py \
    --cmd=./sw-riscv64 \
    --options=&quot;-p TGATTGTACCAAA TGATCATGTACCA&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;run-experiment-on-risc-v-isa&quot;&gt;Run experiment on RISC-V ISA&lt;/h3&gt;

&lt;p&gt;Once we have used Guix to install the gem5 simulator and the gem5
workload packages, we can easily perform a computer architecture research
experiment. Here we compare the performance of running Smith-Waterman on
an in-order vs. out-of-order RISC-V processor model.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;% cd $HOME/tmp/misc/test-guix

% gem5-riscv.opt \
    --outdir=m5out-minor-sw \
    $GUIX_PROFILE/share/gem5/configs/example/se.py \
    --cmd=./sw-riscv64 \
    --options=&quot;-p TGATTGTACCAAA TGATCATGTACCA&quot; \
    --cpu-type=MinorCPU --ruby

% gem5-riscv.opt \
    --outdir=m5out-o3-sw \
    $GUIX_PROFILE/share/gem5/configs/example/se.py \
    --cmd=./sw-riscv64 \
    --options=&quot;-p TGATTGTACCAAA TGATCATGTACCA&quot; \
    --cpu-type=O3CPU --ruby

% grep system.cpu.numCycles m5out-minor-sw/stats.txt
% grep system.cpu.numCycles m5out-o3-sw/stats.txt
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
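
&lt;p&gt;The two &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;grep&lt;/code&gt; commands print the raw cycle counts. As a minimal sketch of turning them into a single number, the Python helper below reads both statistics and returns their ratio. It assumes each line of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stats.txt&lt;/code&gt; has the form name, value, optional comment; the function names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;read_stat&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cycle_ratio&lt;/code&gt; are our own and are not part of gem5.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;def read_stat(lines, name):
    # Return the first value recorded for the statistic `name`, or None.
    for line in lines:
        fields = line.split()
        # Assumed line layout: name, value, then an optional comment.
        if fields[:1] == [name] and fields[1:2]:
            return float(fields[1])
    return None


def cycle_ratio(minor_lines, o3_lines, name='system.cpu.numCycles'):
    # In-order cycles divided by out-of-order cycles; a result above 1.0
    # means the out-of-order run finished in fewer cycles.
    return read_stat(minor_lines, name) / read_stat(o3_lines, name)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With the output directories used above, this could be called as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cycle_ratio(open('m5out-minor-sw/stats.txt'), open('m5out-o3-sw/stats.txt'))&lt;/code&gt;.&lt;/p&gt;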

&lt;h2 id=&quot;acknowledgements&quot;&gt;Acknowledgements&lt;/h2&gt;

&lt;p&gt;This work was supported by NSF PPoSS Award #2118709 and NLNet awards for
GNUMes-RISCV and Guix-Riscv64.&lt;/p&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://git.genenetwork.org/guix-bioinformatics/guix-bioinformatics/src/branch/master/gn/packages/virtualization.scm&quot;&gt;https://git.genenetwork.org/guix-bioinformatics/guix-bioinformatics/src/branch/master/gn/packages/virtualization.scm&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://git.genenetwork.org/guix-bioinformatics/guix-bioinformatics/src/branch/master/gn/packages/static.scm&quot;&gt;https://git.genenetwork.org/guix-bioinformatics/guix-bioinformatics/src/branch/master/gn/packages/static.scm&lt;/a&gt; &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://guix.gnu.org/en/download&quot;&gt;https://guix.gnu.org/en/download&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
        <pubDate>Mon, 23 May 2022 00:00:00 +0000</pubDate>
        <link>https://www.gem5.org//project/2022/05/23/guix.html</link>
        <guid isPermaLink="true">https://www.gem5.org//project/2022/05/23/guix.html</guid>
        
        
        <category>project</category>
        
      </item>
    
      <item>
        <title>Support for LupIO devices in gem5</title>
        <description>&lt;h2 id=&quot;the-case-for-a-comprehensive-and-open-source-collection-of-io-devices&quot;&gt;The case for a comprehensive and open-source collection of I/O devices&lt;/h2&gt;

&lt;p&gt;As all the gem5 users probably know, researchers in computer architecture often
need to build a full hardware system in order to, for example, experiment with a
novel micro-architectural approach. In such a scenario, creating I/O devices for
the system is typically not the main goal but only a necessary step, and
researchers will therefore look for the easiest hardware designs to implement.
Unfortunately, they may still face multiple difficulties in the process.&lt;/p&gt;

&lt;p&gt;First, even if many hardware designs are widely available in existing computing
systems and therefore well-supported by software stacks, their specifications
may still be difficult to implement: e.g., a 16550-compatible UART, an IDE disk
storage, etc. Second, many hardware specifications are proprietary, which forces
researchers to abide by their licensing terms. Last, even if they are able to
find some open-source specifications that are easy to implement, these will only
cover a few of the devices within the designed full system.&lt;/p&gt;

&lt;p&gt;Researchers in systems software face very similar challenges, but from the side
of device drivers. When building novel operating systems, the development of a
few device drivers is also a necessary step that researchers should optimize.
However, developing drivers for typical devices can often be difficult (e.g., a
16550-compatible UART, an IDE disk storage, etc.).&lt;/p&gt;

&lt;p&gt;Our LupIO collection of devices aims to bridge that gap. By providing a
comprehensive and open-source collection of I/O devices that are powerful
enough to build complex multicore systems and yet straightforward to implement,
LupIO can help researchers in computer architecture and systems software perform
exploratory research more easily. Additionally, LupIO can be used as a teaching
asset, as it is accessible to students at the undergraduate and graduate level.&lt;/p&gt;

&lt;h2 id=&quot;overview-of-lupio&quot;&gt;Overview of LupIO&lt;/h2&gt;

&lt;p&gt;LupIO is a comprehensive and open-source collection of education-friendly I/O
devices. This collection defines the interfaces of the most common devices found
in modern RISC-based computers, and makes it possible to build complete systems
using only LupIO devices, even complex symmetric multiprocessor (SMP) systems.&lt;/p&gt;

&lt;p&gt;LupIO includes core devices (such as an interrupt controller or a timer) as well
as general I/O devices (such as a block device, a real-time clock, or a
terminal). LupIO devices are intended to be processor-agnostic, so they should
be usable with any processor architecture supporting memory-mapped devices
(e.g., RISC-V, ARM, MIPS, etc.).&lt;/p&gt;

&lt;p&gt;Each device interface is designed to be simple and clear, with an optimal
balance between features and complexity. The register maps exposed by the
devices are neatly organized by type (e.g., data, control, and status) and
arranged consistently across devices, in order to ease their programmability.
Developing implementations of LupIO devices, as well as corresponding device
drivers, is meant to be straightforward.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/img/lupio-register-maps.svg&quot; alt=&quot;LupIO register maps&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The full specifications of LupIO devices are available at
&lt;a href=&quot;https://gitlab.com/luplab/lupio/lupio-specs&quot;&gt;https://gitlab.com/luplab/lupio/lupio-specs&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;implementation-in-gem5&quot;&gt;Implementation in gem5&lt;/h2&gt;

&lt;p&gt;Following a proof-of-concept implementation in QEMU
(&lt;a href=&quot;https://gitlab.com/luplab/lupio/qemu&quot;&gt;https://gitlab.com/luplab/lupio/qemu&lt;/a&gt;), Jason Lowe-Power and I decided to join
forces and have the LupIO collection be ported to gem5, where it would reach
more of the comparch/systems research community.&lt;/p&gt;

&lt;p&gt;Last summer, we hired two talented undergraduate students from UC Davis, Laura
Hinman and Melissa Jost, to work on this implementation.&lt;/p&gt;

&lt;p&gt;They successfully implemented all eight devices of the LupIO collection
(&lt;a href=&quot;https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.2.0.0/src/dev/lupio/&quot;&gt;https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.2.0.0/src/dev/lupio/&lt;/a&gt;)
and created an example board, called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LupV&lt;/code&gt;, based around a RISC-V processor
(&lt;a href=&quot;https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.2.0.0/src/python/gem5/components/boards/experimental/lupv_board.py&quot;&gt;https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.2.0.0/src/python/gem5/components/boards/experimental/lupv_board.py&lt;/a&gt;).
This experimental board can boot Linux and even has SMP support!
The source for the bootloader/kernel and disk image resources used in this example board can be found &lt;a href=&quot;https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/lupv&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/img/lupio-linux-boot.png&quot; alt=&quot;Linux boot&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;While only a RISC-V based board was created, LupIO should technically be
processor-agnostic. We are currently working to prove that by building other
boards that use other processors and embed only LupIO devices (we are
currently trying an ARM32 processor).&lt;/p&gt;

&lt;p&gt;You can also use any of the LupIO devices individually in your hardware system.
The purely peripheral I/O devices, such as the real-time clock, the terminal,
etc., should work in any MMIO-capable system.&lt;/p&gt;
</description>
        <pubDate>Mon, 07 Feb 2022 00:00:00 +0000</pubDate>
        <link>https://www.gem5.org//project/2022/02/07/lupio.html</link>
        <guid isPermaLink="true">https://www.gem5.org//project/2022/02/07/lupio.html</guid>
        
        
        <category>project</category>
        
      </item>
    
      <item>
        <title>gem5-21.2 Released!</title>
        <description>&lt;p&gt;We are proud to announce version 21.2 of the gem5 project.
In this release we incorporated 790 commits from 33 unique authors, new and regular, from both academia and industry.
We are, as always, thankful for all the time our community puts into maintaining and improving gem5.&lt;/p&gt;

&lt;h2 id=&quot;212-highlights&quot;&gt;21.2 Highlights&lt;/h2&gt;

&lt;h3 id=&quot;enhanced-standard-library&quot;&gt;Enhanced Standard Library&lt;/h3&gt;

&lt;p&gt;Having existed since v21.1 under the now deprecated name “the components library”, the v21.2 release of gem5 moves the gem5 standard library out of alpha.
The purpose of the gem5 standard library is to provide gem5 users a standard set of commonly used components and utilities to aid them in their research.
Our overarching goal with the standard library is to remove “boilerplate” code from gem5 configuration files,
making the 95% of activities that rarely change from simulation to simulation available to users in an “off-the-shelf” manner.
As an example, a user wishing to experiment with the effects of cache sizes can use the gem5 standard library to set up a processor and memory system and test on sensible benchmarks, thus freeing them to focus completely on the impact of cache size changes.&lt;/p&gt;

&lt;p&gt;The gem5 standard library is provided as a Python package which contains the following:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Components&lt;/strong&gt;: A set of Python classes which wrap gem5’s models.
Some of the components are pre-configured to match real hardware (e.g., &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SingleChannelDDR3_1600&lt;/code&gt;) and others are parameterized.
Components can be combined together into &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;boards&lt;/code&gt; which can be simulated.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Resources&lt;/strong&gt;: A set of utilities to obtain and incorporate resources (disk images, applications, kernels, etc.) into gem5 simulations.
Using this module allows you to &lt;em&gt;automatically&lt;/em&gt; download and use many of gem5’s prebuilt resources (e.g., kernels and disk images) from &lt;a href=&quot;https://www.gem5.org/documentation/general_docs/gem5_resources&quot;&gt;gem5-resources&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Simulate&lt;/strong&gt;: Used to interface with gem5’s simulation/run capabilities.
&lt;strong&gt;Note: This package is in beta.
Expect API changes to this package in future releases.
Feedback is appreciated.&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Prebuilt&lt;/strong&gt;: These are fully functioning prebuilt systems (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;boards&lt;/code&gt;) to use directly in gem5 simulations with minimal setup.
This release includes an &lt;a href=&quot;https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.2.0.0/src/python/gem5/prebuilt/demo/x86_demo_board.py&quot;&gt;X86 demo board&lt;/a&gt; and an &lt;a href=&quot;https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.2.0.0/configs/example/gem5_library/x86-ubuntu-run.py&quot;&gt;example of how it may be used&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
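
&lt;p&gt;As a rough illustration of the cache-size experiment mentioned earlier, the sketch below combines components into a board and exposes only the L2 size as a parameter. It is not taken from the gem5 repository: the module paths, class names, and constructor arguments are assumptions based on the v21.2 layout under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src/python/gem5&lt;/code&gt; and may differ in other releases, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build_board&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# NOTE: import paths and arguments are assumptions based on the v21.2 source
# tree; check them against src/python/gem5 for the release you are using.
from gem5.components.boards.simple_board import SimpleBoard
from gem5.components.cachehierarchies.classic.private_l1_private_l2_cache_hierarchy import (
    PrivateL1PrivateL2CacheHierarchy,
)
from gem5.components.memory.single_channel import SingleChannelDDR3_1600
from gem5.components.processors.cpu_types import CPUTypes
from gem5.components.processors.simple_processor import SimpleProcessor


# Hypothetical helper: everything is fixed except the L2 size under study.
def build_board(l2_size):
    cache_hierarchy = PrivateL1PrivateL2CacheHierarchy(
        l1d_size=&quot;32kB&quot;, l1i_size=&quot;32kB&quot;, l2_size=l2_size
    )
    return SimpleBoard(
        clk_freq=&quot;3GHz&quot;,
        processor=SimpleProcessor(cpu_type=CPUTypes.TIMING, num_cores=1),
        memory=SingleChannelDDR3_1600(size=&quot;1GiB&quot;),
        cache_hierarchy=cache_hierarchy,
    )


# One gem5 run simulates one configuration; rerun with another size to compare.
board = build_board(l2_size=&quot;256kB&quot;)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A workload and a simulator would then be attached to the returned board in the same way as in the example scripts shipped with gem5, so that only the cache size changes between runs.&lt;/p&gt;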

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Usage of the gem5 standard library is optional.
It does not change any established gem5 API, or how gem5 configuration scripts may be created.
gem5 configuration scripts that functioned in v21.1 should continue to function in v21.2.
We do, however, hope the gem5 library can aid users in creating simulations, as is the case with all libraries.&lt;/p&gt;

&lt;p&gt;Users can find example configuration scripts that incorporate the gem5 standard library in the gem5 repository’s &lt;a href=&quot;https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.2.0.0/configs/example/gem5_library&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;configs/example/gem5_library&lt;/code&gt;&lt;/a&gt; directory.&lt;/p&gt;

&lt;p&gt;As an example of how simple the gem5 standard library can make running a gem5 simulation, consider the following script:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.prebuilt.demo.x86_demo_board&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X86DemoBoard&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.resources.resource&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Resource&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;gem5.simulate.simulator&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Simulator&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# Here we set up the board. The prebuilt X86DemoBoard allows for Full-System X86
# simulation.
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X86DemoBoard&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# We then set the workload. Here we use the 5.4.49 Linux kernel with an X86
# Ubuntu OS. If these cannot be found locally they will be automatically
# downloaded.
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_kernel_disk_workload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;kernel&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Resource&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;x86-linux-kernel-5.4.49&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;disk_image&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Resource&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;x86-ubuntu-18.04-img&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# We then set up the Simulator and run the simulation.
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;simulator&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Simulator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;simulator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This script can be executed with&lt;/p&gt;

&lt;div class=&quot;language-sh highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;scons build/X86/gem5.opt
./build/X86/gem5.opt &amp;lt;script&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The script will automatically obtain the correct Linux kernel and a disk image containing Ubuntu 18.04 from gem5-resources (if not already present on the host system).
It will then run a full-system X86 simulation to a complete boot of the operating system, then exit.
Prior to the introduction of the gem5 standard library, a user would have to put in considerable effort to build such a simulation (hundreds of lines of Python).&lt;/p&gt;

&lt;p&gt;While we hope we have designed the standard library in an intuitive manner, users may reference the source under &lt;a href=&quot;https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.2.0.0/src/python/gem5&quot;&gt;src/python/gem5&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In the coming month we will be updating the gem5 website with new tutorials and documentation on using the gem5 standard library.&lt;/p&gt;

&lt;h4 id=&quot;future-work-on-the-gem5-standard-library&quot;&gt;Future work on the gem5 standard library&lt;/h4&gt;

&lt;p&gt;Over the next few gem5 releases we will be expanding the standard library to include more components and features.
A major goal of ours is to provide prebuilt components and systems that are proven to be representative of real-world counterparts.&lt;/p&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Simulate&lt;/code&gt; module &lt;a href=&quot;https://gem5.atlassian.net/browse/GEM5-1125&quot;&gt;will be expanded, improved, and moved out of beta state&lt;/a&gt; as its role in the gem5 standard library becomes more clear.&lt;/p&gt;

&lt;p&gt;If you wish to report a bug in the gem5 standard library or have a feature request, please submit it to gem5’s &lt;a href=&quot;https://gem5.atlassian.net/&quot;&gt;Jira site&lt;/a&gt;.
Questions regarding usage of the standard library can be sent to the &lt;a href=&quot;https://lists.gem5.org/postorius/lists/gem5-users.gem5.org/&quot;&gt;gem5-users mailing list&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;lupio-friendly-io-devices-for-gem5&quot;&gt;LupIO: Friendly IO Devices for gem5&lt;/h3&gt;

&lt;p&gt;LupIO devices were developed by &lt;a href=&quot;https://faculty.engineering.ucdavis.edu/porquet&quot;&gt;Prof. Joel Porquet-Lupine&lt;/a&gt; as a set of open-source I/O devices to be used for teaching.
They were designed to model a complete set of I/O devices that are neither too complex to teach in a classroom setting nor too simple to translate to an understanding of real-world devices.
A goal of two undergraduate students at UC Davis, Melissa Jost and Laura Hinman, was to work on incorporating LupIO devices into gem5.
As such the gem5 v21.2 release includes a LupIO real-time clock, a random number generator, a terminal device, a block device, a system controller, a timer device, a programmable interrupt controller, and an inter-processor interrupt controller.&lt;/p&gt;

&lt;p&gt;A more detailed outline of LupIO can be found in Prof. Porquet-Lupine’s paper &lt;a href=&quot;https://luplab.cs.ucdavis.edu/assets/lupio/wcae21-porquet-lupio-paper.pdf&quot;&gt;“LupIO: a collection of education-friendly I/O devices”&lt;/a&gt; and information on the wider LupLab research group can be found on &lt;a href=&quot;https://luplab.cs.ucdavis.edu/&quot;&gt;their website&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Users wishing to try out LupIO devices can find an example script and README file in the &lt;a href=&quot;https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.2.0.0/configs/example/lupv&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;configs/example/lupv&lt;/code&gt;&lt;/a&gt; directory.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; These LupIO devices have been built and tested for RISC-V.
However, there is no reason these couldn’t be modified to work with other ISA targets if required or desired.
We welcome further development by the gem5 community.&lt;/p&gt;

&lt;h3 id=&quot;arm-improvements&quot;&gt;Arm improvements&lt;/h3&gt;

&lt;p&gt;In continued and welcome collaboration with Arm Holdings, improvements to gem5 Arm implementations have been made.
They are:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://gem5.atlassian.net/browse/GEM5-1132&quot;&gt;Improved configurability for Arm architectural extensions&lt;/a&gt;: We have improved how architectural extensions are enabled/disabled for an Arm system.
Rather than working with independent boolean values, we now use a unified &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ArmRelease&lt;/code&gt; object which models the architectural features supported by a FS/SE Arm simulation.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://gem5.atlassian.net/browse/GEM5-1108&quot;&gt;Arm TLB can store partial entries&lt;/a&gt;: It is now possible to configure an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ArmTLB&lt;/code&gt; as a walk cache which stores intermediate PAs obtained during a translation table walk.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://gem5.atlassian.net/browse/GEM5-790&quot;&gt;Implemented a multilevel TLB hierarchy&lt;/a&gt;: Users can now compose/model a customizable multilevel TLB hierarchy in gem5.
The default Arm MMU now has an Instruction L1 TLB, a Data L1 TLB, and a Unified (Instruction + Data) L2 TLB.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://gem5.atlassian.net/browse/GEM5-1121&quot;&gt;Provided an Arm example script for the gem5-SST integration&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;gpu-improvements&quot;&gt;GPU Improvements&lt;/h3&gt;

&lt;p&gt;Continued efforts by, primarily, AMD, Inc. and the University of Wisconsin have improved gem5’s GPU support.
In this release:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Vega support&lt;/strong&gt;: gfx900 (Vega) discrete GPUs are now both supported and tested with &lt;a href=&quot;https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/gpu/&quot;&gt;gem5-resources applications&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Additional GPU applications&lt;/strong&gt;: The &lt;a href=&quot;https://github.com/pannotia/pannotia&quot;&gt;Pannotia graph analytics benchmark suite&lt;/a&gt; has been added to gem5-resources, including Makefiles, READMEs, and sample commands on how to run each application in gem5.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Regression Testing&lt;/strong&gt;: Several GPU applications are now tested as part of the nightly and weekly regressions, which improves test coverage and avoids introducing inadvertent bugs.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Minor updates to the architectural model&lt;/strong&gt;: Small changes and fixes have been made to the HSA queue size (to allow larger GPU applications with many kernels to run), and the TLB (to create GCN4- and Vega-specific TLBs). We have also added new instructions that were previously unimplemented in GCN4 and Vega, and fixed corner cases for some instructions that were leading to incorrect behavior.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;gem5-sst-bridge-revived&quot;&gt;gem5-SST bridge revived&lt;/h3&gt;

&lt;p&gt;In recent versions of gem5, we sadly lost the ability to integrate with the &lt;a href=&quot;http://sst-simulator.org&quot;&gt;Structural Simulation Toolkit&lt;/a&gt; (SST).
In collaboration with the SST community, we have revived support for connecting gem5 cores to the SST memory system.
In the v21.2 release, this has been tested for RISC-V and Arm.
More information on setting up and running gem5 with SST can be found in &lt;a href=&quot;https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.2.0.0/ext/sst/README.md&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ext/sst/README.md&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;newchanged-apis&quot;&gt;New/Changed APIs&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;[API CHANGE]&lt;/strong&gt;: All &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SimObject&lt;/code&gt; declarations in SConscript files now require a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sim_objects&lt;/code&gt; parameter that lists all SimObject classes declared in that file which need C++ wrappers (that is, SimObject classes which have a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;type&lt;/code&gt; attribute defined).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;[API CHANGE]&lt;/strong&gt;: There is now an optional &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;enums&lt;/code&gt; parameter for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SimObject&lt;/code&gt; declarations which must list all the Enum types defined in that SimObject file.
Technically, this should only include Enum types which generate C++ wrappers though, as of v21.2, all Enums do so.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
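
&lt;p&gt;To make the SConscript change more concrete, a declaration might look like the following sketch. The file, class, and Enum names (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MyDevice.py&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MyDevice&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MyDeviceMode&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;my_device.cc&lt;/code&gt;) are hypothetical placeholders, not files in the gem5 tree.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# SConscript for a hypothetical device directory (all names are placeholders).
Import(&quot;*&quot;)

# MyDevice.py is assumed to declare one SimObject class with a `type`
# attribute (MyDevice) and one Enum type (MyDeviceMode), so both are listed.
SimObject(&quot;MyDevice.py&quot;, sim_objects=[&quot;MyDevice&quot;], enums=[&quot;MyDeviceMode&quot;])

# C++ sources are declared as before.
Source(&quot;my_device.cc&quot;)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Out-of-tree SimObjects written for earlier releases would need their SConscript declarations updated along these lines before building against v21.2.&lt;/p&gt;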

&lt;h2 id=&quot;other-v212-improvements&quot;&gt;Other v21.2 improvements&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;The master/slave terminology has been removed.
This has been an on-going effort for several gem5 releases.
The gem5 codebase is now free of its usage.&lt;/li&gt;
  &lt;li&gt;Arm v8.2-A FEAT_UAO has been implemented.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://gem5.atlassian.net/browse/GEM5-1098&quot;&gt;The “at” variants of the file system calls have been implemented in SE mode&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;The SConscripts have been refactored for improved modularity.&lt;/li&gt;
  &lt;li&gt;New “tester” CPUs have been introduced which mimic GUPS.&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Tue, 28 Dec 2021 00:00:00 +0000</pubDate>
        <link>https://www.gem5.org//project/2021/12/28/gem5-21-2.html</link>
        <guid isPermaLink="true">https://www.gem5.org//project/2021/12/28/gem5-21-2.html</guid>
        
        
        <category>project</category>
        
      </item>
    
  </channel>
</rss>
