Exercise - Synthesis
Overview
In this exercise, you will learn how to go from an implementation in a hardware description language (HDL) like Verilog, SystemVerilog or VHDL to a functionally equivalent gate-level netlist (a Verilog file containing only instantiations of standard cells and macros, and the wires connecting them). The cells and macros are provided by the foundry in a collection called the standard cell library, which contains all the gates that can be manufactured in a fully digital design for the target technology. The standard cell library, together with all other technology-specific data needed to design a chip, forms the process design kit (PDK). In our case this is the open-source IHP130 PDK available on GitHub.
The process of going from HDL to a netlist is called 'synthesis', and from a more abstract point of view it is the last step of what is commonly called the 'front-end' of the design process.
The leading open-source synthesis tool is called Yosys. In this exercise you will learn the needed steps to perform synthesis, both from a more general perspective and also specific to Yosys. In doing so, you will be guided through the whole process of synthesizing Croc-SoC. Figure 1 gives a graphical overview of the subject of this exercise.
If you want to know more about how Yosys came to be and/or see the synthesis process using simple examples, you can visit Yosys’ online documentation.
About the style
We will use a number of different styles to identify different types of actions as shown below:
Student Task
Actions that require you to select a specific menu will be shown like the following: menu → sub-menu → sub-sub-menu
Whenever there is an option or a tab that can be found in the current view/menu we will use a BUTTON to indicate such an option.
Throughout the exercise you will be asked to enter certain commands using the command-line.
sh> command to be entered on the Linux command line
Some of the commands will be entered on the command line of a specific tool, and will be highlighted by a different prompt, in this case ys> for Yosys.
ys> some_yosys_command
Getting Started
Student Task 1 (Setup)
Set up the working directory for this exercise by calling our install script and changing to the newly created directory:
sh> /home/efcl_004fs24/ex04_synthesis/install.sh
sh> cd ex04_synthesis/croc-soc
sh> source tardis_env.sh
Note: If you ever need to open a new Terminal, you will need to source tardis_env.sh again. This makes all tools available in the new shell.
Synthesis: An Overview
Before delving into Yosys synthesis, let's first look at the steps an abstract synthesis tool has to perform. Later on, you can try to match the different commands run in Yosys to one of these points.
- Parse Design
- The first step is to load the behavioral description, in our case SystemVerilog, into the synthesis tool. The synthesis tool needs to parse the HDL constructs and convert them to an internal representation. Usually this involves first parsing the design into an abstract syntax tree (AST), a representation that closely resembles how the HDL is structured. Then, in a second step, the AST is converted into a representation more suited to synthesis. In this representation we are no longer interested in how a certain operation is implemented; instead we care about what it does and how it connects to other operations. At this point the design is still behavioral (it contains code-like features which can implement something in an abstract way).
- Elaborate Design
- The next step is to take the behavioral design and convert it to a structural design. While a behavioral model may contain abstract things like loops or conditional branching, a structural model is very close to a netlist. The only things it contains are cells (arithmetic operations, finite-state machines etc.) and connections between cells. Any parameters in the design (e.g. the size of a data bus) can be set to their final values, and the corresponding values will be propagated to the subdesigns during elaboration.
- Logic Optimization
- The structural representation of the design can be optimized using different techniques. This means the length of the longest paths is reduced, redundant operations are removed and their outputs shared, constant bits are propagated and so on.
- Mapping
- Interwoven with the logic optimization are mapping steps. In mapping we take one or more cells and convert them into another representation. In practice this usually means we start with cells representing very high-level operations (like an arbitrary arithmetic operation) and, using a mapping, transform them to a lower-level representation (like logic gates). During the last steps of synthesis we must map the internal representation to the standard cells provided by the PDK; this is called technology mapping.
In commercial tools you would also supply a set of constraints describing the timings your design needs to meet (for example different clock nets might use a different clock period). During the whole process, the tool tries to attain the goals you set. It tries to fulfill the timing constraints you set and on all non-critical paths it is then able to reclaim logic area by finding more compact representations of the same function (which usually means they are slower).
Currently, the open-source tools do not support constraints beyond a single target frequency so we will not further explore this aspect.
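To make the mapping and logic optimization steps above more concrete, here is a minimal Python sketch. It is not how Yosys or any real synthesis tool is implemented: the nested-tuple netlist format, the function names and the two-cell target library (NAND2 and INV) are assumptions made purely for illustration.

```python
from itertools import product

def tech_map(expr):
    """Map generic AND/OR/NOT gates onto a hypothetical NAND2/INV library."""
    if isinstance(expr, str):              # primary input, e.g. "a"
        return expr
    op, *args = expr
    args = [tech_map(a) for a in args]
    if op == "NOT":
        return ("INV", args[0])
    if op == "AND":                        # a & b == ~(~(a & b))
        return ("INV", ("NAND2", args[0], args[1]))
    if op == "OR":                         # a | b == ~(~a & ~b)
        return ("NAND2", ("INV", args[0]), ("INV", args[1]))
    raise ValueError(f"unknown operator {op}")

def remove_double_inv(expr):
    """Local logic optimization: INV(INV(x)) is replaced by x."""
    if isinstance(expr, str):
        return expr
    op, *args = expr
    args = [remove_double_inv(a) for a in args]
    if op == "INV" and not isinstance(args[0], str) and args[0][0] == "INV":
        return args[0][1]
    return (op, *args)

def evaluate(expr, env):
    """Evaluate an expression for one assignment of 0/1 input values."""
    if isinstance(expr, str):
        return env[expr]
    op, *args = expr
    vals = [evaluate(a, env) for a in args]
    if op in ("NOT", "INV"):
        return 1 - vals[0]
    if op == "AND":
        return vals[0] & vals[1]
    if op == "OR":
        return vals[0] | vals[1]
    if op == "NAND2":
        return 1 - (vals[0] & vals[1])
    raise ValueError(f"unknown operator {op}")

# (a AND b) OR (NOT c), described with generic gates
design = ("OR", ("AND", "a", "b"), ("NOT", "c"))
mapped = remove_double_inv(tech_map(design))
print(mapped)  # ('NAND2', ('NAND2', 'a', 'b'), 'c')

# The mapped netlist must be functionally equivalent to the original.
for a, b, c in product((0, 1), repeat=3):
    env = {"a": a, "b": b, "c": c}
    assert evaluate(design, env) == evaluate(mapped, env)
```

The naive mapping alone produces six cells; the local clean-up afterwards reduces this to two NAND2 cells, which hints at why mapping and optimization steps are interleaved in a real flow.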
Yosys Synthesis
In the previous section you have learned what steps are necessary in order to synthesize a design described in SystemVerilog. We will now go through the synthesis process step by step using Yosys.
But first, a caveat: SystemVerilog is a very complex HDL, and most open-source tools currently support only part of its language features (see the chipsalliance test suite). Yosys currently only supports a small subset of SystemVerilog, while our PULP platform is written almost entirely in SystemVerilog. To bridge this gap, we created the following pipeline of tools to go stepwise from SystemVerilog to Verilog:
- Bender
- Manages the dependencies (repos) and creates the necessary scripts with all file paths for further processing.
- Morty
- Collects the files from Bender and pickles them all into one single file and compile context (it also resolves pre-processor instructions)
- SVase
- Simplifies the SystemVerilog code by unrolling generate statements and propagating all parameters to their local contexts (likely the most complicated part of SystemVerilog elaboration, in our experience only Slang is able to consistently get this right; SVase builds on Slang).
- SV2V
- Handles the bulk of converting SystemVerilog features to Verilog.
Student Task 2
For your convenience we already ran this pipeline; you can re-run it at any point using make pickle.
- Open pickle/croc_morty.sv and select a module with parameters (e.g. module ibex_core). We recommend you just open the entire croc-soc directory in Sublime Text:
sh> sublime_text .
- Open pickle/croc_svase.sv and search for the same module. What do you observe?
- Again in pickle/croc_morty.sv, search for gen_regfile_ff. Depending on the value of a parameter, a different register file (module) is used. Search for gen_regfile_ff in pickle/croc_svase.sv. What do you observe?
- Which step of synthesis does a change like this belong to?
- It is part of elaboration, hence SVase is called a 'pre-elaborator'.
- Finally, in pickle/croc_svase.sv and pickle/croc_sv2v.v, search for p_encode and compare the two.
Parsing
Now let us start with the actual synthesis in Yosys.
During this exercise, you will execute a lot of commands, and sometimes things might fail and you will have to redo everything. Yosys does have a command history (↑), but usually you will want to execute a whole series of commands, so we recommend you create your own script and add the commands there. Yosys can read two types of script files. The first just contains Yosys commands (file ending .ys) and can be run using the command script in Yosys. The second is a Tcl script (file ending .tcl), which is able to access Yosys commands from the namespace Yosys. You can run it inside Yosys using the command tcl.
Student Task 3
- First, let's start the provided synthesis flow so we can inspect the results later if we run out of time. Open a second Terminal and navigate to the same directory you are currently in (pwd shows you where you are), then start the flow:
sh> make run-yosys
- In the other Terminal, source the environment script, go into the synth directory and start Yosys:
sh> source tardis_env.sh
sh> cd synth
sh> yosys-0.35 yosys
- First you need to load our technology data; we already prepared this for you.
ys> script scripts/init_tech.ys
- Now you can load the design into Yosys:
ys> read_verilog ../pickle/croc_sv2v.v
- Closely read the first few lines of the Yosys output. Which internal representations did Yosys just use, and what could be their purpose?
- In the following steps we want to closely follow only one module to see what exactly Yosys does. We recommend you follow one of the delta_counter__* modules, as they are relatively simple and contain some control logic, arithmetic operations and a register (flip-flops).
- Open your selected module in one of the SystemVerilog/Verilog files in the pickle directory. We recommend pickle/croc_svase.sv since all parameters are propagated but the description is still pretty high-level.
- What does your selected module do?
- To quickly find your module in Yosys, you can save a selection for it using:
ys> select -set delta t:<your_module> %M
- Where your_module is the full name of your module, for example t:delta_counter__10356954905401074824.
This saves a selection containing only your module under the name delta. If you want to perform a command only on your module, you then use
ys> command @delta
Note: The select command in Yosys is extremely powerful, you can see some more use-cases in scripts/yosys_synthesis.tcl and see the full syntax in the Yosys documentation.
Print and Display
At this point you probably want to know what exactly Yosys did to your module, in fact we recommend you regularly do this in the following tasks (remember, put it in a script for easy reuse).
There are a few commands in Yosys which are useful to debug problems you have. The simplest of them is to simply write out Verilog:
ys> select @delta
ys> write_verilog -norename -noexpr -noattr -selected delta.v
ys> select -clear
Unless you know exactly what the above does, we recommend you use exactly this combination of commands and flags and just put it into a script (e.g. scripts/print.ys).
The next command prints statistics, including how many cells of each type are in the design.
ys> stat @delta
And finally Yosys can generate schematics showing the selected network visually. This is something you should only use on smaller selections as it quickly becomes unwieldy.
ys> show -colors -width -stretch -long -prefix delta -format pdf @delta
-prefix sets the path and basename of the file it is saved to. You may want to explore the flags of this command and make it your own.
You can print the manpage of a command using:
ys> help <cmd>
Elaboration
Equipped with a nice script to print our module we can move on to elaboration.
Student Task 4
- Explore the Verilog output of your module. Which parts have already been elaborated and which haven’t?
- Let Yosys check and clean up the design hierarchy:
ys> hierarchy -check -top croc_chip
- If it can’t find certain modules, technology macros or other things, it will tell you here.
- Now we want a fully elaborated design, so convert the always blocks to a structural representation:
ys> proc
- Again explore the output using the methods described above. What changed?
High-Level optimization
You may have noticed that all the cells we currently have in our design start with a $ and the names are all lower-case. This is characteristic of the high-level (or abstract) representation in Yosys. High-level because single cells can represent large and complex operations such as division.
The most common optimization commands in Yosys start with opt_*. There is also the larger opt pass, which executes a series of opt_* sub-commands. The four most important ways to call opt are:
- opt: The pass with the default settings
- opt -noff: Same as above but does not perform any flip-flop related optimizations (integrating enable signals, constant removal etc.)
- opt -fast: Runs an alternative sequence of sub-commands that may not be quite as good but doesn’t take as long
- opt -full: Adds some additional optimizations
As a rough guideline the closer we get to a gate-level representation the more cells we will have in our design and the costlier the commands are. So early on you can use the more powerful commands anytime you wish but later in the synthesis process you will want to use them a bit more sparingly.
Let's run some optimizations on the current representation.
Student Task 5
- Execute the following optimization commands
ys> opt_expr
ys> opt_clean
ys> opt -noff
ys> fsm
ys> opt
- Here we also have the fsm command; it finds, extracts and optimizes state machines.
- Let's continue optimizing:
ys> wreduce
ys> peepopt
ys> opt_clean
ys> opt -full
wreduce reduces the width (number of bits) of operations, peepopt performs some operator-specific optimizations, and then we again perform general optimizations.
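The kind of work done by expression and clean-up optimizations can be illustrated with a small Python sketch. It only mimics the idea (simplify cells with constant inputs, then drop cells whose outputs are never used) on a toy netlist; the tuple format, the function names and the restriction to AND/OR cells are assumptions for illustration and do not reflect the actual Yosys algorithms.

```python
def fold_constants(cells):
    """Simplify AND/OR cells that have a constant input.

    cells: list of (output, kind, in_a, in_b) in topological order.
    The wires "0" and "1" denote constants. Returns the remaining cells
    and a map from removed output wires to their replacement.
    """
    alias = {}
    kept = []
    for out, kind, a, b in cells:
        a, b = alias.get(a, a), alias.get(b, b)   # use simplified drivers
        if kind == "AND":
            if "0" in (a, b):
                alias[out] = "0"
                continue
            if a == "1":
                alias[out] = b
                continue
            if b == "1":
                alias[out] = a
                continue
        elif kind == "OR":
            if "1" in (a, b):
                alias[out] = "1"
                continue
            if a == "0":
                alias[out] = b
                continue
            if b == "0":
                alias[out] = a
                continue
        kept.append((out, kind, a, b))
    return kept, alias

def remove_dead_cells(cells, outputs):
    """Keep only cells that (transitively) drive one of the outputs."""
    live = set(outputs)
    kept = []
    for out, kind, a, b in reversed(cells):
        if out in live:
            kept.append((out, kind, a, b))
            live.update((a, b))
    kept.reverse()
    return kept

def optimize(cells, outputs):
    cells, alias = fold_constants(cells)
    outputs = [alias.get(o, o) for o in outputs]
    return remove_dead_cells(cells, outputs), outputs

netlist = [
    ("n1", "AND", "a", "1"),    # n1 is just a
    ("n2", "OR", "n1", "b"),    # becomes OR(a, b)
    ("n3", "AND", "c", "0"),    # n3 is constant 0
    ("n4", "OR", "n3", "d"),    # n4 is just d
    ("n5", "AND", "n4", "c"),   # AND(d, c), but nothing uses n5
]
optimized, outs = optimize(netlist, ["n2"])
print(optimized)  # [('n2', 'OR', 'a', 'b')]
```

Five cells shrink to one here. In a real design, such passes are run repeatedly because every simplification can expose new opportunities further down the netlist.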
Mapping
At this point of the flow we will start mapping high-level cells to lower level implementations. Every arithmetic operation can be implemented in a number of different ways (architectures). Currently, Yosys mostly picks a good compromise and sticks with it. Choosing different architectures depending on the circumstances is one of the next big things being worked on.
Student Task 6a
The first two of the following commands are mapping commands.
ys> booth
ys> alumacc
ys> share
ys> opt
booth implements multipliers using Booth encoding (we don’t have any real multipliers) and alumacc collects various multiply, add and subtract operations together into $alu and then $macc (multiply-add) cells, increasing resource sharing.
Note: At this point you may notice that booth and alumacc do not work well together since both commands turn multipliers into other cells. This will be improved in the future. As a general rule, using booth can result in a faster design but using only alumacc can reduce the area.
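For readers unfamiliar with Booth encoding, the following Python sketch shows the textbook radix-4 recoding of a signed multiplier. It illustrates the general technique only; the function names are made up for this example, and the hardware architecture generated by the booth command may differ in its details.

```python
def booth_radix4_digits(multiplier, width):
    """Recode a signed `width`-bit multiplier (width even) into radix-4
    Booth digits from {-2, -1, 0, 1, 2}, least-significant digit first."""
    assert width % 2 == 0
    assert -(1 << (width - 1)) <= multiplier < (1 << (width - 1))
    bits = [(multiplier >> i) & 1 for i in range(width)]  # two's complement
    digits = []
    prev = 0                       # implicit bit to the right of the LSB
    for i in range(0, width, 2):
        b0, b1 = bits[i], bits[i + 1]
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits

def booth_multiply(a, b, width):
    """Multiply by summing one partial product per Booth digit."""
    return sum(d * a * 4**i
               for i, d in enumerate(booth_radix4_digits(b, width)))

print(booth_radix4_digits(7, 4))   # [-1, 2]  since 7 = -1 + 2*4
print(booth_multiply(5, 7, 4))     # 35
```

A 4-bit multiplier needs only two partial products instead of four, and each digit only requires a shift and/or negation of the multiplicand. Halving the number of partial products is where the potential speed advantage comes from.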
Student Task 6b
- Next you need to map memories to flip-flops and then optimize unnecessary flip-flops away.
ys> memory
ys> opt -fast
ys> opt_dff -sat -nodffe -nosdff
ys> share
ys> opt -full
ys> clean -purge
At this point we are about to use the most powerful mapping command in Yosys. techmap can take a Verilog implementation of any cell currently in Yosys and replace these cells with the implementation described in the Verilog mapping file. It has a default mapping file (which we will use), but if you are not happy with one of the architectural choices, you can use any other architecture and run it before using the default techmap mapper.
For us this means the abstraction level of our representation is about to change drastically. Currently it contains high-level concepts like arithmetic operators, but after techmap they will be implemented using generic gates.
There is also another command called extract; it can match subcircuits and replace them with a custom cell you define. These two commands together can be very useful if you have some optimized implementation and want to use it in your design. There is a tutorial in the documentation.
Student Task 7
- Before running techmap, make sure you have a stat and show/write_verilog output copied somewhere so we can compare it
- Run the following commands (this might take a few minutes)
ys> techmap
ys> opt -fast
ys> clean -purge
- Compare the type and number of cells used from before techmap to after
In the main flow the next step would be to flatten the design. This means the module instantiations are replaced with a copy of the content of the module. You are able to keep certain modules by using setattr -set keep_hierarchy 1 module|instance.
Flattening more of the design makes it harder to debug and write constraints but it increases the potential for additional optimizations.
Currently Yosys does not perform any cross-boundary optimization (propagating information such as logic functions across the instantiation of a module), which makes flattening as much as possible even more important.
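The effect of flattening while keeping selected modules can be sketched in a few lines of Python. The dictionary-based design description, the function name and the dot-separated instance paths are assumptions for this illustration; connections between cells are ignored entirely.

```python
def flatten(modules, top, keep=frozenset(), prefix=""):
    """Return the flat list of (instance_path, cell_type) under `top`.

    modules: dict mapping a module name to its list of
             (instance_name, cell_type) pairs. Cell types that are not
             keys of `modules` are treated as leaf cells (gates).
    keep:    module names whose instances are left intact, similar in
             spirit to marking them with keep_hierarchy.
    """
    flat = []
    for inst, kind in modules[top]:
        path = prefix + inst
        if kind in modules and kind not in keep:
            # replace the instance by a copy of the module's content
            flat.extend(flatten(modules, kind, keep, path + "."))
        else:
            flat.append((path, kind))
    return flat

# Hypothetical three-level design
design = {
    "top":     [("u_cnt", "counter"), ("u_and", "AND2")],
    "counter": [("u_add", "adder"), ("u_ff", "DFF")],
    "adder":   [("u_xor", "XOR2"), ("u_and", "AND2")],
}

print(flatten(design, "top"))
# [('u_cnt.u_add.u_xor', 'XOR2'), ('u_cnt.u_add.u_and', 'AND2'),
#  ('u_cnt.u_ff', 'DFF'), ('u_and', 'AND2')]

print(flatten(design, "top", keep={"adder"}))
# [('u_cnt.u_add', 'adder'), ('u_cnt.u_ff', 'DFF'), ('u_and', 'AND2')]
```

In the second call the adder survives as an instance, so it can still be inspected on its own, but an optimizer working on the flat netlist can no longer simplify logic across its boundary.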
Student Task 8
- You probably want to continue following your module, so let's make sure we keep it:
ys> setattr -set keep_hierarchy 1 t:your_module
Where your_module is the full name of your module, for example t:delta_counter__10356954905401074824
- Now let's flatten the rest:
ys> flatten
ys> clean -purge
- You can run hierarchy to make sure your module is still there (it first shows the kept modules, then the ones being removed)
- At this point it can be a good idea to split buses on module ports into individual nets and make sure flip-flops and other cells have usable names. This is done for compatibility reasons (splitnets) and to make it easier to debug and write constraints (rename and autoname).
ys> splitnets -ports -format __v
ys> rename -wire -suffix _reg t:*DFF*
ys> autoname t:*DFF* %n
ys> clean -purge
ABC Logic Optimization & Mapping
Yosys can internally use another tool called ABC to run strong logic optimization algorithms and map to standard cells. Similarly to Yosys, ABC also requires a script describing which commands it should execute. There are a number of different ABC scripts out there, for example the default scripts used in Yosys, the area- and speed-oriented scripts of OpenROAD-flow-scripts, and the scripts generated by OpenLANE.
Today you are going to use the ABC script used to synthesize the open-source, Linux-capable Basilisk.
The script itself is documented in an IWLS contribution. The shortest possible description is that it uses a logic optimization technique called Lazy Man’s Synthesis, which cuts your design into small functions and then replaces them using implementations from a pre-computed table. This is followed by mapping the design to the standard cells.
Student Task 9
- Before calling ABC, you first need to map the sequential elements to standard cells:
ys> dfflibmap -liberty ../ihp13/pdk/ihp-sg13g2/libs.ref/sg13g2_stdcell/lib/sg13g2_stdcell_typ_1p20V_25C.lib
- You can find the above path in the scripts/init_tech.ys script.
- Now we can call ABC:
ys> abc -D 10000 -script scripts/abc-opt.script -liberty ../ihp13/pdk/ihp-sg13g2/libs.ref/sg13g2_stdcell/lib/sg13g2_stdcell_typ_1p20V_25C.lib -showtmp
ys> clean -purge
- The -D #### flag sets the target period in picoseconds. This is mostly used in the final mapping to standard cells and in the buffering and resizing at the end of the ABC script.
- Congratulations! At this point you have a netlist. Before we write it out, we need to deal with constant values:
ys> setundef -zero
ys> hilomap -singleton -hicell sg13g2_tiehi L_HI -locell sg13g2_tielo L_LO
ys> write_verilog -noattr -noexpr -nohex -nodec netlist.v
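Since the -D flag of the abc call above expects a clock period in picoseconds while targets are often stated as a frequency, a small conversion helper can be handy. The function names below are made up for this sketch.

```python
def period_ps_from_mhz(freq_mhz):
    """Clock period in picoseconds for a target frequency in MHz.

    1 MHz corresponds to a period of 1 us = 1e6 ps.
    """
    return 1e6 / freq_mhz

def mhz_from_period_ps(period_ps):
    """Target frequency in MHz for a clock period in picoseconds."""
    return 1e6 / period_ps

print(mhz_from_period_ps(10000))  # 100.0 (the -D 10000 used above)
print(period_ps_from_mhz(100))    # 10000.0
```

In other words, -D 10000 asks ABC to aim for a clock frequency of 100 MHz.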
A last note
In the last section we simplified things a bit. You don’t need to map the flip-flops before calling ABC. If you map the flip-flops before calling ABC, it will only receive the combinational elements, even if you use abc -dff. If you want to perform sequential optimizations in ABC, you need to execute ABC before running dfflibmap, but sequential synthesis currently has some major downsides/problems.
The biggest one is the following:
Even in the combinational case, ABC works strictly per module, which decreases the optimization potential since the module boundaries cannot be changed. In the sequential case this problem gets even worse because here ABC will receive each clock domain in each module separately. This might seem reasonable (and it is to some extent), but the problem comes from what is considered to be a clock domain. Every unique combination of enable, reset and clock wire going to a flip-flop is a clock domain. The clock is obvious, and the reset is also mostly the same among all flip-flops, but in one module you may have many different enable signals for different flip-flops. Another big downside is that you lose the names of the flip-flops and often cannot recover them. This is a major problem in peripherals, where you might need the names to write constraints.
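The following Python sketch illustrates how quickly a module falls apart into many small pieces under the clock-domain definition given above (every unique combination of clock, reset and enable wire). The flip-flop list and all signal names are hypothetical.

```python
from collections import defaultdict

def group_by_domain(flip_flops):
    """Group flip-flops by their (clock, reset, enable) wires.

    flip_flops: list of (name, clk, rst, en) tuples.
    Returns a dict mapping each unique (clk, rst, en) combination to
    the names of the flip-flops that belong to it.
    """
    domains = defaultdict(list)
    for name, clk, rst, en in flip_flops:
        domains[(clk, rst, en)].append(name)
    return dict(domains)

# Hypothetical module: one clock, one reset, three different enables
ffs = [
    ("count_q_0", "clk_i", "rst_ni", "count_en"),
    ("count_q_1", "clk_i", "rst_ni", "count_en"),
    ("cfg_q",     "clk_i", "rst_ni", "cfg_we"),
    ("status_q",  "clk_i", "rst_ni", "status_we"),
]

domains = group_by_domain(ffs)
print(len(domains))  # 3
for key, names in domains.items():
    print(key, names)
```

Although all four flip-flops share a single clock and reset, they end up in three separate groups, each of which would be handed to ABC on its own.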
So in practice, you first need to prepare the generic flip-flops in Yosys by mapping enable signals and synchronous resets into soft logic. There are some other things you need to deal with for compatibility reasons; they are explained in Basilisk’s synthesis script (platform/cheshire-ihp130-o/blob/basilisk-dev/target/ihp13/yosys/scripts/yosys_synthesis.tcl#L172). You probably also do not want to run it on your whole design. Instead, you may want to consider splitting your design into a synchronous part where you want sequential optimization and everything else (clock domain crossings, peripherals and so on) where you do not want it. Then you run synthesis twice, once with each option, and blackbox the parts you do not want in the respective run.
You are done with this Exercise.
Discuss your experience with assistants and colleagues or continue to explore the provided design on your own.
The VLSI pages are part of the open source VLSI design course offered by the Integrated Systems Laboratory of ETH Zürich, by Luca Benini and Frank K. Gürkaynak. See full list of contributors.