Exercise - Synthesis: Difference between revisions

{{VLSInavbar}}
==Overview==
In this exercise, you will learn how to go from an implementation in a hardware description language (HDL) like Verilog, SystemVerilog or VHDL to a functionally equivalent gate-level netlist (a Verilog file containing only instantiations of standard cells and macros, and the wires connecting them). The cells and macros are provided by the foundry in a collection called the standard cell library, which contains all the gates that can be manufactured in a fully digital design for the target technology. The standard cell library, together with all other technology-specific data needed to design a chip, forms the process design kit (PDK). In our case this is the open-source [https://github.com/IHP-GmbH/IHP-Open-PDK IHP130 PDK available on GitHub].
 
The process of going from HDL to a netlist is called ''synthesis''. From a more abstract point of view, it is the last step of what is commonly called the ''front end'' of the design process.
 
The leading open-source synthesis tool is called [[Yosys]]. In this exercise you will learn the needed steps to perform synthesis, both from a more general perspective and also specific to [[Yosys]]. In doing so, you will be guided through the whole process of synthesizing [[Croc]]-SoC.
Figure 1 gives a graphical overview of the subject of this exercise.
 
{{VLSIfigure|design_flow_synthesis.png|Excerpt of the ASIC design flow with the subject of this exercise highlighted.|1}}
 
 
If you want to know more about how [[Yosys]] came to be and/or see the synthesis process using simple examples, you can visit [https://yosyshq.readthedocs.io/projects/yosys/en/latest/introduction.html Yosys’ online documentation].




=== About the style ===
  sh> command to be entered on the Linux command line


Some of the commands will be entered on the command line of a specific tool and will be highlighted by a different prompt, in this case <code>ys></code> for [[Yosys]].
ys> some_yosys_command


=== Getting Started ===  
{{VLSItaskhead|1 (Setup)}}
Set up the working directory for this exercise by calling our install script and changing to the newly created directory:
sh> /home/efcl_004fs24/ex04_synthesis/install.sh
sh> cd ex04_synthesis/croc-soc
sh> source tardis_env.sh
{{VLSItaskfoot}} 
<div class="alert alert-warning" role="alert">
'''Note:''' If you ever need to open a new terminal, you will need to source <code>tardis_env.sh</code> again. This makes all tools available in the new shell.
</div>
=== Synthesis: An Overview ===
Before delving into [[Yosys]] synthesis, let's first look at the steps an abstract synthesis tool has to perform. Later on, you can try to match the different commands run in [[Yosys]] to one of these steps.
;Parse Design:
:The first step is to load the behavioral description, in our case [[SystemVerilog]], into the synthesis tool. The synthesis tool needs to parse the HDL language features and convert it to an internal representation. Usually this involves first parsing the design into an abstract syntax tree (AST), a representation that closely resembles how the HDL language is structured. Then in a second step the AST is converted into a representation more suited to synthesis. In this representation we are no longer interested in how a certain operation is implemented, instead we care about what it does and how it connects to other operations. At this point the design is still behavioral (it contains code-like features which can implement something in an abstract way).
;Elaborate Design:
:The next step is to take the behavioral design and convert it to a structural design. While a behavioral model may contain abstract things like loops or conditional branching, a structural model is very close to a netlist. It contains only cells (arithmetic operations, finite-state machines, etc.) and connections between cells. Any parameters in the design (e.g. the size of a data bus) can be set to their final values, and the corresponding values will be propagated to the subdesigns during elaboration.
;Logic Optimization:
: Using different techniques, the structural representation of the design can be optimized. This means the length of the longest paths is reduced, redundant operations are removed and their outputs shared, constant bits are propagated and so on.
;Mapping:
: Interwoven with the logic optimization are mapping steps. In mapping we take one or more cells and convert them into another representation. In practice this usually means we start with cells representing very high-level operations (like an arbitrary arithmetic operation) and, using a mapping, transform them to a lower-level representation (like logic gates). During the last steps of synthesis we must map the internal representation to the standard cells provided by the [[PDK]]; this is called technology mapping.
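To make the optimization and mapping steps more concrete, here is a minimal Python sketch. It propagates constants through a tiny Boolean expression and then maps the remaining NOT/AND/OR operations onto a "library" that only offers two-input NAND cells. The tuple-based representation and all function names are illustrative assumptions and have nothing to do with how a real synthesis tool stores a design.

```python
# Toy model: an expression is a signal name (str), a constant (0 or 1),
# or a tuple ("not", a), ("and", a, b), ("or", a, b).

def propagate_constants(expr):
    """Logic optimization: fold away operations with constant inputs."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [propagate_constants(a) for a in args]
    if op == "not":
        (a,) = args
        return 1 - a if a in (0, 1) else ("not", a)
    a, b = args
    if op == "and":
        if a == 0 or b == 0:
            return 0
        if a == 1:
            return b
        if b == 1:
            return a
    if op == "or":
        if a == 1 or b == 1:
            return 1
        if a == 0:
            return b
        if b == 0:
            return a
    return (op, a, b)


def map_to_nand(expr):
    """Mapping: rewrite NOT/AND/OR using only two-input NAND cells."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [map_to_nand(a) for a in args]
    if op == "not":
        return ("nand", args[0], args[0])
    a, b = args
    if op == "and":
        n = ("nand", a, b)
        return ("nand", n, n)
    if op == "or":
        return ("nand", ("nand", a, a), ("nand", b, b))
    raise ValueError(f"unknown operation: {op}")


def evaluate(expr, values):
    """Evaluate an expression for given signal values (used to check equivalence)."""
    if isinstance(expr, str):
        return values[expr]
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    v = [evaluate(a, values) for a in args]
    if op == "not":
        return 1 - v[0]
    if op == "and":
        return v[0] & v[1]
    if op == "or":
        return v[0] | v[1]
    if op == "nand":
        return 1 - (v[0] & v[1])
    raise ValueError(f"unknown operation: {op}")


design = ("or", ("and", "a", 1), ("not", "b"))
optimized = propagate_constants(design)  # the AND with constant 1 disappears
mapped = map_to_nand(optimized)          # only "nand" cells remain
```

The `evaluate` helper is only there to check that the optimized and mapped versions still compute the same function as the original, which is the core requirement of every synthesis step.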
In commercial tools you would also supply a set of constraints describing the timings your design needs to meet (for example different clock nets might use a different clock period). During the whole process, the tool tries to attain the goals you set. It tries to fulfill the timing constraints you set and on all non-critical paths it is then able to reclaim logic area by finding more compact representations of the same function (which usually means they are slower).
Currently, the open-source tools do not support constraints beyond a single target frequency so we will not further explore this aspect.
=== Yosys Synthesis ===
In the previous section you have learned what steps are necessary in order to synthesize a design described in [[SystemVerilog]]. We will now go through the synthesis process step by step using [[Yosys]].
But first, a caveat: [[SystemVerilog]] is a very complex HDL and most open-source tools currently support only part of its language features, see the [https://chipsalliance.github.io/sv-tests-results chipsalliance test suite]. [[Yosys]] currently supports only a small subset of [[SystemVerilog]], while our [https://github.com/pulp-platform PULP platform] is written almost entirely in [[SystemVerilog]]. To bridge this gap, we use a pipeline of four tools to go stepwise from SystemVerilog to Verilog:
;Bender:
: Manages the dependencies (repos) and creates the necessary scripts with all file paths for further processing.
;Morty:
:Collects the files from Bender and pickles them all into one single file and compile context (it also resolves pre-processor instructions).
;SVase:
:Simplifies the [[SystemVerilog]] code by unrolling generate statements and propagating all parameters to their local contexts (likely the most complicated part of SystemVerilog elaboration, in our experience only [https://github.com/MikePopoloski/slang?tab=readme-ov-file Slang] is able to consistently get this right; SVase builds on Slang).
;SV2V:
:Handles the bulk of converting [[SystemVerilog]] features to Verilog.
{{VLSItaskhead|2}}
For your convenience we already ran this pipeline; you can re-run it at any point using <code>make pickle</code>.
* Open <code>pickle/croc_morty.sv</code> and select a module with parameters (e.g. <code>module ibex_core</code>). We recommend you just open the entire <code>croc-soc</code> directory in Sublime Text:
sh> sublime_text .
* Open <code>pickle/croc_svase.sv</code> and search for the same module. What do you observe?
* Again in <code>pickle/croc_morty.sv</code>, search for <code>gen_regfile_ff</code>. Depending on the value of a parameter, a different register file (module) is used. Search for <code>gen_regfile_ff</code> in <code>pickle/croc_svase.sv</code>. What do you observe?
* Which step of synthesis does a change like this belong to?
: It is part of elaboration, hence SVase is called a 'pre-elaborator'.
* Finally, in <code>pickle/croc_svase.sv</code> and <code>pickle/croc_sv2v.v</code> search for <code>p_encode</code> and compare the two.
{{VLSItaskfoot}} 
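The two effects you just observed (parameters propagated into a specialized copy of a module, and only one branch of a generate block surviving) can be mimicked in a few lines of Python. This is only a conceptual sketch: the parameter <code>REGFILE</code>, the module names <code>regfile_ff</code>/<code>regfile_latch</code>/<code>core</code> and the hash-based suffix scheme are hypothetical and are not taken from SVase or from the Croc sources.

```python
import hashlib


def specialized_name(module, params):
    """Derive a unique module name for one parameter set (made-up suffix scheme)."""
    key = ",".join(f"{k}={params[k]}" for k in sorted(params))
    digest = hashlib.sha256(f"{module}:{key}".encode()).hexdigest()[:8]
    return f"{module}__{digest}"


def select_generate_branch(branches, params):
    """Keep only the generate branch whose condition holds for these parameters."""
    for label, condition, body in branches:
        if condition(params):
            return label, body
    raise ValueError("no generate branch matches the parameters")


# Hypothetical generate block: the parameter REGFILE picks the register file module.
branches = [
    ("gen_regfile_ff", lambda p: p["REGFILE"] == "ff", "regfile_ff"),
    ("gen_regfile_latch", lambda p: p["REGFILE"] == "latch", "regfile_latch"),
]

params = {"REGFILE": "ff", "WIDTH": 32}
label, body = select_generate_branch(branches, params)
print(specialized_name("core", params), "keeps", label, "->", body)
```

Because every distinct parameter set gets its own name, two instances of the same module with different parameters end up as two separate modules after pre-elaboration.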
==== Parsing ====
Now let us start with the actual synthesis in [[Yosys]].
During this exercise, you will execute a lot of commands, and sometimes things might fail so that you have to redo everything. [[Yosys]] does have a command history (<kbd>↑</kbd>), but usually you will want to execute a bunch of commands, so we recommend you create your own script and add the commands there. [[Yosys]] can read two types of script files. The first just contains [[Yosys]] commands (file ending <code>.ys</code>) and you can run it using the command <code>script</code> in [[Yosys]]. The second is a Tcl script (file ending <code>.tcl</code>), which is able to access [[Yosys]] commands from the namespace [[Yosys]]. You can run it inside [[Yosys]] using the command <code>tcl</code>.
{{VLSItaskhead|3}}
* First, let's start the provided synthesis flow so we can inspect the results later if we run out of time. Open a second terminal and navigate to the same directory you are currently in (<kbd>pwd</kbd> shows you where you are), then start the flow:
sh> make run-yosys
* In the other terminal, source the environment script, go into the <code>synth</code> directory and start Yosys:
sh> source tardis_env.sh
sh> cd synth
sh> yosys-0.35 yosys
* First you need to load our technology data; we already prepared this for you.
ys> script scripts/init_tech.ys
* Now you can load the design into [[Yosys]]:
ys> read_verilog ../pickle/croc_sv2v.v
* Closely read the first few lines of the [[Yosys]] output. Which internal representations did [[Yosys]] just use and what could be their purpose?
* In the following steps we will want to closely follow only one module to see what exactly [[Yosys]] does. We recommend you follow one of the <code>delta_counter__*</code> modules as they are relatively simple and contain some control logic, arithmetic operations and a register (flip-flops).
* Open your selected module in one of the SystemVerilog/Verilog files in the <code>pickle</code> directory. We recommend <code>pickle/croc_svase.sv</code> since all parameters are propagated but the description is still pretty high-level.
* What does your selected module do?
* To quickly find your module in Yosys, you can save a selection for it using:
ys> select -set delta t:<your_module> %M
* Here <code>your_module</code> is the full name of your module, so for example <code>t:delta_counter__10356954905401074824</code>.
This saves a selection containing only your module under the name <code>delta</code>. If you want to perform a command only on your module, you then use
  ys> command @delta
{{VLSItaskfoot}} 


<div class="alert alert-warning" role="alert">
'''Note:''' The <code>select</code> command in Yosys is extremely powerful. You can find more use cases in <code>scripts/yosys_synthesis.tcl</code> and the [https://yosyshq.readthedocs.io/projects/yosys/en/latest/cmd/select.html full syntax in the Yosys documentation].
</div>


==== Print and Display ====
At this point you probably want to know what exactly [[Yosys]] did to your module. In fact, we recommend you check this regularly in the following tasks (remember to put the commands in a script for easy reuse).


There are a few commands in [[Yosys]] that are useful for debugging. The simplest is to ''write out Verilog'':
ys> select @delta
ys> write_verilog -norename -noexpr -noattr -selected delta.v
ys> select -clear


Unless you know exactly what the above does, we recommend you use exactly this combination of commands and flags and put it into a script (e.g. <code>scripts/print.ys</code>).
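If you find yourself printing several different selections, a small helper that generates such a script can save typing. The Python sketch below only assembles the three Yosys commands shown above into a text file; the function names and the idea of generating the script from Python are our own illustration, not part of the provided flow.

```python
from pathlib import Path


def make_print_script(selection, out_file):
    """Return the Yosys commands that write one saved selection out as Verilog."""
    lines = [
        f"select @{selection}",
        f"write_verilog -norename -noexpr -noattr -selected {out_file}",
        "select -clear",
    ]
    return "\n".join(lines) + "\n"


def write_script(path, text):
    """Write the script text to `path`, creating the directory if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)
    return path


if __name__ == "__main__":
    # Prints the script for the saved selection "delta"; pass the text to
    # write_script("scripts/print.ys", ...) to store it on disk.
    print(make_print_script("delta", "delta.v"), end="")
```

The resulting file can then be run inside Yosys with the <code>script</code> command introduced earlier.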


The next command ''prints statistics'', including how many cells of each type are in the design.
ys> stat @delta
And finally, Yosys can generate schematics showing the selected network visually. You should only use this on smaller selections, as it quickly becomes unwieldy.
ys> show -colors -width -stretch -long -prefix delta -format pdf @delta
<code>-prefix</code> sets the path and basename of the file it is saved to. You may want to explore the flags of this command and make it your own.
You can print the manpage of a command using:
ys> help <cmd>
==== Elaboration ====
Equipped with a nice script to print our module we can move on to elaboration.
{{VLSItaskhead|4}}
* Explore the Verilog output of your module. Which parts have already been elaborated and which haven’t?
* Let [[Yosys]] check and clean up the design hierarchy:
ys> hierarchy -check -top croc_chip
: If it can’t find certain modules, technology macros or other things, it will tell you here.
* Now we want a fully elaborated design, so convert the always blocks to a structural representation:
ys> proc
* Again explore the output using the methods described above. What changed?
{{VLSItaskfoot}} 
==== High-Level optimization ====
You may have noticed that all the cells we currently have in our design start with a <code>$</code> and the names are all lower-case. This is characteristic of the high-level (or abstract) representation in [[Yosys]]. High-level because single cells can represent large and complex operations such as division.
The most common optimization commands in Yosys start with <code>opt_*</code>. There is also the larger <code>opt</code> pass, which executes a series of <code>opt_*</code> sub-commands. The four most important ways to call <code>opt</code> are:
* '''<code>opt</code>''': The pass with the default settings
* '''<code>opt -noff</code>''': Same as above but does not perform any flip-flop related optimizations (integrating enable signals, constant removal, etc.)
* '''<code>opt -fast</code>''': Runs an alternative sequence of sub-commands that may not be quite as good but doesn’t take as long
* '''<code>opt -full</code>''': Adds some additional optimizations
As a rough guideline the closer we get to a gate-level representation the more cells we will have in our design and the costlier the commands are. So early on you can use the more powerful commands anytime you wish but later in the synthesis process you will want to use them a bit more sparingly.
Let's run some optimizations on the current representation.
{{VLSItaskhead|5}}
* Execute the following optimization commands
ys> opt_expr
ys> opt_clean
ys> opt -noff
ys> fsm
ys> opt
: Here we also have the <code>fsm</code> command; it finds, extracts and optimizes state machines.
* Let's continue optimizing:
ys> wreduce
ys> peepopt
ys> opt_clean
ys> opt -full
: <code>wreduce</code> reduces the width (number of bits) of operations, <code>peepopt</code> performs some operator-specific optimizations and then we again perform general optimizations.
{{VLSItaskfoot}} 
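To get a feeling for why reducing the width of operations pays off, consider an adder that is declared 32 bits wide but whose operands only ever carry small values. The Python sketch below computes how many result bits such an adder really needs. It reasons about value ranges, which is a deliberate simplification chosen for illustration and should not be read as a description of how <code>wreduce</code> is implemented.

```python
def significant_width(value_max):
    """Number of bits needed to represent unsigned values up to value_max."""
    return max(1, value_max.bit_length())


def reduced_adder_width(declared_width, a_max, b_max):
    """Width an unsigned adder really needs when the operand ranges are known.

    The result can never need more bits than declared, because the original
    operation would have truncated it to `declared_width` anyway.
    """
    needed = significant_width(a_max + b_max)
    return min(declared_width, needed)


# A 32-bit adder whose inputs are an 8-bit and a 4-bit value:
# the sum is at most 255 + 15 = 270, which fits in 9 bits.
print(reduced_adder_width(32, 255, 15))
```

Every bit removed at this abstraction level saves a full-adder cell (and its wiring) once the operation is later mapped to gates.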
==== Mapping ====
At this point of the flow we will start mapping high-level cells to lower level implementations. Every arithmetic operation can be implemented in a number of different ways (architectures). Currently, Yosys mostly picks a good compromise and sticks with it. Choosing different architectures depending on the circumstances is one of the next big things being worked on.
{{VLSItaskhead|6a}}
The first two of the following commands are mapping commands.
ys> booth
ys> alumacc
ys> share
ys> opt
<code>booth</code> implements multipliers using Booth-encoding (we don’t have any real multipliers) and <code>alumacc</code> collects various multiply, add and subtract operations together into <code>$alu</code> and then <code>$macc</code> (multiply-add) cells, increasing resource sharing.
{{VLSItaskfoot}} 
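If you have not seen Booth encoding before, the following Python sketch shows the textbook radix-4 variant: the multiplier is recoded into digits from the set {-2, -1, 0, 1, 2}, so only half as many partial products need to be added. This is a generic illustration of the technique; we make no claim that the <code>booth</code> pass uses exactly this formulation.

```python
def booth_radix4_digits(multiplier, width):
    """Recode a signed `width`-bit multiplier into radix-4 Booth digits.

    `multiplier` must fit into `width` bits in two's complement. Each digit
    is in {-2, -1, 0, 1, 2}; digit i has the weight 4**i.
    """
    if width % 2:
        width += 1  # sign-extend to an even number of bits
    # Python's >> on negative integers is arithmetic, so this yields
    # the sign-extended two's-complement bits.
    bits = [(multiplier >> i) & 1 for i in range(width)]
    digits = []
    previous = 0  # implicit zero bit to the right of the LSB
    for i in range(0, width, 2):
        digits.append(bits[i] + previous - 2 * bits[i + 1])
        previous = bits[i + 1]
    return digits


def booth_multiply(multiplicand, multiplier, width):
    """Multiply by summing one shifted partial product per Booth digit."""
    product = 0
    for i, digit in enumerate(booth_radix4_digits(multiplier, width)):
        product += (digit * multiplicand) << (2 * i)
    return product


print(booth_radix4_digits(7, 4))  # 7 = -1 + 2*4
print(booth_multiply(5, -3, 4))
```

In hardware, multiplying by a digit only requires selecting zero, the multiplicand or its shifted version, and optionally negating it, which is why the reduced number of partial products can lead to a faster multiplier.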
<div class="alert alert-warning" role="alert">
'''Note:'''
At this point you may notice that <code>booth</code> and <code>alumacc</code> do not work well together since both commands turn multipliers into other cells. This will be improved in the future. As a general rule using <code>booth</code> can result in a faster design but using only <code>alumacc</code> can reduce the area.
</div>
{{VLSItaskhead|6b}}
* Next you need to map memories to flip-flops and then optimize unnecessary flip-flops away.
ys> memory
ys> opt -fast
ys> opt_dff -sat -nodffe -nosdff
ys> share
ys> opt -full
ys> clean -purge
{{VLSItaskfoot}} 
At this point we are about to use the most powerful mapping command in [[Yosys]]. <code>techmap</code> can take a Verilog implementation of any cell currently in [[Yosys]] and replace these cells with the implementation described in the Verilog mapping file. It has a default mapping file (which we will use) but if you are not happy with one of the architectural choices, you can use any other architecture and run it before using the default <code>techmap</code> mapper.
For us this means the abstraction level of our representation is about to change drastically. Currently it contains high-level concepts like arithmetic operators but after <code>techmap</code> they will be implemented using generic gates.
There is also another command called <code>extract</code>; it can match subcircuits and replace them with a custom cell you define. These two commands together can be very useful if you have some optimized implementation and want to use it in your design. There is a [https://yosyshq.readthedocs.io/projects/yosys/en/latest/using_yosys/synthesis/extract.html tutorial in the documentation].
{{VLSItaskhead|7}}
* Before running <code>techmap</code>, make sure you have the <code>stat</code> and <code>show</code>/<code>write_verilog</code> output copied somewhere so you can compare it afterwards.
* Run the following commands (this might take a few minutes):
ys> techmap
ys> opt -fast
ys> clean -purge
* Compare the type and number of cells used from before <code>techmap</code> to after
{{VLSItaskfoot}} 
In the main flow the next step would be to ''flatten'' the design. This means the module instantiations are replaced with a copy of the content of the module. You are able to keep certain modules by using <code>setattr -set keep_hierarchy 1 module|instance </code>.
Flattening more of the design makes it harder to debug and write constraints but it increases the potential for additional optimizations.
Currently [[Yosys]] does not perform any cross-boundary optimization (propagating information like logic functions through the instantiation of a module). This makes flattening as much as possible even more important.
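Conceptually, flattening is a recursive inlining of module contents with hierarchical instance names. The Python sketch below models a design as a dictionary from module name to a list of cells and inlines everything except the module types you ask it to keep. Connectivity is left out entirely, and the data structure and example design are hypothetical; the sketch only shows the bookkeeping of names and cell types.

```python
def flatten(modules, top, keep=()):
    """Return the cells of `top` with all sub-module instances inlined.

    `modules` maps a module name to a list of (instance_name, cell_type)
    pairs. A cell type that is itself a key of `modules` is a sub-module
    instance; anything else is treated as a primitive cell. Types listed
    in `keep` stay as instances, similar in spirit to keep_hierarchy.
    """
    flat = []
    for inst, cell_type in modules[top]:
        if cell_type in modules and cell_type not in keep:
            for sub_inst, sub_type in flatten(modules, cell_type, keep):
                flat.append((f"{inst}.{sub_inst}", sub_type))
        else:
            flat.append((inst, cell_type))
    return flat


# Hypothetical three-level design.
modules = {
    "top": [("u_cnt0", "counter"), ("u_cnt1", "counter"), ("u_and", "AND2")],
    "counter": [("u_add", "adder"), ("q_reg", "DFF")],
    "adder": [("u_xor", "XOR2"), ("u_carry", "AND2")],
}

for name, cell_type in flatten(modules, "top", keep=("adder",)):
    print(name, cell_type)
```

Note how each instance of <code>counter</code> receives its own copy of the contents. This duplication is what allows later optimizations to treat the two copies differently.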
{{VLSItaskhead|8}}
* You probably want to continue following your module, so let's make sure we keep it: <code>setattr -set keep_hierarchy 1 t:your_module</code>
Here <code>your_module</code> is the full name of your module, so for example <code>t:delta_counter__10356954905401074824</code>.
* Now let's flatten the rest:
ys> flatten
ys> clean -purge
* You can run <code>hierarchy</code> to make sure your module is still there (it first shows the kept modules, then the ones being removed).
* At this point it can be a good idea to split buses on module ports into individual nets and make sure flip-flops and other cells have usable names. This is done for compatibility reasons (<code>splitnets</code>) and to make it easier to debug and write constraints (<code>rename</code> and <code>autoname</code>).
ys> splitnets -ports -format __v
ys> rename -wire -suffix _reg t:*DFF*
ys> autoname t:*DFF* %n
ys> clean -purge
{{VLSItaskfoot}} 
==== ABC Logic Optimization & Mapping ====
[[Yosys]] can internally use another tool called [[ABC]] to run strong logic optimization algorithms and map to standard cells.
Similarly to [[Yosys]], [[ABC]] also requires a script describing which commands it should execute. There are a bunch of different ABC scripts out there, for example the default scripts used in [[Yosys]], the area- and speed-oriented scripts of [[OpenROAD]]-flow-scripts, and the scripts generated by OpenLANE.
Today you are going to use the [[ABC]] script used to synthesize the [https://arxiv.org/pdf/2405.03523 open-source, Linux-capable Basilisk].
The script itself is documented in an [https://arxiv.org/pdf/2405.04257 IWLS contribution]. The shortest possible description is that it uses a logic optimization technique called ''Lazy Man’s Synthesis'', which cuts your design into small functions and then replaces them with implementations from a pre-computed table. This is followed by mapping the design to the standard cells.
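The table-lookup idea can be illustrated with a very small Python sketch: compute the truth table of a small function cut out of the design and check whether a pre-computed table contains a cheaper implementation of the same function. The table contents, the cost numbers and the function names are invented for this illustration and are far simpler than what [[ABC]] actually does.

```python
from itertools import product


def truth_table(func, num_inputs):
    """Truth table of a small Boolean function as a tuple of output bits."""
    return tuple(func(*bits) & 1 for bits in product((0, 1), repeat=num_inputs))


# Hypothetical pre-computed table: truth table -> (implementation, gate count).
PRECOMPUTED = {
    truth_table(lambda a, b: a & b, 2): ("AND2(a, b)", 1),
    truth_table(lambda a, b: a | b, 2): ("OR2(a, b)", 1),
    truth_table(lambda a, b: a ^ b, 2): ("XOR2(a, b)", 1),
}


def replace_cut(func, num_inputs, current_cost):
    """Return a cheaper known implementation of `func`, or None if there is none."""
    entry = PRECOMPUTED.get(truth_table(func, num_inputs))
    if entry is not None and entry[1] < current_cost:
        return entry
    return None


# A clumsy XOR built from OR, AND, NOT and AND (assumed cost: 4 gates).
clumsy_xor = lambda a, b: (a | b) & ~(a & b)
print(replace_cut(clumsy_xor, 2, current_cost=4))
```

Because the lookup is keyed by the function itself and not by its current structure, it does not matter how awkwardly the logic was written originally.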
{{VLSItaskhead|9}}
* Before calling [[ABC]], you first need to map the sequential elements to standard cells:
ys> dfflibmap -liberty ../ihp13/pdk/ihp-sg13g2/libs.ref/sg13g2_stdcell/lib/sg13g2_stdcell_typ_1p20V_25C.lib
: You can find the above path in the <code>scripts/init_tech.ys</code> script.
* Now we can call [[ABC]]:
ys> abc -D 10000 -script scripts/abc-opt.script -liberty ../ihp13/pdk/ihp-sg13g2/libs.ref/sg13g2_stdcell/lib/sg13g2_stdcell_typ_1p20V_25C.lib -showtmp
ys> clean -purge
: The <code>-D ####</code> flag sets the target period in picoseconds. It is mostly used in the final mapping to standard cells and in the buffering and resizing at the end of the [[ABC]] script.
* Congratulations! At this point you have a netlist. Before we write it out, we need to deal with constant values:
ys> setundef -zero
ys> hilomap -singleton -hicell sg13g2_tiehi L_HI -locell sg13g2_tielo L_LO
ys> write_verilog -noattr -noexpr -nohex -nodec netlist.v
{{VLSItaskfoot}} 


=== A last note ===
In the last section we simplified things a bit. You don’t need to map the flip-flops before calling [[ABC]]. If you map the flip-flops before calling [[ABC]], it will only receive the combinational elements, even if you use <code>abc -dff</code>. If you want to perform sequential optimizations in [[ABC]], you need to execute [[ABC]] before running <code>dfflibmap</code> but sequential synthesis currently has some major downsides/problems.


The biggest one is the following:


Even in the combinational case, [[ABC]] works strictly per module, which decreases the optimization potential since the module boundaries cannot be changed. In the sequential case this problem gets even worse because [[ABC]] will receive each clock domain in each module separately. This might seem reasonable (and it is to some extent), but the problem comes from what is considered to be a clock domain. Every unique combination of enable, reset and clock wire going to a flip-flop is a clock domain. The clock is obvious, and the reset is also mostly the same among all flip-flops, but in one module you may have many different enable signals for different flip-flops.
Another big downside is that you lose the naming of the flip-flops and you often cannot recover them, this is a major problem in peripherals where you might need the names to write constraints.
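The effect of this definition of a clock domain is easy to see with a small Python sketch that groups flip-flops by their (clock, reset, enable) wires. The flip-flop and wire names are made up; the point is only that a handful of different enable signals already fragments a single-clock module into several small groups.

```python
from collections import defaultdict


def group_by_domain(flip_flops):
    """Group flip-flops by their (clock, reset, enable) wire combination."""
    domains = defaultdict(list)
    for name, clock, reset, enable in flip_flops:
        domains[(clock, reset, enable)].append(name)
    return dict(domains)


# Hypothetical module: one clock, one reset, but three different enables.
flip_flops = [
    ("count_q_0", "clk_i", "rst_ni", "count_en"),
    ("count_q_1", "clk_i", "rst_ni", "count_en"),
    ("state_q", "clk_i", "rst_ni", "state_en"),
    ("valid_q", "clk_i", "rst_ni", "valid_en"),
]

for domain, members in group_by_domain(flip_flops).items():
    print(domain, members)
```

Here four flip-flops on the same clock end up in three separate domains, each of which would be optimized on its own.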


So in practice, you first need to prepare the generic flip-flops in [[Yosys]] by mapping enable signals and synchronous resets into soft logic. There are some other things you need to deal with for compatibility reasons; they are explained in [https://github.com/pulp-platform/cheshire-ihp130-o/blob/basilisk-dev/target/ihp13/yosys/scripts/yosys_synthesis.tcl#L172 Basilisk's synthesis script]. You probably also do not want to run it on your whole design. Instead, you may want to consider splitting your design into a synchronous part where you want sequential optimization and everything else (clock domain crossings, peripherals and so on) where you do not want it. Then you run synthesis twice, once with each option, and blackbox the parts you do not want in the respective run.


----

Latest revision as of 08:58, 18 August 2024


Overview

In this exercise, you will learn how to go from an implementation in a hardware description language (HDL) like Verilog, SystemVerilog or VHDL to a functionally equivalent gate-level netlist (a Verilog file containing only instantiations of standard cells and macros, and the wires connecting them). The cells and macros are provided by the foundry in a collection called the standard cell library, which contains all the gates that can be manufactured in a fully digital design for the target technology. The standard cell library, together with all other technology-specific data needed to design a chip, forms the process design kit (PDK). In our case this is the open-source IHP130 PDK available on GitHub.

The process of going from HDL to a netlist is called 'synthesis'. From a more abstract point of view, it is the last step of what is commonly called the 'front end' of the design process.

The leading open-source synthesis tool is called Yosys. In this exercise you will learn the needed steps to perform synthesis, both from a more general perspective and also specific to Yosys. In doing so, you will be guided through the whole process of synthesizing Croc-SoC. Figure 1 gives a graphical overview of the subject of this exercise.


Figure 1: Excerpt of the ASIC design flow with the subject of this exercise highlighted.


If you want to know more about how Yosys came to be and/or see the synthesis process using simple examples, you can visit Yosys’ online documentation.


About the style

We will use a number of different styles to identify different types of actions as shown below:

Student Task
Parts of the text that have a gray background, like the current paragraph, indicate steps required to complete the exercise.

Actions that require you to select a specific menu will be shown like the following: menu → sub-menu → sub-sub-menu

Whenever there is an option or a tab that can be found in the current view/menu we will use a BUTTON to indicate such an option.

Throughout the exercise you will be asked to enter certain commands using the command-line.

sh> command to be entered on the Linux command line

Some of the commands will be entered on the command line of a specific tool, and will be highlighted by a different prompt in this case ys> for Yosys.

ys> some_yosys_command

Getting Started

Student Task 1 (Setup)

Set up the working directory for this exercise by calling our install script and changing to the newly created directory:

sh> /home/efcl_004fs24/ex04_synthesis/install.sh
sh> cd ex04_synthesis/croc-soc
sh> source tardis_env.sh

Synthesis: An Overview

Before delving into Yosys synthesis, let's first look at the steps an abstract synthesis tool has to perform. Later on, you can try to match the different commands run in Yosys to one of these steps.

Parse Design
The first step is to load the behavioral description, in our case SystemVerilog, into the synthesis tool. The synthesis tool needs to parse the HDL language features and convert it to an internal representation. Usually this involves first parsing the design into an abstract syntax tree (AST), a representation that closely resembles how the HDL language is structured. Then in a second step the AST is converted into a representation more suited to synthesis. In this representation we are no longer interested in how a certain operation is implemented, instead we care about what it does and how it connects to other operations. At this point the design is still behavioral (it contains code-like features which can implement something in an abstract way).
Elaborate Design
The next step is to take the behavioral design and convert it to a structural design. While a behavioral model may contain abstract things like loops or conditional branching, a structural model is very close to a netlist. It contains only cells (arithmetic operations, finite-state machines, etc.) and connections between cells. Any parameters in the design (e.g. the size of a data bus) can be set to their final values, and the corresponding values will be propagated to the subdesigns during elaboration.
Logic Optimization
Using different techniques, the structural representation of the design can be optimized. This means the length of the longest paths is reduced, redundant operations are removed and their outputs shared, constant bits are propagated and so on.
Mapping
Interwoven with the logic optimization are mapping steps. In mapping we take one or more cells and convert them into another representation. In practice this usually means we start with cells representing very high-level operations (like an arbitrary arithmetic operation) and, using a mapping, transform them to a lower-level representation (like logic gates). During the last steps of synthesis we must map the internal representation to the standard cells provided by the PDK; this is called technology mapping.

In commercial tools you would also supply a set of constraints describing the timings your design needs to meet (for example different clock nets might use a different clock period). During the whole process, the tool tries to attain the goals you set. It tries to fulfill the timing constraints you set and on all non-critical paths it is then able to reclaim logic area by finding more compact representations of the same function (which usually means they are slower).

Currently, the open-source tools do not support constraints beyond a single target frequency so we will not further explore this aspect.

Yosys Synthesis

In the previous section you have learned what steps are necessary in order to synthesize a design described in SystemVerilog. We will now go through the synthesis process step by step using Yosys.

But first, a caveat: SystemVerilog is a very complex HDL and most open-source tools currently support only part of its language features, see the chipsalliance test suite. Yosys currently supports only a small subset of SystemVerilog, while our PULP platform is written almost entirely in SystemVerilog. To bridge this gap, we use a pipeline of four tools to go stepwise from SystemVerilog to Verilog:

Bender
Manages the dependencies (repos) and creates the necessary scripts with all file paths for further processing.
Morty
Collects the files from Bender and pickles them all into one single file and compile context (it also resolves pre-processor instructions)
SVase
Simplifies the SystemVerilog code by unrolling generate statements and propagating all parameters to their local contexts (likely the most complicated part of SystemVerilog elaboration, in our experience only Slang is able to consistently get this right; SVase builds on Slang).
SV2V
Handles the bulk of converting SystemVerilog features to Verilog.
Student Task 2

For your convenience we already ran this pipeline; you can re-run it at any point using make pickle.

  • Open pickle/croc_morty.sv and select a module with parameters (e.g. module ibex_core). We recommend you just open the entire croc-soc directory in Sublime Text:
sh> sublime_text .
  • Open pickle/croc_svase.sv and search for the same module. What do you observe?
  • Again in pickle/croc_morty.sv, search for gen_regfile_ff. Depending on the value of a parameter, a different register file (module) is used. Search for gen_regfile_ff in pickle/croc_svase.sv. What do you observe?
  • Which step of synthesis does a change like this belong to?
It is part of elaboration, hence SVase is called a 'pre-elaborator'.
  • Finally, in pickle/croc_svase.sv and pickle/croc_sv2v.v search for p_encode and compare the two.

Parsing

Now let us start with the actual synthesis in Yosys.

During this exercise, you will execute a lot of commands, and sometimes things might fail so that you have to redo everything. Yosys does have a command history (↑), but usually you will want to execute a bunch of commands, so we recommend you create your own script and add the commands there. Yosys can read two types of script files. The first just contains Yosys commands (file ending .ys) and you can run it using the command script in Yosys. The second is a Tcl script (file ending .tcl), which is able to access Yosys commands from the namespace Yosys. You can run it inside Yosys using the command tcl.

Student Task 3
  • First, let's start the provided synthesis flow so we can inspect the results later if we run out of time. Open a second terminal and navigate to the same directory you are currently in (pwd shows you where you are), then start the flow:
sh > make run-yosys
  • In the other terminal, source the environment script, go into the synth directory and start Yosys:
sh> source tardis_env.sh 
sh> cd synth
sh> yosys-0.35 yosys
  • First you need to load our technology data, which we have already prepared for you:
ys > script scripts/init_tech.ys
  • Now you can load the design into Yosys
ys > read_verilog ../pickle/croc_sv2v.v
  • Closely read the first few lines of the Yosys output. Which internal representations did Yosys just use, and what could be their purpose?
  • In the following steps we will want to closely follow only one module to see what exactly Yosys does. We recommend you follow one of the delta_counter__* modules as they are relatively simple and contain some control logic, arithmetic operations and a register (flip-flops).
  • Open your selected module in one of the SystemVerilog/Verilog files in the pickle directory. We recommend pickle/croc_svase.sv since all parameters are propagated but the description is still pretty high-level.
  • What does your selected module do?
  • To quickly find your module in Yosys, you can save a selection for it using:
ys > select -set delta t:<your_module> %M
Here your_module is the full name of your module, so the selection pattern is for example t:delta_counter__10356954905401074824.

This saves a selection containing only your module under the name delta. If you want to run a command only on your module, you then use:

 ys > command @delta

Print and Display

At this point you probably want to know what exactly Yosys did to your module. In fact, we recommend you check this regularly in the following tasks (remember to put the commands in a script for easy reuse).

There are a few commands in Yosys which are useful to debug problems you have. The simplest of them is to simply write out Verilog:

ys> select @delta
ys> write_verilog -norename -noexpr -noattr -selected delta.v
ys> select -clear

Unless you know exactly what the above does, we recommend you use exactly this combination of commands and flags and put it into a script (e.g. scripts/print.ys).

The next command prints statistics, including how many cells of each type are in the design.

ys > stat @delta

And finally Yosys can generate schematics showing the selected network visually. This is something you should only use on smaller selections as it quickly becomes unwieldy.

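ys> show -colors 1 -width -stretch -long -prefix delta -format pdf @delta

-colors expects an integer seed for the random wire colors (the value 1 here is arbitrary; a seed of 0 disables coloring). -prefix sets the path and basename of the file the schematic is saved to. You may want to explore the flags of this command and make it your own.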

You can print the manpage of a command using:

ys > help <cmd>

Elaboration

Equipped with a nice script to print our module we can move on to elaboration.


Student Task 4
  • Explore the Verilog output of your module. Which parts have already been elaborated and which have not?
  • Let Yosys check and clean up the design hierarchy:
ys > hierarchy -check -top croc_chip
If it cannot find certain modules, technology macros or other things, it will tell you here.
  • Now we want a fully elaborated design. Convert the always blocks to a structural representation:
ys > proc
  • Again explore the output using the methods described above. What changed?

High-Level optimization

You may have noticed that all the cells we currently have in our design start with a $ and that their names are all lower-case. This is characteristic of the high-level (or abstract) representation in Yosys. It is called high-level because a single cell can represent a large and complex operation such as division.

The most common optimization commands in Yosys start with opt_. There is also the larger opt pass, which executes a series of opt_* sub-commands. The four most important ways to call opt are:

  • opt: The pass with the default settings
  • opt -noff: Same as above but does not perform any flip-flop related optimizations (integrating enable signals, constant removal etc.)

  • opt -fast: Runs an alternative sequence of sub-commands that may not be quite as good but doesn’t take as long
  • opt -full: Adds some additional optimizations

As a rough guideline, the closer we get to a gate-level representation, the more cells we will have in our design and the costlier the commands become. So early on you can use the more powerful commands whenever you wish, but later in the synthesis process you will want to use them more sparingly.

Let's run some optimizations on the current representation.


Student Task 5
  • Execute the following optimization commands
ys> opt_expr 
ys> opt_clean 
ys> opt -noff 
ys> fsm
ys> opt
Here we also have the fsm command, which finds, extracts and optimizes state machines.
  • Let's continue optimizing:
ys> wreduce 
ys> peepopt 
ys> opt_clean 
ys> opt -full
wreduce reduces the width (number of bits) of operations, peepopt performs some operator-specific optimizations and then we again perform general optimizations.
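The effect of width reduction can be illustrated with a small Python sketch: if the upper bits of both operands of an adder are known to be zero, the adder only needs enough bits for the significant part plus one carry bit. This is a simplified model of the idea with made-up function names, not the algorithm that wreduce implements.

```python
def significant_bits(width, zero_msbs):
    """Number of bits that can actually change in an unsigned operand."""
    return max(width - zero_msbs, 1)


def reduced_add_width(width, a_zero_msbs, b_zero_msbs):
    """Bits needed for an unsigned add whose operands have constant-zero MSBs."""
    a_bits = significant_bits(width, a_zero_msbs)
    b_bits = significant_bits(width, b_zero_msbs)
    # One extra bit holds the carry; the result never needs more than `width`.
    return min(max(a_bits, b_bits) + 1, width)


# Two 8-bit values that were zero-extended to 32 bits before being added.
print(reduced_add_width(32, 24, 24))
```

A narrower operator later maps to fewer gates, which is why this kind of reduction is worth doing before the mapping steps.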

Mapping

At this point in the flow we will start mapping high-level cells to lower-level implementations. Every arithmetic operation can be implemented in a number of different ways (architectures). Currently, Yosys mostly picks a good compromise and sticks with it. Choosing different architectures depending on the circumstances is one of the next big things being worked on.

Student Task 6a

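The first two of the following commands (booth and alumacc) are mapping commands; they are followed by resource sharing (share) and another round of optimization (opt).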

ys> booth 
ys> alumacc 
ys> share 
ys> opt

booth implements multipliers using Booth-encoding (we don’t have any real multipliers) and alumacc collects various multiply, add and subtract operations together into $alu and then $macc (multiply-add) cells, increasing resource sharing.
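For readers who have not seen Booth encoding before, the following Python sketch shows the general radix-4 variant of the technique: the multiplier is recoded into digits from the set {-2, -1, 0, 1, 2}, which halves the number of partial products. The function names are our own, and the sketch only illustrates the arithmetic idea; it makes no claim about how the booth pass builds its hardware.

```python
def booth_radix4_digits(y, width):
    """Recode a signed `width`-bit multiplier into radix-4 Booth digits."""
    if width % 2:
        width += 1  # sign-extend to an even number of bits
    digits = []
    prev = 0  # the implicit bit to the right of the LSB
    for i in range(0, width, 2):
        b0 = (y >> i) & 1        # Python's >> is arithmetic, so this also
        b1 = (y >> (i + 1)) & 1  # works for negative (two's complement) y
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x, y, width):
    """Multiply x by a signed `width`-bit y using one partial product per digit."""
    digits = booth_radix4_digits(y, width)
    # Each digit selects 0, +-x or +-2x, shifted by two bits per position.
    return sum((d * x) << (2 * i) for i, d in enumerate(digits))


print(booth_radix4_digits(7, 4))
print(booth_multiply(-3, 7, 4))
```

Because every digit is one of only five values, each partial product in hardware is just a shifted and possibly negated copy of x, or zero.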



Student Task 6b
  • Next you need to map memories to flip-flops and then optimize unnecessary flip-flops away.
ys> memory
ys> opt -fast
ys> opt_dff -sat -nodffe -nosdff 
ys> share
ys> opt -full
ys> clean -purge

At this point we are about to use the most powerful mapping command in Yosys. techmap takes a Verilog implementation of any cell type currently in the design and replaces those cells with the implementation described in the Verilog mapping file. It has a default mapping file (which we will use), but if you are not happy with one of its architectural choices, you can provide your own mapping file with a different architecture and run techmap with it before running the default techmap.

For us this means the abstraction level of our representation is about to change drastically. Currently it contains high-level concepts like arithmetic operators but after techmap they will be implemented using generic gates.
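The following Python toy model illustrates this change of abstraction level: a single high-level adder is replaced by a ripple-carry structure built only from generic two-input gates, and a tiny simulator checks that the function is preserved. The ripple-carry architecture and all net names are assumptions chosen for simplicity; the default techmap mapping file may well use a different adder architecture.

```python
from collections import Counter


def ripple_carry_adder(width):
    """Return gates as (type, output, input_a, input_b) for a `width`-bit adder."""
    gates = []
    carry = "zero"  # constant-0 carry-in
    for i in range(width):
        a, b = f"a{i}", f"b{i}"
        gates.append(("XOR", f"p{i}", a, b))            # propagate
        gates.append(("XOR", f"y{i}", f"p{i}", carry))  # sum bit
        gates.append(("AND", f"g{i}", a, b))            # generate
        gates.append(("AND", f"t{i}", f"p{i}", carry))
        gates.append(("OR", f"c{i}", f"g{i}", f"t{i}"))  # carry-out
        carry = f"c{i}"
    return gates


OPS = {
    "XOR": lambda x, y: x ^ y,
    "AND": lambda x, y: x & y,
    "OR": lambda x, y: x | y,
}


def simulate(gates, width, a, b):
    """Evaluate the gate list for the unsigned inputs a and b."""
    nets = {"zero": 0}
    for i in range(width):
        nets[f"a{i}"] = (a >> i) & 1
        nets[f"b{i}"] = (b >> i) & 1
    for gate, out, in_a, in_b in gates:  # gates are listed in evaluation order
        nets[out] = OPS[gate](nets[in_a], nets[in_b])
    return sum(nets[f"y{i}"] << i for i in range(width))


gates = ripple_carry_adder(4)
print(Counter(gate for gate, *_ in gates))
```

One abstract adder cell has turned into five gates per bit, which is the kind of growth you should expect to see when you compare the stat output before and after techmap.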

There is also another command called extract, which can match subcircuits and replace them with a custom cell you define. These two commands together can be very useful if you have an optimized implementation and want to use it in your design. There is a tutorial in the documentation.


Student Task 7
  • Before running techmap make sure you have a stat and show/write_verilog output copied somewhere so we can compare it
  • Run the following commands (this might take a few minutes)
ys > techmap
ys> opt -fast
ys > clean -purge
  • Compare the type and number of cells used from before techmap to after

In the main flow the next step would be to flatten the design. This means that module instantiations are replaced with a copy of the content of the module. You can keep certain modules intact by using setattr -set keep_hierarchy 1 module|instance.

Flattening more of the design makes it harder to debug and write constraints but it increases the potential for additional optimizations.

Currently Yosys does not perform any cross-boundary optimization (propagating information like logic functions through the instantiation of a module). This makes flattening as much as possible even more important.
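What flattening does to the hierarchy can be sketched in a few lines of Python. The toy model below only tracks instance names and cell types (wires and ports are left out), and all module names are hypothetical; the keep argument plays the role of the keep_hierarchy attribute.

```python
def flatten(modules, top, keep=()):
    """Return the cells of `top` with all non-kept submodule instances inlined."""
    flat = []
    for inst_name, cell_type in modules[top]:
        if cell_type in modules and cell_type not in keep:
            # Replace the instance by a copy of the submodule's cells and
            # prefix their names so that they stay unique.
            for sub_name, sub_type in flatten(modules, cell_type, keep):
                flat.append((f"{inst_name}.{sub_name}", sub_type))
        else:
            # Primitive cell or kept module: leave the instance as it is.
            flat.append((inst_name, cell_type))
    return flat


modules = {
    "toy_top": [("u_core", "toy_core"), ("u_cnt", "toy_counter")],
    "toy_core": [("u_alu", "toy_alu"), ("r0", "DFF")],
    "toy_alu": [("g0", "AND"), ("g1", "XOR")],
    "toy_counter": [("add0", "ADD"), ("r0", "DFF")],
}

for cell in flatten(modules, "toy_top", keep={"toy_counter"}):
    print(cell)
```

After flattening, the gates of toy_alu and toy_core sit side by side in toy_top, so an optimizer could simplify logic across the former module boundaries, while toy_counter remains a separate, easy-to-find instance.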

Student Task 8
  • You probably want to continue following your module, so let's make sure we keep it:
ys > setattr -set keep_hierarchy 1 t:your_module

Here your_module is the full name of your module, so the selection is for example t:delta_counter__10356954905401074824

  • Now let's flatten the rest:
ys > flatten
ys > clean -purge
  • You can run hierarchy to make sure your module is still there (it first shows the kept modules, then the ones being removed)
  • At this point it can be a good idea to split buses on module ports into individual nets and to make sure flip-flops and other cells have usable names. This is done for compatibility reasons (splitnets) and to make it easier to debug and write constraints (rename and autoname):
ys> splitnets -ports -format __v
ys> rename -wire -suffix _reg t:*DFF*
ys> autoname t:*DFF* %n
ys> clean -purge

ABC Logic Optimization & Mapping

Yosys can internally use another tool called ABC to run strong logic optimization algorithms and map to standard cells. Like Yosys, ABC requires a script describing which commands it should execute. There are many different ABC scripts out there, for example the default scripts used in Yosys, the area- and speed-oriented scripts of OpenROAD-flow-scripts, and the scripts generated by OpenLANE.

Today you are going to use the ABC script used to synthesize the open-source, Linux-capable Basilisk SoC.

The script itself is documented in an IWLS contribution. In short, it uses a logic optimization technique called Lazy Man’s Synthesis, which cuts your design into small functions and replaces them with implementations from a pre-computed table. This is followed by mapping the design to the standard cells.

Student Task 9
  • Before calling ABC, you first need to map the sequential elements to standard cells:
ys > dfflibmap -liberty ../ihp13/pdk/ihp-sg13g2/libs.ref/sg13g2_stdcell/lib/sg13g2_stdcell_typ_1p20V_25C.lib
You can find the above path in the scripts/init_tech.ys script.

  • Now we can call ABC:

ys> abc -D 10000 -script scripts/abc-opt.script -liberty ../ihp13/pdk/ihp-sg13g2/libs.ref/sg13g2_stdcell/lib/sg13g2_stdcell_typ_1p20V_25C.lib -showtmp
ys > clean -purge
The -D #### flag sets the target period in picoseconds. It is mostly used in the final mapping to standard cells and in the buffering and resizing at the end of the ABC script.

  • Congratulations! At this point you have a netlist. Before we write it out, we need to deal with constant values: undefined bits are set to zero and constant drivers are replaced by tie-high and tie-low cells.
ys> setundef -zero
ys> hilomap -singleton -hicell sg13g2_tiehi L_HI -locell sg13g2_tielo L_LO
ys> write_verilog -noattr -noexpr -nohex -nodec netlist.v

A last note

In the last section we simplified things a bit. You do not have to map the flip-flops before calling ABC. If you do, ABC will only receive the combinational elements, even if you use abc -dff. If you want to perform sequential optimizations in ABC, you need to run ABC before running dfflibmap, but sequential synthesis currently has some major downsides.

The biggest one is the following:

Even in the combinational case, ABC works strictly per module, which decreases the optimization potential since the module boundaries cannot be changed. In the sequential case this problem gets even worse because ABC receives each clock domain in each module separately. This might seem reasonable (and it is to some extent), but the problem comes from what is considered to be a clock domain: every unique combination of enable, reset and clock wire going to a flip-flop is a clock domain. The clock is obvious, and the reset is also mostly the same among all flip-flops, but in one module you may have many different enable signals for different flip-flops. Another big downside is that you lose the names of the flip-flops and often cannot recover them. This is a major problem in peripherals, where you might need the names to write constraints.
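The definition of a clock domain used here is easy to model. The Python sketch below groups the flip-flops of one module by their (clock, reset, enable) combination; the flip-flop and signal names are invented for the example.

```python
from collections import defaultdict


def group_by_domain(flip_flops):
    """Group flip-flop names by their (clock, reset, enable) wires."""
    domains = defaultdict(list)
    for ff in flip_flops:
        key = (ff["clk"], ff["rst"], ff["en"])
        domains[key].append(ff["name"])
    return dict(domains)


# Hypothetical flip-flops of a single module: one clock, one reset,
# but several different enable signals.
flip_flops = [
    {"name": "state_q", "clk": "clk_i", "rst": "rst_ni", "en": None},
    {"name": "count_q", "clk": "clk_i", "rst": "rst_ni", "en": "count_en"},
    {"name": "data_q", "clk": "clk_i", "rst": "rst_ni", "en": "data_valid"},
    {"name": "addr_q", "clk": "clk_i", "rst": "rst_ni", "en": "data_valid"},
]

domains = group_by_domain(flip_flops)
for key, names in domains.items():
    print(key, names)
```

Although all four flip-flops share the same clock and reset, they end up in three separate groups, so the logic handed to ABC in one piece becomes correspondingly smaller.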

So in practice, you first need to prepare the generic flip-flops in Yosys by mapping enable signals and synchronous resets into soft logic. There are some other things you need to deal with for compatibility reasons; they are explained in Basilisk's synthesis script (platform/cheshire-ihp130-o/blob/basilisk-dev/target/ihp13/yosys/scripts/yosys_synthesis.tcl#L172). You probably also do not want to run sequential optimization on your whole design. Instead, you may want to consider splitting your design into a synchronous part where you want sequential optimization and everything else (clock domain crossings, peripherals and so on) where you do not. Then you run synthesis twice, once with each option, and blackbox the parts you do not want in the respective run.




The VLSI pages are part of the open source VLSI design course offered by the Integrated Systems Laboratory of ETH Zürich, by Luca Benini and Frank K. Gürkaynak. See full list of contributors.