[bitmanip] Add ZBT Instruction Group
This commit implements the Bit Manipulation Extension ZBT instruction
group: cmix, cmov, fsr[i] and fsl. These instructions depend on
three ALU operands. Completion of these instructions takes 2 clock
cycles. Additionally, the rotation shifts rol and ror are made
multicycle instructions.

All multicycle instructions take exactly two cycles to complete.

Architectural additions:

        * Multicycle Stage Register in ID stage.
                multicycle_op_stage_reg

        * Decoder generates alu_multicycle signal, to stall pipeline

        * For all ternary instructions:
                1. cycle: connect alu operands a and b to rs1 and rs2
                          respectively
                2. cycle: connect operands a and b to rs3 and rs2
                          respectively

        * Reduce the physical size of the shifter from 64 bit to 33
                bit: 32-bit operand + 1 bit for arithmetic / one-shift

        * Make rotation shifts multicycle instructions.
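
The "32-bit operand + 1 bit" shifter noted above can be illustrated with a small Python model. This is only a sketch under assumptions: the function name and the `fill` parameter are hypothetical, "one-shift" is read here as shifting in ones, and the actual RTL may be structured differently.

```python
MASK32 = 0xFFFFFFFF

def shift_right_33(operand, shamt, fill):
    """Right-shift a 32-bit operand through a 33-bit signed shifter.

    fill is the extra top bit (hypothetical name): 0 for a logical shift,
    the sign bit for an arithmetic shift, 1 for a shift that fills with ones.
    """
    ext = ((fill & 1) << 32) | (operand & MASK32)
    if fill & 1:
        ext -= 1 << 33  # reinterpret the 33-bit value as negative
    # Python's >> on negative ints is arithmetic, so the top bit replicates.
    return (ext >> (shamt & 31)) & MASK32

print(hex(shift_right_33(0x80000000, 4, 0)))  # logical shift
print(hex(shift_right_33(0x80000000, 4, 1)))  # arithmetic shift of a negative value
```

A single extra bit is enough because a signed right shift replicates it into every vacated position.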

Instruction Details:
        * cmov:
                1. store operand a (rs1) in stage reg.
                2. return stage reg output (rs1) or rs3.

                if rs2 != 0 the output (rs1) is already known in the
                  first cycle. -> variable latency implementation is
                  possible.
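
A behavioural sketch of the two cmov cycles in Python, assuming the draft semantics rd = rs2 ? rs1 : rs3. The function name is hypothetical and this is a software model, not the RTL.

```python
MASK32 = 0xFFFFFFFF

def cmov_two_cycle(rs1, rs2, rs3):
    # Cycle 1: ALU operand a carries rs1; latch it in the stage register.
    stage_reg_q = rs1 & MASK32
    # Cycle 2: ALU operand a carries rs3; rs2 selects between the two.
    if (rs2 & MASK32) != 0:
        return stage_reg_q
    return rs3 & MASK32

print(hex(cmov_two_cycle(0xAAAA, 1, 0x5555)))
print(hex(cmov_two_cycle(0xAAAA, 0, 0x5555)))
```

When rs2 is non-zero the result equals the value latched in cycle 1, which is why a variable-latency variant would be possible.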

        * cmix:
                1. store rs1 & rs2 in stage reg
                2. return stage_reg_q | (rs3 & ~rs2)

                reusing bwlogic from zbb
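
The cmix split can be checked with a short Python model. It assumes the draft definition cmix = (rs1 & rs2) | (rs3 & ~rs2), i.e. each result bit comes from rs1 where rs2 is set and from rs3 elsewhere. Function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def cmix_bitwise(rs1, rs2, rs3):
    """Reference: pick each bit from rs1 where rs2 is 1, else from rs3."""
    result = 0
    for i in range(32):
        src = rs1 if (rs2 >> i) & 1 else rs3
        result |= ((src >> i) & 1) << i
    return result

def cmix_two_cycle(rs1, rs2, rs3):
    # Cycle 1: operands a/b are rs1/rs2; latch rs1 & rs2.
    stage_reg_q = rs1 & rs2 & MASK32
    # Cycle 2: operands a/b are rs3/rs2; merge in the rs3 bits where rs2 is 0.
    return (stage_reg_q | (rs3 & ~rs2)) & MASK32

print(hex(cmix_two_cycle(0xFFFF0000, 0x00FF00FF, 0x12345678)))
```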

        * rol/ror: (here: ror)
              shift_amt       = rs2 & 31;
              shift_amt_compl = (32 - shift_amt) & 31
              1. store (rs1 >> shift_amt) in stage reg
              2. return (rs1 << shift_amt_compl) | stage_reg_q
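
The ror steps above translate directly into a Python model, compared here against a single-step rotate. Function names are hypothetical and this is a behavioural sketch, not the RTL.

```python
MASK32 = 0xFFFFFFFF

def ror_reference(x, n):
    """Single-step 32-bit rotate right."""
    n &= 31
    return ((x >> n) | (x << (32 - n))) & MASK32

def ror_two_cycle(rs1, rs2):
    shift_amt = rs2 & 31
    shift_amt_compl = (32 - shift_amt) & 31
    # Cycle 1: right-shift by shift_amt and latch the partial result.
    stage_reg_q = (rs1 & MASK32) >> shift_amt
    # Cycle 2: left-shift by the complement and OR with the latched value.
    return ((rs1 << shift_amt_compl) | stage_reg_q) & MASK32

print(hex(ror_two_cycle(0x12345678, 8)))
```

Masking the complement with 31 makes a shift amount of 0 fall out naturally: both cycles then produce rs1 itself.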

        * fsl/fsr:
        For funnel shifts, the order of applying the shift
        amount or its complement is determined by bit [5] of
        shift_amt. Pseudocode for fsl:

              shift_amt       = rs2 & 63
              shift_amt_compl = (32 - shift_amt[4:0])

              1. if (shift_amt >= 33):
                    store (rs1 >> shift_amt_compl[4:0]) in stage reg
                 else if (shift_amt > 0 && shift_amt <= 31):
                    store (rs1 << shift_amt[4:0]) in stage reg
                 else if (shift_amt == 32 || shift_amt == 0):
                    store rs1 in stage reg

              2. if (shift_amt >= 33):
                    return stage_reg_q | (rs3 << shift_amt[4:0])
                 else if (shift_amt > 0 && shift_amt <= 31):
                    return stage_reg_q | (rs3 >> shift_amt_compl[4:0])
                 else if (shift_amt == 32):
                    return rs3
                 else if (shift_amt == 0):
                    return rs1
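
The funnel-shift steps above left-shift rs1 for small shift amounts and take the remaining bits from rs3. The Python sketch below models that two-cycle split and compares it with a single-step funnel shift left (fsl) as understood from the v0.92 draft. Function names are hypothetical and the reference is an assumption about the draft semantics, not a copy of the RTL.

```python
MASK32 = 0xFFFFFFFF

def fsl_reference(rs1, rs2, rs3):
    """Single-step funnel shift left, following the draft definition."""
    shamt = rs2 & 63
    a, b = rs1, rs3
    if shamt >= 32:
        shamt -= 32
        a, b = rs3, rs1
    if shamt == 0:
        return a & MASK32
    return ((a << shamt) | (b >> (32 - shamt))) & MASK32

def fsl_two_cycle(rs1, rs2, rs3):
    shift_amt = rs2 & 63
    lo = shift_amt & 31            # shift_amt[4:0]
    compl = (32 - lo) & 31         # shift_amt_compl[4:0]
    # Cycle 1: shift rs1 and latch it.
    if shift_amt >= 33:
        stage_reg_q = rs1 >> compl
    elif 0 < shift_amt <= 31:
        stage_reg_q = (rs1 << lo) & MASK32
    else:                          # shift_amt is 0 or 32
        stage_reg_q = rs1
    # Cycle 2: shift rs3 the other way and merge.
    if shift_amt >= 33:
        return (stage_reg_q | (rs3 << lo)) & MASK32
    if 0 < shift_amt <= 31:
        return stage_reg_q | (rs3 >> compl)
    return rs3 if shift_amt == 32 else rs1

print(hex(fsl_two_cycle(0x12345678, 8, 0xABCDEF01)))
```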

Signed-off-by: ganoam <[email protected]>
ganoam authored and vogelpi committed Apr 16, 2020
1 parent db6f8f0 commit 4cb77b8
Showing 12 changed files with 682 additions and 313 deletions.
4 changes: 2 additions & 2 deletions doc/instruction_decode_execute.rst
@@ -64,9 +64,9 @@ Other blocks use the ALU for the following tasks:
 * It computes memory addresses for loads and stores with a Reg + Imm calculation
 * The LSU uses it to increment addresses when performing two accesses to handle an unaligned access

-Support for the RISC-V Bitmanipulation Extension is enabled via the parameter ``RV32B``.
+Support for the RISC-V Bitmanipulation Extension (Document Version 0.92, November 8, 2019) is enabled via the parameter ``RV32B``.
 This feature is *EXPERIMENTAL* and the details of its impact are not yet documented here.
-Currently only the Zbb base extension is implemented.
+Currently the Zbb and Zbt sub-extensions are implemented.
 All instructions are carried out in a single clock cycle.

 .. _mult-div:
17 changes: 8 additions & 9 deletions lint/verilator_waiver.vlt
@@ -28,17 +28,17 @@ lint_off -rule DECLFILENAME -file "*/rtl/ibex_register_file_ff.sv"
 lint_off -rule DECLFILENAME -file "*/rtl/ibex_register_file_latch.sv"
 lint_off -rule DECLFILENAME -file "*/rtl/ibex_register_file_fpga.sv"

-// Bits of signal are not used: fetch_addr_n[0]
+// Bits of signal are not used: shift_amt_compl[5]
 // cleaner to write all bits even if not all are used
-lint_off -rule UNUSED -file "*/rtl/ibex_if_stage.sv" -match "*'fetch_addr_n'[0]*"
+lint_off -rule UNUSED -file "*/rtl/ibex_alu.sv" -match "*'shift_amt_compl'[5]*"

-// Signal is not used, if RVB == 0: shift_result_ext_rvb
-// Needed if RVB == 1.
-lint_off -rule UNUSED -file "*/rtl/ibex_alu.sv" -match "*'shift_result_ext_rvb'*"
+// Bits of signal are not used: shift_result_ext[32]
+// cleaner to write all bits even if not all are used
+lint_off -rule UNUSED -file "*/rtl/ibex_alu.sv" -match "*'shift_result_ext'[32]*"

-// Signal is not used, if RVB == 1: shift_result_ext
-// Needed if RVB == 0.
-lint_off -rule UNUSED -file "*/rtl/ibex_alu.sv" -match "*'shift_result_ext'*"
+// Bits of signal are not used: fetch_addr_n[0]
+// cleaner to write all bits even if not all are used
+lint_off -rule UNUSED -file "*/rtl/ibex_if_stage.sv" -match "*'fetch_addr_n'[0]*"

 // Bits of signal are not used: alu_adder_ext_i[0]
 // Bottom bit is round, not needed
@@ -49,7 +49,6 @@ lint_off -rule UNUSED -file "*/rtl/ibex_multdiv_fast.sv" -match "*'alu_adder_ext
 lint_off -rule UNUSED -file "*/rtl/ibex_multdiv_fast.sv" -match "*mac_res_ext*"
 lint_off -rule UNUSED -file "*/rtl/ibex_multdiv_fast.sv" -match "*mult1_res*"

-
 // Bits of signal are not used: res_adder_h[32]
 // cleaner to write all bits even if not all are used
 lint_off -rule UNUSED -file "*/rtl/ibex_multdiv_fast.sv" -match "*'res_adder_h'[32]*"
