22C:122/55:132 Notes, Lecture 3, Spring 2001

Douglas W. Jones
University of Iowa Department of Computer Science

A Breakneck Review of Finite State Automata
When a digital system contains feedback, the result is a sequential circuit. Fundamental mode sequential circuit design, in which the entire system is viewed in these terms is very difficult and well will avoid discussing this in this course! A fundamental mode sequential circuit designer thinks in the following terms:
```
                          _______________
               ----------|               |
	inputs ----------| combinational |----------- outputs
	       ----------|   circuitry   |------
	           ------|               |---   |
	          |   ---|_______________|   |  |
	          |  |                       |  |
	          |   -----------------------   |
	           -----------------------------
	
```
The RS flipflop presented in the previous lecture was analyzed as a fundamental mode design. Fundamental mode design is used to construct optimal implementations of type JK or type D edge triggered flipflops, but most digital system designers avoid it.
Most work in digital systems is done with clocked sequential circuits. This is a higher level way of analyzing a sequential circuit, and it is only applicable to circuits that change their observable state when a clock edge appears on one of their inputs (either 0 to 1 for a leading-edge triggered clocked sequential circuit, or 1 to 0 for a trailing-edge triggered clocked sequential circuit).
Clocked mode circuits are generally built using an edge-triggered register and a combinational circuit, as illustrated in the following:
```
                          _______________
               ----------|               |
	inputs ----------| combinational |----------- outputs
	       ----------|   circuitry   |------
	           ------|               |---   |
	          |   ---|_______________|   |  |
	          |  |                     __|__|__
	          |  |                    |        | state
	          |  |    clock ----------|>       | register
	          |  |                    |________|
	          |  |                       |  |
	          |   -----------------------   |
	           -----------------------------
	
```
Here, the feedback from the output of the combinational circuitry to the input is stored in a state register. This register is made of one edge-triggered flipflop per bit of feedback, and all the flipflops share the same clock.
The design and analysis of a clocked mode sequential circuit is based on finite state automata theory, so we must review this before we continue with computer architecture.
State diagrams
The behavior of a sequential circuit can be described by a state diagram. In such a diagram, a circle represents the state of the machine, and an arc represents the change of state from one state to the next. Thus, we might diagram the behavior of a very simple system as follows:
```
                  -->--
                /       \
             (A)         (B)
                \       / 
                  --<--
	
```
This system has 2 states, A and B. Whenever a clock pulse arrives, it changes state, so on successive clock pulses, it alternates between states A and B. Note that the arcs on the state diagram must be annotated with direction arrows!
The most trivial thing you can do when you examine a state diagram is count the states. If a state diagram has n states, the implementation of the state diagram will require ciel(log₂(n)) flipflops. Thus, our example with 2 states requires one flipflop.
State Diagrams with Inputs
The above example was for an automata with no inputs. More typically, we are interested in machines that respond to their inputs. In general, if a machine has k binary inputs, these may take on 2^k distinct values. As a result, each state may have as many as 2^k distinct successor states!
The following figure illustrates a 2-state finite state machine with an input that may have 2 values, either x or y:
```
           -<-       -->--       ---
         /   x \   /  y    \   /     \
        (       (A)         (B)       )
         \     /   \    y  /   \ x   /
           ---       --<--       ->-
	
```
This system has 2 states. Whenever a clock pulse arrives, if the input is y, the state will change, but if the input is x, the state will remain unchanged. We can also describe thisd machine with a state table:
```
            input  current | next
                    state  | state
          -----------------|--------
              x       A    |   A
              y       A    |   B
              x       B    |   B
              y       B    |   A
	
```
State Diagrams with Outputs
There are two ways of dealing with the outputs of a finite state machine. One approach is to associate the output with the state. In its simplest form, the output of this type of machine is its state! Moore was the first to clearly define this type of finite state automata, so we describe such machines as Moore machines.
An alternative is to associate the output with both the input and the current state. Mealy was the first to clearly define this alternative, so we describe such machines as Mealy machines. The following figure and state table define the same Mealy machine.
```
           -<-       -->--       ---
         /  x/a\   / y/b   \   /     \
        (       (A)         (B)       )
         \     /   \   y/d /   \x/c  /
           ---       --<--       ->-
	
            input  current | next   output
                    state  | state
          -----------------|---------------
              x       A    |   A      a
              y       A    |   B      b
              x       B    |   B      c
              y       B    |   A      d
	
```
In the state diagram for a Mealy machine, each arc (state transition) is labeled with the input that enables that transition, a slash, and the output that is associated with that transition. The output is always after or under the slash!
It is a theorem of finite state automata theory that any Moore machine can be converted to a Mealy machine, and visa versa. Clearly, if a Moore machine has outputs that depend only on the state, and the machine has n distinct output values, it must have n states! As a result, translation from Mealy to Moore frequently results in adding states! The above machine is equivalent to the following Moore machine:
```
           -<-        -->--        ->-
         /   x \    /  y    \    / x   \
        (       (A1)    --<--(B1)       )
         \     / /a   /   y / /b       / 
           ---       /     /       ---   
         /          / y   /      /     \ 
        (       (A2)-->--    (B2)       )
         \   x / /d \    y  / /c \ x   /
           -<-        --<--        ->-
	
            input  current | next
                    state  | state
          -----------------|-------
              x       A1   |   A1
              y       A1   |   B1     state | output
              x       A2   |   A1    -------|--------
              y       A2   |   B1       A1  |   a
              x       B1   |   B2       A2  |   d
              y       B1   |   A2       B1  |   b
              x       B2   |   B2       B2  |   c
              y       B2   |   A2
	
```
Note that the description of a Moore machine frequently has two tables, one for the state transitions, and one giving the output associated with each state. In the above, the states that were split have retained their old names, but gained a suffix to distinguish the two Moore states resulting from each Mealy state. The output associated with each state in a Moore machine is written inside the circle for that state, under the state name, separated by a slash.

Binary Encodings

The above description of Moore and Mealy machines used symbolic names for inputs and outputs. Clearly, binary codes can be used for each. so, for example, we could code the inputs and outputs used in the above examples as:

	  input | code       output | code
	 -------|------     --------|------
	    x   |   0           a   |  00  
	    y   |   1           b   |  01  
	                        c   |  10  
	                        d   |  11

Similarly, we can encode state names:

	  Mealy |            Moore |     
	  state | code       state | code
	 -------|------     -------|------
	    A   |   0          A1  |  00  
	    B   |   1          A2  |  01  
	                       B1  |  10  
	                       B2  |  11

Having made these state assignments, we can convert the state tables and the output table previously given to give the following:

	                 Mealy

            input  current | next   output
                    state  | state
          -----------------|---------------
              0       0    |   0      00
              1       0    |   1      01
              0       1    |   1      10
              1       1    |   0      11
	

                         Moore

            input  current | next
                    state  | state
          -----------------|-------
              0       00   |   00
              1       00   |   10     state | output
              0       01   |   00    -------|--------
              1       01   |   10       00  |   00
              0       10   |   11       01  |   11
              1       10   |   01       10  |   01
              0       11   |   11       11  |   10
              1       11   |   01

Reduction to Hardware

Given the encoded state table for a Moore or Mealy machine, we can reduce these to a functional digital systems as follows. First, treat the state table and output table (if it's a Moore machine) as the specifications for Boolean functions. Build hardware to implement these functions following the outline given in the notes for the previous lecture.

Then, substitute the hardware you built into one of the following outlines:

                             _______________         ___
                  ----------|               |-------|   |---
           inputs ----------|     Mealy     |-------|   |--- outputs
                  ----------|   next-state  |-------|   |---
                            |   and output  | next  |   |
                            |    function   | state\|   |
           current    ------|               |-------|   |----- 
            state    |   ---|_______________|-------|   |--   |
                     |  |                           |   |  |  |
                     |  |         clock ------------|>__|  |  |
                     |  |                                  |  |
                     |   ----------------------------------   |
                      ----------------------------------------
    
                        ____________  next       ________
                  -----|            | state     |        |
           inputs -----|   Moore    |/ ___      | Moore  |---
                  -----| next-state |-|   |---o-| output |--- outputs
                       |  function  |-|   |-o-|-|function|---
           current  ---|            | |   | | | |________|
            state  |  -|____________| |   | | |
                   | |                |   | | |
                   | |        clock --|>__| | |
                   | |                      | |
                   |  ----------------------  |
                    --------------------------

In comparing the Moore and Mealy machines, note that the number of bits in the next-state output will usually be more for a Moore machine than for an equivalent Mealy machine, but that, where the Moore machine only needs the state register to store the current state, the Mealy machine needs to store bothe the current state and the current output.

This requirement of a register to hold the current output is imposed on Mealy machines to prevent the output from changing except when the clock edge arrives indicating that the machine should advance to the next state. If this were not done, the output of the Mealy machine could change whenever any input to the machine changes, and this is generally undesirable.

Of course, if the state assignment job is done carefully, the Moore machine may have a trivial output function, where the state itself encodes the desired output. The examples given above were deliberately clumsy in this regard!

Bit Slice Design
For both sequential and combinational circuitry, it is always possible to use the design methods described above, but it is sometimes very complex to think in such terms. For example, consider the problem of designing a 16 bit binary counter that simply increments its output modulo 2¹⁶ each time a clock pulse arrives.
It is easy to reduce this design to the following:
```
                         ----------           
	          ______|_______   |
                 |   input N    |  |
                 |              |  |
                 | output N+1   |  |
                 |______________|  |
                        |          /
                        /          |16
                        |16        |
                  ______|_______   |
                 |   16 bit     |  |
	clock -->|   register   |  |
                 |______________|  |
                        |          |     output
                         ------/---o-----
                                16        
	
```
When drawing a circuit with many parallel data lines, it is common to draw one line and then annotate it saying this is one of many by putting a slash through the line with the number of lines indicated. The slash can be thought of as a schematic symbol for a string tied around the bundle of lines.
The trouble with the above design is that the combinational function required has 16 inputs and 16 outputs. As a result, the truth table for this function has 2¹⁶ lines, and writing it out takes an inordinate amount of time and paper!
Fortunately, this function has a very regular structure, so it is possible to simplify the design immensely by exploiting this structure. When we exploit the regularity in an n-bit function by dividing it into n sub-functions operating on one bit each, possibly with additional horizontal interconnections between these 1-bit sub-functions, we call the result a bit-slice realization of the target function.
In the case of the 16 bit increment function, the bit slice design requires sub-functions with the following structure:
```
         slice 15  ...   slice 1   slice 0

            A      ...      A         A
             15              1         0
	     |    C         |    C    |     0
	     |  __ 15       |  __ 1   |  ___|
	   __|_|  |       __|_|  |  __|_|   
	  |  a c|        |  a c| | |  a c|   
	  |  +  |     C  |  +  | | |  +  |
	  |c_s__|      2 |c_s__| | |c_s__|
	 __| |         |__| |    |__| |
	     |              |         |
	    S      ...      S         S
             15              1         0

	
```
In this case, the function of the smaller units is obvious from an elementary school understanding of arithmetic. The connections between the units are called carry signals C₁ to C₁₅, with a carry out signal from each unit to the next more significant unit, and with C₀, the carry in to the least significant unit set to 0. Unit i adds the C_i to D_i to produce a 2-bit sum, delivering the least significant bit of the sum as S_i to the output and the most significant bit of the sum as C_i+1 to the next place.
The bit-slice design for this counter can be completed by adding one bit of the register to each of the bit-slices of the combinational function, giving the following structure for each bit-slice of the original counter:
```
               C
                i
	      -|----------
	     | |          |
	   __|_|          |    in  +  out
	  |  a c|         |   a  c | c  s
	  |  +  |         |  ------|------
	  |c_s__|  ____   |   0  0 | 0  0
	   | |    |    |  |   0  1 | 0  1
	   |  ----|D  Q|--o   1  0 | 0  1
	   |      |    |  |   1  1 | 1  0
	 --|------|>C  |  |
	   |      |____|  |
	   C              S
            i+1            i
	
```
The implementation of this counter stage can be optimized, but that isn't the point of this course! The speed of this counter is limited by the time taken by the combinatorial circuitry to propagate information from the C_i input to the C_i+1 output. This is a problem that has been addressed by the designers of digital systems since the mid 19th century. Charles Babbage, for example, put a considerable amount of effort into what he referred to as carry anticipation mechanisms in his later mechanical difference engines, and today, essentially all high performance computers rely on carry anticipation trees in their arithmetic logic units.
This approach to designing systems that operate on long words is quite common, and it has also been used to manufacture computers. Bit-slice computer systems were built in the late 1970's with one slice of the entire system on each on N circuit boards, for an N-bit computer. By the mid 1970's, bit-slice CPU and ALU chips were commonly available. Nowdays, it is reasonable to use bit-slice design for many components found on VLSI implementations of computers.