Today, virtually all SoCs use scan structures to detect manufacturing defects in the design. A scan chain is built for test and connects the sequential elements of the chip in series. Because there is no combinational logic between scan elements, these chains are prone to hold failures. In addition, in technologies below 90 nm, on-chip variation (OCV) has a large impact on timing slack. Unless timing sign-off is done at multiple corners, hold failures are therefore very likely, especially on hold-critical paths such as scan chains. Such failures can make the chip untestable, and therefore unusable in practice, even if it is fully functional in functional mode. When these failures reach silicon they reduce yield, which means a large economic loss for the design company. We therefore need a robust scan structure that avoids these problems.
In this article, we start with a quick review of the basic timing concepts for latches and flip-flops. Next, we introduce scan chains and the timing closure issues associated with them. We then explain how to combine latches and flip-flops in the scan chain to build a robust scan structure that avoids timing failures in sub-90 nm technologies. Finally, we describe the best solution for meeting timing for each possible combination of sequential elements in the scan chain.
Setup/Hold Timing Overview
Flip-flops and latches are the two basic building blocks of sequential circuits. A flip-flop changes state on the active edge (positive or negative) of the applied clock and holds its output between active edges. A latch, on the other hand, is a level-sensitive device: while its enable signal is at the active level (high or low), it continuously samples its input and passes it to the output. A flip-flop is built as a master-slave configuration of two cascaded latches that are active on opposite levels, so a flip-flop has almost twice the area of a latch.
In order to implement a synchronous design, we need to ensure that the output of the flip-flop/latch is not in a metastable state. This can be ensured by meeting setup and hold check requirements in the design.
For a pair of flip-flops, the hold check is from edge 1 to edge 1 and the setup check is from edge 1 to edge 3 for single-cycle operation (Figure 1). We need to ensure that the data launched by flip-flop 1 is captured by flip-flop 2 at the next active edge. At the same time, we need to make sure that the data launched by flip-flop 1 is not captured by flip-flop 2 on the same active edge.
If the second flip-flop is negative-edge triggered, the setup check is 1-2 (see Figure 2), while the hold check is made against the previous negative edge (see Figure 2). This means that the data launched by flip-flop 1 must not be captured by flip-flop 2 on the preceding falling edge. Such a capture cannot happen in practice unless the clock skew is more than half a cycle.
Therefore, in a positive-positive or negative-negative flip-flop pair, the setup check defaults to one cycle and the hold check defaults to zero cycles, while in a positive-negative or negative-positive flip-flop pair, the setup check defaults to half a cycle and the hold check defaults to half a cycle back. Timing checks on latches come up again below, where we discuss the lockup latch.
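The default checks described above can be tabulated in a few lines. This is only an illustrative sketch: the function name `default_checks` and the `"pos"`/`"neg"` labels are assumptions made for the example, not part of any timing tool.

```python
def default_checks(launch_edge: str, capture_edge: str) -> tuple[float, float]:
    """Return (setup, hold) check distances in clock cycles.

    Distances are measured from the launch edge; a negative hold value
    means the hold check is made against an earlier capture edge.
    """
    edges = {"pos", "neg"}
    if launch_edge not in edges or capture_edge not in edges:
        raise ValueError("edge must be 'pos' or 'neg'")
    if launch_edge == capture_edge:
        # Same-edge pair: one-cycle setup, zero-cycle hold.
        return (1.0, 0.0)
    # Opposite-edge pair: half-cycle setup, hold half a cycle back.
    return (0.5, -0.5)


if __name__ == "__main__":
    for pair in [("pos", "pos"), ("neg", "neg"), ("pos", "neg"), ("neg", "pos")]:
        print(pair, default_checks(*pair))
```

The zero-cycle hold check of the same-edge pairs is the one that becomes critical in scan chains, because nothing but wire delay separates the two flip-flops.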
Scan chains are used to test the SoC. All registers in the design are connected in series, the tester provides stimulus from outside the chip, and the outputs of the chains are read out to check for stuck-at and transition faults. Today's SoCs are very complex and have multiple clock domains on a single chip. Scan stitching is done after logic synthesis, and care should generally be taken to stitch flip-flops with the same clock structure into the same scan chain. However, since the number of scan input/output ports available at the top level is limited, mixing registers from different clock domains is unavoidable. Leaving the scan chains unbalanced in length is not a good alternative either, since it increases the overall test time.
This structure can lead to timing closure problems in later design stages. Because scan shift is done at low frequency and there is minimal logic, if any, between flip-flop pairs, setup closure is not a problem. However, these paths are hold-critical precisely because there is so little delay between the flip-flop pairs. As discussed above, because flip-flops from different domains are mixed in the scan chain, in many cases there is a large skew between the launch and capture flip-flops. Late in the design cycle, many hold violations can appear due to noise effects, which forces hold buffers into an already stable, timing-closed design and can lead to design failures.
Worse, our derating margin may not be sufficient, and we may only find the hold failures on silicon. This can happen if the uncommon clock path is very long and the actual variation on silicon is higher than expected. As we move further into sub-90 nm CMOS technology, variation effects become more and more dominant and result in many hold violations on silicon. Hold failures in the scan shift path can have serious consequences. Several debug iterations and many hours are needed to identify the faulty chains on silicon. The situation gets worse when compression logic is used for scan. Even once a faulty chain has been identified, it has to be masked, which reduces test coverage.
In conclusion, the risk of hold failures in the scan chain is high, and a sufficiently robust design must be implemented to handle these uncertainties.
There are various workarounds, such as reordering the scan chain or restitching it according to the placement of the registers. Although these techniques are readily available, designers must explore them carefully, and as discussed earlier, it is unavoidable that the scan chain crosses between clock domains.
A more efficient way to deal with the problem is to act ahead of time, during the logic synthesis stage in which the scan chain is built. All flip-flops driven from the same clock gating logic should be stitched together, and a lockup latch can be inserted at the end of each such group to avoid any hold failure between the last flip-flop of this domain and the first flip-flop of the next clock domain.
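The grouping idea above can be sketched as follows. This is a simplified illustration, not how any particular DFT tool stitches chains: the function `stitch_with_lockups`, the flip-flop names, and the use of a plain clock name as a stand-in for "same clock gating logic" are all assumptions made for the example.

```python
from itertools import groupby


def stitch_with_lockups(flops):
    """Order scan flops by clock group and mark where a lockup latch goes.

    flops: list of (flop_name, clock_name) tuples.
    Returns a list of element names; a lockup latch marker is placed after
    the last flop of each group except the final one, and is tagged with
    the clock of the group it terminates.
    """
    # sorted() is stable, so the original order inside each group is kept.
    ordered = sorted(flops, key=lambda f: f[1])
    groups = [list(g) for _, g in groupby(ordered, key=lambda f: f[1])]

    chain = []
    for i, group in enumerate(groups):
        chain.extend(name for name, _ in group)
        if i < len(groups) - 1:
            group_clock = group[-1][1]
            chain.append(f"lockup_latch@{group_clock}")
    return chain


if __name__ == "__main__":
    flops = [("f1", "clkA"), ("f2", "clkB"), ("f3", "clkA"), ("f4", "clkB")]
    print(stitch_with_lockups(flops))
```

A real flow would also have to balance chain lengths and respect placement, which this sketch ignores.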
The example shown in Figure 3 will help us understand this concept.
If the clock period is 50 ns and the skew is 5 ns, we must later insert hold buffers between flip-flop 3 and flip-flop 4 that cover more than 5 ns, including the derating margin. As discussed earlier, with OCV in sub-90 nm designs, our standard derating may become insufficient when the uncommon clock path exceeds certain limits. For example, for a capture path with 10 additional clock buffers, an extra skew of only 5 ps per clock buffer (over and above the derating) adds up to 50 ps of additional skew. With the OCV factor, the skew may therefore exceed 5 ns, and the inserted margin may not be sufficient.
The solution to this problem is to insert a lockup latch at the output of flip-flop 3, making sure that the lockup latch has the same clock latency as flip-flop 3.
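The arithmetic in this example can be written out as a small hold-slack calculation. The 5 ns skew, the 10 buffers and the 5 ps per buffer come from the text; the clock-to-Q delay and hold time are hypothetical values chosen only to make the sketch concrete, and the slack formula is the usual simplified one that ignores derating.

```python
def hold_slack_ps(clk_to_q_ps, data_delay_ps, skew_ps, hold_ps):
    """Simplified zero-cycle hold slack in picoseconds.

    skew_ps is how much later the capture clock arrives than the launch
    clock; a positive skew makes the hold check harder to meet.
    """
    return clk_to_q_ps + data_delay_ps - skew_ps - hold_ps


def extra_skew_ps(n_buffers, per_buffer_ps):
    """Additional skew from unmodeled variation on extra clock buffers."""
    return n_buffers * per_buffer_ps


SKEW_PS = 5_000      # 5 ns skew from the example
CLK_TO_Q_PS = 200    # hypothetical flip-flop clock-to-Q delay
HOLD_PS = 50         # hypothetical flip-flop hold time

nominal = hold_slack_ps(CLK_TO_Q_PS, 0, SKEW_PS, HOLD_PS)
extra = extra_skew_ps(10, 5)
with_variation = hold_slack_ps(CLK_TO_Q_PS, 0, SKEW_PS + extra, HOLD_PS)

if __name__ == "__main__":
    print("nominal hold slack (ps):", nominal)
    print("extra skew (ps):", extra)
    print("hold slack with extra skew (ps):", with_variation)
```

Buffers sized to fix only the nominal violation would leave the path failing by the extra skew, which is the risk the paragraph describes.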
[Figure 4 annotations: lockup latch and clock gating are labeled; flop 3 to lockup latch is a zero-cycle hold check that is easy to meet; data still shifts from flop 3 to flop 4 in one shift cycle; the hold check into flop 4 is now half a cycle back and much more relaxed.]
As shown in the waveform above (Figure 4), when we insert the latch between flip-flop 3 and flip-flop 4, our timing path will be divided into two stages.
1. From flip-flop 3 to lockup latch
The hold check is 1-1. It is still a zero-cycle check, but since there is no skew, it is very easy to meet. The default setup check is 1-2.
2. From lockup latch to flip-flop 4
The hold check is 2-1. This is the main advantage of, and the motivation for, inserting a lockup latch. The hold check is now half a cycle back, so even if the clock skew is as large as half a shift clock cycle, we still have enough margin. This ensures that there will not be any hold violation in this case.
The setup check is 2-3. The latch is transparent during the 2-3 period, so any data arriving in this phase is passed on to flip-flop 4 as long as it arrives before edge 3 (minus the setup time of the flip-flop). The setup check from flip-flop 3 to the lockup latch is also easy to meet: 1-2 is the default check, but because the latch is transparent for the entire half cycle, in the ideal case the setup check can be shifted towards edge 3. (This concept is called latch time borrowing.)
Another important point is that the lockup latch should be driven by the clock of the launch flip-flop, not by the clock of the capture flip-flop. As we saw above, the hold check from flip-flop 3 to the latch is still 1-1 (a zero-cycle check). If the lockup latch were on the capture flip-flop's clock, the skew would simply move into this zero-cycle check and we would gain nothing. The ideal solution is therefore to have both the launch flip-flop and the lockup latch driven by the same clock buffer in the clock tree.
The above example illustrates that lockup latches can effectively fix hold violations in the scan shift path. One might ask whether we could also fix these violations by inserting hold buffers or delay cells. However, a quick comparison of hold buffers, delay cells and latches shows that hold buffers are suitable for fixing small hold violations, but if the violation is large, latches have the advantage over buffers in both area and delay. With delay cells, there is always a large risk of delay variation across operating conditions, so these cells should be used selectively and wisely. A latch, on the other hand, always provides a half-cycle delay under any operating condition.
In this last section, we consider the various cases to find the most suitable way of fixing hold failures when there is a large clock skew between the launch and capture flip-flops of the scan chain.
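The two stages described above can be compared with the direct flip-flop 3 to flip-flop 4 path using a simplified model. The 50 ns period and 5 ns skew are taken from the earlier example; the cell delays and hold times are hypothetical, and the function names are assumptions made for this sketch. The model assumes a positive-edge launch flip-flop and a negative-level latch on the launch clock, so data leaves the latch half a cycle after the launch edge.

```python
def direct_hold_slack_ps(clk_to_q_ps, skew_ps, hold_ps):
    """Zero-cycle hold slack from launch flop straight to capture flop."""
    return clk_to_q_ps - skew_ps - hold_ps


def lockup_hold_slacks_ps(period_ps, clk_to_q_ps, latch_delay_ps,
                          skew_ps, flop_hold_ps, latch_hold_ps):
    """Hold slacks (stage1, stage2) with a lockup latch on the launch clock.

    stage1: launch flop -> latch, zero-cycle check with no skew.
    stage2: latch -> capture flop, hold check half a cycle back, so the
            latch output only changes half a period after the launch edge.
    """
    stage1 = clk_to_q_ps - latch_hold_ps
    stage2 = period_ps // 2 + latch_delay_ps - skew_ps - flop_hold_ps
    return stage1, stage2


PERIOD_PS = 50_000   # 50 ns shift clock from the example
SKEW_PS = 5_000      # 5 ns skew from the example
CLK_TO_Q_PS = 200    # hypothetical
LATCH_DELAY_PS = 150 # hypothetical latch enable-to-Q delay
HOLD_PS = 50         # hypothetical hold time for both cells

if __name__ == "__main__":
    print("direct:", direct_hold_slack_ps(CLK_TO_Q_PS, SKEW_PS, HOLD_PS))
    print("with lockup latch:",
          lockup_hold_slacks_ps(PERIOD_PS, CLK_TO_Q_PS, LATCH_DELAY_PS,
                                SKEW_PS, HOLD_PS, HOLD_PS))
```

In this model the second stage stays positive until the skew approaches half the shift period, which matches the half-cycle margin described in the text.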
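To get a feel for why buffers scale poorly for large violations, the number of hold buffers needed can be estimated with a one-line calculation. The per-buffer delay used here is a hypothetical figure, and real libraries, derating and placement would change the count; the point is only that the count grows with the violation while a single lockup latch does not.

```python
def hold_buffers_needed(violation_ps, buffer_delay_ps):
    """Number of identical hold buffers needed to cover a hold violation.

    violation_ps is the magnitude of the negative hold slack; the result
    is the ceiling of violation / per-buffer delay, computed in integers.
    """
    if buffer_delay_ps <= 0:
        raise ValueError("buffer delay must be positive")
    if violation_ps <= 0:
        return 0
    return -(-violation_ps // buffer_delay_ps)


if __name__ == "__main__":
    BUFFER_DELAY_PS = 100  # hypothetical delay of one hold buffer
    for violation in (80, 500, 4850):
        print(violation, "ps ->", hold_buffers_needed(violation, BUFFER_DELAY_PS),
              "buffers, versus one lockup latch")
```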
Case 1: Between positive-positive edge-triggered flip-flops
This is the case covered in the example above; a negative-level latch can be used.
Case 2: Between negative-negative edge-triggered flip-flops
By the same analysis as above, a positive-level latch can be used.
Case 3: Between negative-positive edge-triggered flip-flops
As we saw earlier, the hold check here is already half a cycle back and easy to meet. No lockup element is required.
Case 4: Between positive-negative edge-triggered flip-flops
This is a very interesting case. From a timing point of view it does not pose a problem, but in scan shift it is an illegal connection. In ATPG the clock is treated as a return-to-zero waveform (the clock goes back low after the shift is complete). If we allow this crossing, then after the last shift pulse the negative-edge flip-flop captures the value that the positive-edge flip-flop has just loaded, so every such positive-negative pair ends up holding the same value. This reduces test coverage, because not all flip-flops are independently controllable. The crossing should be avoided during stitching, but sometimes it cannot be avoided because of compression logic or hard macros.
We can insert a positive-level lockup latch between the positive-edge and negative-edge flip-flops. This solves the ATPG problem, but it introduces a timing problem, because the hold checks from the flip-flop to the lockup latch and from the lockup latch to the negative-edge flip-flop both become zero-cycle checks again.
Another solution is to insert a dummy flip-flop, which can be either positive-edge or negative-edge triggered, between those flip-flops. Note that after shifting, the dummy flip-flop will still hold the same value as either the first or the second flip-flop, depending on whether it is positive-edge or negative-edge triggered. This does not cause any problem, because the dummy is not a functional flip-flop and is never used to capture data. If we insert a positive-edge dummy flip-flop, the clock latency of the launch flip-flop and the dummy flip-flop should be the same, since that path is a zero-cycle hold check, while the path from the dummy flip-flop to the next flip-flop is a half-cycle hold check. Similarly, if we insert a negative-edge dummy flip-flop, the clock latency of the capture flip-flop and the dummy flip-flop should be the same.
These are the four flip-flop combinations that can exist in a design, but sometimes they are not obvious. For example, special attention is required when stitching designs that contain pre-stitched hard macros. In many cases we do not have the netlist, SPEF or timing constraints for a hard macro, so we recommend inserting lockup latches before these hard macros in case the owner of the macro has not done so. Another example is burn-in mode, where the scan chains in a design are concatenated so that all flip-flops toggle at the same time. There is then a possibility that the path from the last element of one chain to the first element of the next chain is timing critical or forms an illegal positive-negative crossing. Ideally this should be handled in the RTL itself, since the designer has the best understanding of the order of the scan elements when connecting the chains together. If it has not been taken into account, the best practice is to insert the appropriate lockup latch at the end of each chain.
By applying the above tips and guidelines, designers can implement robust scan structures on their chips. In the event of a setup failure, the design can still be operated at a lower frequency, but in the presence of any significant hold failure the behavior of the logic is unpredictable. A hold failure in scan shift is very serious, because it greatly reduces the coverage achieved during test. We therefore need a robust scan structure that addresses the potential scan shift failures discussed above. An appropriate lockup element solves these problems well, because it guarantees a half-cycle delay under any operating condition.
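The four cases can be summarized as a simple lookup. This is only a restatement of the recommendations in the text; the table name, function name and edge labels are assumptions made for the example.

```python
# Recommended element between a launch and a capture scan flip-flop,
# keyed by (launch edge, capture edge), as described in Cases 1 to 4.
LOCKUP_CHOICE = {
    ("pos", "pos"): "negative-level lockup latch",
    ("neg", "neg"): "positive-level lockup latch",
    ("neg", "pos"): "none required",
    ("pos", "neg"): "dummy flip-flop (avoid this crossing where possible)",
}


def lockup_element(launch_edge: str, capture_edge: str) -> str:
    """Return the recommended element for a launch/capture edge pair."""
    try:
        return LOCKUP_CHOICE[(launch_edge, capture_edge)]
    except KeyError:
        raise ValueError("edges must be 'pos' or 'neg'") from None


if __name__ == "__main__":
    for pair in LOCKUP_CHOICE:
        print(pair, "->", lockup_element(*pair))
```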