# Improving Test Effectiveness of Scan-Based BIST by Scan Chain Partitioning

Dong Xiang, Senior Member, IEEE, Ming-Jing Chen, Jia-Guang Sun, and Hideo Fujiwara, Fellow, IEEE

Abstract—Test effectiveness of a test-per-scan built-in self-test (BIST) scheme is highly dependent on the length and number of scan chains. Fewer cycles are used to capture test responses when the length of the scan chains increases and the total number of clock cycles is fixed. Another important feature of the test-per-scan BIST scheme is that test responses of the circuit at the inputs of the scan flip-flops are not observable during the shift cycles. A new scan architecture is proposed to make a scan-based circuit more observable. The scan chain is partitioned into multiple segments. Multiple capture cycles are inserted to receive test responses during the shift cycles compared to the test-per-scan test scheme. Unlike other BIST schemes using multiple capture cycles after the shift cycles, our method inserts multiple capture cycles inside the shift cycles, but not after the shift cycles. Unlike the previous method that drives multiple scan segments by a single scan-in signal, the proposed method uses a new architecture to control all scan segments by different signals. Sufficient experimental results are presented to demonstrate the effectiveness of the method.

Index Terms—Scan-based built-in self-test (BIST), scan chain partitioning, test-per-clock, test-per-scan.

# I. INTRODUCTION

**S** CAN-BASED built-in self-test (BIST) has attracted much attention for several decades [1]–[3]. Scan-based BIST can be simply classified into two types: test-per-scan and test-per-clock [1]. In test-per-clock BIST, a test vector is applied and its test responses are captured and compressed at every clock cycle. This test scheme usually needs fewer test vectors to reach the same fault coverage compared with the test-per-scan BIST. However, it is required that all scan chains should capture test responses like a multiple input signature register (MISR). That is, an XOR gate is inserted for each scan flip-flop. Furthermore, the XOR gates are placed on the functional paths. The area overhead and timing overhead caused by this test scheme is unacceptable in most cases. The BILBO [17] is an example of the test-per-clock BIST scheme, and the circular BIST is another example.

M.-J. Chen is with Institute of Microelectronics, Tsinghua University, Beijing 100084, China.

H. Fujiwara is with the Graduate School of Information Science, Nara Institute of Science and Technology, Ikoma, Nara 630-0101, Japan.

Digital Object Identifier 10.1109/TCAD.2005.847943

In test-per-scan BIST scheme, a test vector is first shifted into the scan chains, and a functional cycle is adopted to capture test responses after the shift cycles. The test responses captured in the scan flip-flops are shifted out when the next test vector is scanned in. The test-application time of the test-perscan scheme is more than that of the test-per-clock scheme. The input signals of all scan flip-flops are not observed during the shift cycles in case of the test-per-scan BIST scheme. Partial scan BIST (PSBIST) [19] combines partial scan design with pseudorandom testing, which is a tradeoff between both BIST schemes. Most of the current test-per-scan BIST schemes are based on the stumps parallel scan architecture [2].

Test length of scan-based BIST is usually determined by the hard-to-test faults. Test length reduction of the hard-to-test faults is an important issue. Various techniques are used to handling the problem.

- A combination of the deterministic test patterns of the hard-to-test faults with random patterns [14], where all deterministic tests are stored in the system before testing. Test data compression is an important issue for these techniques.
- Weighted test pattern generation [13], [20] that applies weighted tests to the primary inputs. However, only application of weighted patterns to primary inputs may not solve the problem completely.
- 3) Test point insertion that can make a lot of random resistant faults testable [7], [19], [23]–[25].

Recently, Tsai et al. [25] proposed a novel BIST scheme, which selected a small number of nonscan flip-flops in the circuit. This scheme can make some scan flip-flops with bad observability more observable in all clock cycles, and therefore, enhances the test effectiveness of scan-based BIST. The method also inserts multiple capture cycles after the shift cycles within a test cycle in order to improve fault coverage. An improved method of the above paper was presented recently in [12], which can increase the proportion of at-speed test and enhance the test effectiveness of scan-based BIST further. A broadcast BIST scheme was proposed by Kim and Vinnakota [16], which can transfer test data to multiple scan flip-flops in one cycle through a single broadcast line. The Illinois scan architecture [9] partitions a scan chain into multiple scan segments, where all segments are driven by the same scan-in signal. The scheme can effectively decrease test-application cost. However, all scan segments may receive interdependent test signals, which may cause nontrivial degradation on performance of test generation if the scan flip-flops are not grouped properly. Lee et al. [18] presented a novel test-application scheme by supporting multiple circuits with a single scan-in signal in a core-based system, which can

Manuscript received April 14, 2003; revised September 25, 2003, January 5, 2004, and May 6, 2004. This work was supported in part by the National 973 Project of China under Grant 2004BC719400, in part by the JSPS under Grant L03540, and in part by the National Science Foundation of China under Grant 60373009 and Grant 60425203. This paper was recommended by Associate Editor K. Chakrabarty.

D. Xiang and J.-G. Sun are with the School of Software, Tsinghua University, Beijing 100084, China (e-mail: dxiang@tsinghua.edu.cn).

reduce test-application time drastically without any degradation on test efficiency.

The number of scan chains and their length influence the test effectiveness of BIST greatly. A well designed phase shifter (PS) can break the signal interdependence effectively when the number of scan chains is very large. This scheme may make the size of the MISR very large when the number of scan chains is too large. Observability of the circuit is likely too poor if the length of the scan chains is too large. It is necessary to make length of the scan chains well-controlled. Kim, Ha, and Tront [15] proposed the novel idea of using signature analyzers to generate tests and analyze test responses simultaneously. However, all XOR gates related to the signature analyzer must be inserted on the functional paths as it is in a BILBO [17].

Test point insertion [7], [19], [23]–[25] is another effective technique to improve testability of hard-to-test faults. Cheng and Lin [7] proposed a timing-driven test point insertion method using the testability gain function estimation of Lisanke et al. [20], which avoids inserting test points into the critical paths. Tamaramalli and Rajski [23] proposed a multiple phase test point insertion method by effectively using the updated testability information of the circuit after some test points have been inserted. PSBIST [19] proposed a combined method by inserting test points into the partially scanned circuits. New testability estimation method was proposed in [19] for circuits with only self cycles. However, all the above methods connected extra pins of the control test points with the PRTG or the outputs of the PS, which may cause much area overhead. It may not be so easy to assign sensitization values to the extra pins of the control points when the circuit is set as the operational mode if extra pins of the control points are connected with separate stages of the PS directly [7], [19], [23], [25]. The previous test point insertion methods [7], [19], [23], [25] also connected the observation points with the signature analyzer, which makes the size of an MISR larger. Touba and McCluskey [22] proposed a test point insertion method for BIST based on a path tracing scheme, where extra inputs of the control points were connected with the primary inputs.

This paper presents a new scan architecture for scan-based BIST. Our method partitions a scan chain into multiple segments by inserting an XOR gate between two adjacent scan segments, where all these XOR gates are not inserted on the functional paths. Therefore, unlike the BILBO [17] and the technique proposed by Kim *et al.* [15], the XOR gates do not incur any timing overhead. Test responses of each scan segment can be observed through an XOR tree. Therefore, multiple capture cycles are inserted during one test cycle, which usually improves the test effectiveness greatly. This scheme is completely different from the methods that insert multiple capture cycles after the shift cycles.

In the rest of the paper, preliminaries of the paper are presented in Section II. The new scan architecture is presented in Section III. A new testability estimation scheme for the proposed scan architecture is introduced in Section IV. Selection of the number of scan chains and the number of scan segments for each scan chain is presented in Section V. A new test point structure is presented in Section VI. Test point selection and connection with primary inputs and pseudoprimary inputs (PPI) are also presented in Section VI. Sufficient experimental results are presented in Section VII. This paper is concluded in Section VIII.

## **II. PRELIMINARIES**

First, we present several necessary definitions. A scan cycle is the period in which a test pattern is shifted into (or test responses are shifted out of) the scan chains. The length of a scan cycle (the number of clock cycles) is equal to the number of scan flip-flops in the longest scan chain. A capture cycle is the period between two adjacent scan cycles. The circuit is set to the normal mode during the period when the test pattern is applied to the circuit and the test responses are captured in the scan flip-flops. Usually, a capture cycle is a single clock cycle, however, the test schemes proposed in [12] and [25] apply multiple capture cycles. That is, the circuit is set to the normal mode for multiple clock cycles after the shift cycles. A *test cycle* consists of a scan cycle followed by a capture cycle.

The *i*-controllability  $C_i(l)$   $(i \in \{0, 1\})$  of a node l is defined as the probability of justifying i at node l by a randomly selected input vector. The observability O(l) is defined as the probability of propagating the value of l by a randomly selected input vector to a primary output. The detection probability pd(l/i) of fault l/i is defined as the probability to detect the fault l/i (lstuck-at  $i, i \in \{0, 1\})$  by a randomly selected input vector. The detectability  $D_n(l/i)$  is defined as the probability for the fault l/i to be detected by the first n randomly selected input vectors.

A number of the previous scan-based methods [7], [19], [23]–[25] used the stumps [2] scan-based BIST architecture as shown in Fig. 1. The outputs of the pseudorandom test generator (PRTG) are connected with a PS. Each of the extra pins of control test points is directly connected with one bit of the outputs of the PS. Therefore, the number of control test points cannot be large enough. This may make the method unable to get good enough testability [5], [6] in many cases. Each scan chain is also connected with one bit of the PS. Each output of the scan chains is connected with one bit of the MISR. The observation test points are also connected with inputs of the MISR.

A new scan-based architecture is presented in Fig. 2, where the control points are connected with the outputs of the PRTG or PPIs. Connection of a control point with an output of the PRTG or a PPI generates no new reconvergent fanouts in the combinational part of the circuit. Furthermore, more than one control test points can be connected with the same output of the PRTG like the techniques in [26] and [28]. Similarly, the extra inputs of control points can also be connected with any PPIs, that is, outputs of scan flip-flops. The structure of the control test points are similar to those in [26], [28]. The number of scan chains is still the same as that in Fig. 1. Each scan chain is partitioned into multiple scan segments, and an XOR gate is inserted between two adjacent scan segments. Outputs of all scan segments are connected to an exclusive-or tree. Our method does not insert any other extra observation points to improve testability. However, the extra connections from the outputs of the scan segments with the XOR gates are similar to extra observation points, which can effectively improve observability of the circuit. As shown in Fig. 2, each scan chain is partitioned into a



Fig. 1. Previous scan-based BIST architecture.



Fig. 2. General scan architecture of the proposed scan-based BIST method.

couple of segments. A capture cycle follows a number of shift cycles, whose number is equal to the number of scan flip-flops in the longest scan segment. The proposed scan-based BIST architecture can capture test responses much more frequently, and therefore, can substantially reduce test-application cost.

It should be noted that no PS is utilized in the new scan architecture as shown in Fig. 2. The proposed scan architecture can be thought of as a PS with a new test generation scheme. The first scan segment of each scan chain receives test signals from the PRTG directly, and the scan segments after the first one get test signals by an exclusive-or of the same signal and the test response captured at the preceding scan segment for the previous test. It will be shown that properly grouping of scan flip-flops can make test signals feed to all scan segments independent.

#### **III. NEW SCAN ARCHITECTURE**

The test effectiveness of scan-based BIST can be improved by using scan chain partitioning. However, it is necessary to avoid signal correlations among different scan segments. A scan chain partitioning scheme and a scan flip-flop grouping scheme are presented in Sections III-A and III-B, respectively.

## A. Scan Chain Partitioning

A scan chain partitioning scheme is introduced to optimize the number and the length of scan chains. As shown in Fig. 3, each of the scan chains is partitioned into multiple segments, where an XOR gate is inserted between two adjacent segments. One input of the XOR gate is connected with the scan-in signal of the scan chain, and the other is connected with the output of the preceding segment. Let the length of a scan segment be n. The input signal of each scan segment except the first one is the exclusive-or of the value of the scan-in signal and the value of the output of the previous scan segment. Outputs of all scan segments are connected with an XOR tree, whose output is connected with one stage of the MISR.

The test scheme is also changed as follows: 1) all scan flipflops are set to the test mode, 2) n shift cycles are to load test signals, where n is the length of the scan segments; and 3) the scan flip-flops are all set to the functional mode to receive the test responses. This is quite different from the test scheme of the original scan architecture that inserts one capture cycle after N shift cycles (N is the length of the original scan chains). Therefore, the circuit becomes more observable, and more test responses



Fig. 3. Multiple XOR trees for multiple scan-segment scan chain. (a) Scan segment partitioning with multiple XOR trees. (b) Scan segment partitioning with one XOR tree.



Fig. 4. Test scheme of the partitioned scan chain: (a) Normal scan-based BIST scheme, (b) multiple capture cycles after the scan-in cycles, and (c) partitioned scan chain test scheme.

are received. The stage number of the PRTG and the size of the MISR are the same as the original ones. This technique can improve the test effectiveness of the scan-based BIST scheme in most cases.

Multiple XOR trees can be used if a scan chain is partitioned into many segments because the connection of the outputs of many scan segments to a single XOR tree can cause aliased faults. As shown in Fig. 3(a), a scan chain is partitioned into multiple scan segments, the outputs of which are connected with two XOR trees. Fig. 3(b) presents the structure to connect the outputs of all scan segments with a single XOR tree. The multiple XOR tree scheme requires a larger size MISR. A single XOR tree is enough for all benchmarks we considered in our experiments. The scan architecture presented above may need a number of extra connections that introduce extra routing overhead. However, the extra routing overhead are within reasonable range in most cases.

As presented in Fig. 4(a), the normal scan-based BIST scheme inserts one capture cycle after a number of shift cycles. After the test vector is shifted into the scan chains, all scan flip-flops turn to the functional mode, where all scan flip-flops receive the test responses from the combinational part of the circuit. The number of shift cycles is equal to the length of the longest scan chain based on the *stumps* parallel scan archi-

tecture. Tsai *et al.* [25] and Huang *et al.* [12] proposed novel testing schemes by adding multiple capture cycles as shown in Fig. 4(b) after the shift cycles, which can improve the test effectiveness in most cases. Fig. 4(c) presents the proposed test scheme, which partition a scan cycle into a number of scan cycles. Each scan cycle of the proposed test scheme needs a small number of shift cycles (the number of shift cycles should be less than the length of the scan chains), followed a capture cycle. The test scheme is very likely to improve the test effectiveness.

## B. Scan Flip-Flop Grouping

As stated above, a scan-in signal drives a number of scan segments by an exclusive-or of the scan-in signal and the test responses captured in the preceding scan segment. This may produce some extra signal correlation and some additional redundant faults in some cases if the scan flip-flops are not grouped properly. Outputs of all scan segments are connected with an XOR tree, which may cause some aliased faults. The scan flip-flop grouping scheme also influences testability of the circuit. Our method tries to group scan flip-flops to avoid the aforementioned negative impacts while driving multiple scan segments with the same scan-in signal and connecting outputs of multiple scan segments to the same XOR tree.



Fig. 5. Scan flip-flop grouping.

As shown in Fig. 5, let a scan chain be partitioned into k segments:  $(v_{1,1}, v_{1,2}, \ldots, v_{1,n}), (v_{2,1}, v_{2,2}, \ldots, v_{2,n}), \ldots,$  $(v_{k,1}, v_{k,2}, \ldots, v_{k,n})$ , where each scan flip-flop  $v_{i,j}$  is the *i*th scan flip-flop of the *i*th scan segment. Considering the partitioned scan chain as shown in Fig. 5, we have the following scan flip-flop groups:  $\{v_{1,1}, v_{2,1}, \ldots, v_{k,1}\},\$  $\{v_{1,2}, v_{2,2}, \dots, v_{k,2}\}, \dots, \{v_{1,n}, v_{2,n}, \dots, v_{k,n}\}.$  The following conditions should be met to group scan flip-flops properly: 1) All scan flip-flops in each of the above groups do not converge in the combinational part of the circuit and 2) all scan flip-flops in each of the above groups do not have any common predecessor in the combinational part of the circuit. Merging two scan flip-flops does not generate any new reconvergent fanout if they do not have any common successor in the combinational part of the circuit. Therefore, the test data fed to the scan segments of the same scan chain should have no correlation. This mainly benefits from the single stuck-at fault model that we are considering in our paper. It is also good for other fault models, such as, transition faults, path delay faults and multiple stuck-at faults. Any pair of scan flip-flops in each group do not have any common predecessor in the combinational part of the circuit in order to avoid aliasing. The test responses lose no information because of the XOR trees if all scan flip-flops in each of the above scan flip-flop groups do not have any common predecessors. Therefore, no degradation on fault coverage generates if all the scan flip-flop groups meet the above conditions.

In order to group the scan flip-flops properly, two tables C[i, j] and P[i, j] are prepared first. Here, C[i, j] = 1 represents that scan flip-flops i and j have a common successor in the combinational part of the circuit, and P[i, j] = 1 stands for the fact that scan flip-flops i and j have a common predecessor in the combinational part of the circuit. It is not difficult to obtain these tables. Scan flip-flop grouping is completed based on these two tables.

Let us consider scan flip-flop grouping for a scan chain with k scan segments. Scan flip-flops are put into the first scan segment without any constraint. Scan flip-flops that converge or have common predecessors in the combinational part can be put into a scan segment if necessary. When constructing other scan segments, the two conditions given earlier in this section should

be satisfied. Further scan flip-flop grouping details can be found in [27] and the related work.

As for circuits with enough scan flip-flops, it is not difficult to form the scan flip-flop groups with the above properties when the number of scan segments is not very large. The scan flip-flops grouped as stated above do not cause any signal correlations even though the scan segments are driven by the same signal directly. Our method tries to meet the first condition for the scan flip-flops close to the input of the scan segments if it is impossible to satisfy both conditions simultaneously. The reason is that the signals assigned to the scan flip-flops close to the outputs of the scan segments have less impact on the effectiveness of the whole test process. The proposed method also tries to meet the second condition for the scan flip-flops close to the outputs of all scan segments when both conditions cannot be satisfied together. The reason is that fault effects captured at the scan flip-flops close to the input of each scan segment can also be propagated to PPIs of all succeeding scan flip-flops in the same scan segment, making it more likely for the fault effects captured at those scan flip-flops to be observed through the combinational part of the circuit.

Another scheme to deal with the problem when both conditions cannot be satisfied is to use multiple XOR trees as shown in Fig. 3(a). This can make it easy to satisfy the second condition.

# IV. TESTABILITY ESTIMATION FOR THE PROPOSED SCAN-BASED BIST SCHEME

Initially, values at the PPIs of scan flip-flops are unknown. It is reasonable to set both 0-controllability and 1-controllability of the lines to 0.5 at this point. All nodes in the circuit can be assigned deterministic values after a small number of vectors are applied. As for the conventional *stumps*-based BIST architecture, all nodes can get specified values after only one test cycle.

#### A. Bases for Testability Estimation of the Method

As for two terms  $t_1$  and  $t_2$  of a function, we call  $t_1$  and  $t_2$  exclusive if  $t_1 \cdot t_2 = 0$ .

Lemma 1: Let  $f = t_1 + t_2 + \cdots + t_k$ , and any pair of terms of f be exclusive. We have

$$C_1(f) = C_1(t_1) + C_1(t_2) + \dots + C_1(t_k).$$
(1)

|        |     |      | test cycle | clock cycle<br>j | C 1( PI) | C 1(PPI)<br>the <i>i</i> th sff                                                        | O(PO) | O( PPO)                            |
|--------|-----|------|------------|------------------|----------|----------------------------------------------------------------------------------------|-------|------------------------------------|
|        |     |      | 1          | 0 < j < i        | 0.5      | 0.5                                                                                    | 1.0   | 0.0                                |
|        |     |      |            | (i-1) < j < n    | 0.5      | 0.5                                                                                    | 1.0   | 0.0                                |
| PIs    |     | POs  |            | j = n            | 0.5      | 0.5                                                                                    | 1.0   | O(PPI) in the next clock cycle     |
| C PPIs | CUT |      |            | j = (n+1)        | 0.5      | C 1 (PPO) in<br>the <i>n</i> th cycle                                                  | 1.0   | 1.0                                |
|        |     | PPOs | t ( t>1 )  | 0 < j < i        | 0.5      | C 1 (PPO) of<br>the $(i-j)$ th sff<br>at the capture<br>cycle of $(t-1)$<br>test cycle | 1.0   | 0.0                                |
|        |     |      |            | (i-1) < j < n    | 0.5      | 0.5                                                                                    | 1.0   | 0.0                                |
|        |     |      |            | j = n            | 0.5      | 0.5                                                                                    | 1.0   | O( PPI )in the<br>next clock cycle |
| FFs    |     |      |            | j = (n+1)        | 0.5      | C 1 (PPO) in<br>the <i>n</i> th cycle                                                  | 1.0   | 1.0                                |

Fig. 6. Testability estimation for the proposed scan architecture.

It is clear that any assignment of the circuit cannot set any pair of the terms to value 1 simultaneously.

Let B be the output of the preceding scan segment, A be the scan-in signal, y be the input of the next scan segment. We have  $y = A \oplus B$ . It is clear that y is set as unknown if B is unknown. However, it is uncertain that signals of all PPIs are also completely random as we understood previously in the process of a test cycle. This will be further explained in the rest of this section. Input signals of the first scan segments for all scan chains as shown in Fig. 2 are random, which are driven by the PRTG directly. In the following lemma, we prove that randomness of the input signal of all internal scan segments is very good.

Lemma 2: Input signals of all scan segments are random.

**Proof:** It is clear that first segments are driven by the random signals directly. We would like to prove the 1-controllability of all input signals of the internal scan segments is still 0.5 no matter whether all nodes have been assigned specified values or not. We have  $y = A \oplus B = A \cdot \overline{B} + \overline{A} \cdot B$ . Two terms  $\overline{A} \cdot B$  and  $A \cdot \overline{B}$  represent two exclusive terms under the two-valued system. Signal probability of outputs of scan segments can be thought of as 0.5 if it is unknown at that clock cycle. According to Lemma 1, we have

$$C_1(y) = C_0(A) \cdot C_1(B) + C_1(A) \cdot C_0(B)$$
  
= 0.5 \cdot (C\_1(B) + C\_0(B)) = 0.5 (2)

where the scan-in signal A is random, therefore,  $C_1(A) = C_0(A) = 0.5$ . Randomness of the input signals of the scan segments is always guaranteed.

## B. Testability Estimation

Calculations of the testability measure for the proposed BIST scheme are similar to the scheme introduced by Tsai *et al.* [25], however, there exist some differences. The COP measure [4] is still used as the basic measure to calculate testability of the combinational gates. The proposed test scheme is as follows. Let n be the length of the scan segments. The first n cycles are shift cycles, which are followed by a capture cycle. As shown in Fig. 6, the test-application scheme is partitioned into four different phases, where testability estimation of the first test cycle

and other test cycles is different. Each test cycle is partitioned into: 1) the shift cycles and 2) the capture cycle. The shift cycles of each test cycle are further partitioned into three parts for the *i*th scan flip-flop of each scan segment: 1) the first (i - 1) shift cycles; 2) the shift cycles after the (i - 1)th clock cycle except the *n*th shift cycle; and 3) the *n*th shift cycle.

Let us consider controllability first and initial values at all scan flip-flops be unknown. It is reasonable to set signal probabilities of the PPIs as 0.5 at this point. As for the first test cycle, the *i*th scan flip-flop should be unknown for the first (i - 1) clock cycles.  $C_1(PPI)$  of the scan flip-flop is 0.5 for clock cycles  $(i-1) \le j \le n$ . Certainly,  $C_1(PPI)$  of the *i*th scan flip-flop at the capture cycle of the first test cycle is 1-controllability of its data input at the *n*th clock cycle. Controllability of the PPI for the *i*th scan flip-flop during the first phase of other test cycles is different from the first test cycle. As for other test cycles during the first phase, values of the first (i - 1) flip-flops captured at the previous test cycle are shifted to it sequentially.

Let us consider observability in one test cycle. Observability of each pseudoprimary output (PPO) is the same for all test cycles. Observability of all PPO is 0.0 during the first n - 1shift cycles. As for the last shift cycle, observability of the PPO O(PPO) is the same as the observability of its output at the capture cycle of the same test cycle. It should be noted that observability of the input of a scan flip-flop is the same as the observability of its output for the next cycle because the value at the input of the flip-flop will be used as test in the next clock cycle. Therefore, that value is propagated to its output directly. As for the last phase, that is the (n + 1)th clock cycle, observability of the input of a scan flip-flop is 1.0. The captured value will be shifted out through the scan chain during the shift cycles of the next test cycle. Observability of the primary outputs is 1.0 for all clock cycles.

## V. DETERMINATION OF THE NUMBER OF SCAN CHAINS AND SCAN CHAIN PARTITIONING

We would like to select an optimal number of scan chains and optimal scan segment length in order to present the best testability. One may have the following conjecture: While the number of scan chains be fixed, the more the number of scan segments in a scan chain, the better is the testability of the circuit is. Anyway, it may not be true. There may exist a number of choices for the number of scan chains and the number of scan segments in each scan chain. Different choices make different testability of the circuit. Our method uses techniques to select the number of scan chains, which can also determine the length of the scan chain. We estimate testability in one test cycle using (3)–(5) after all nodes have got specified values. Let n be the length of the scan segments. The selection of the number of scan chains and the number of scan segments is determined by the following testability cost functions:

$$G = \frac{1}{|F|} \cdot \sum_{l/i \in F} \frac{1}{D(l/i)}$$
(3)

$$D(l/i) = \frac{1 - \prod_{j=1}^{j=n+1} \left(1 - pd(l/i, j)\right)}{n+1}$$
(4)

$$pd(l/i,j) = C_{\overline{i}}(l,j) \cdot O(l,j)$$
(5)

where l/i represents the stuck-at  $i(i \in \{0,1\})$  fault at line l, and  $\overline{i}$  is 0 (or 1) if i = 1 (or i = 0). In (5),  $C_{\overline{i}}(l, j)$  and O(l, j)represent  $\overline{i}$ -controllability of line l, and observability of line lfor the *j*th clock cycle, pd(l/i, j) is the probability for the fault l/i to be detected by the *j*th clock cycle of a certain test cycle. And D(l/i) is the average detection probability of fault l/i for each clock cycle. *F* is the hard-to-test fault set that contains the faults, whose detection probability is no more than ten times of that of the hardest fault. It should be noted that our method does not consider redundant faults according to the COP measure [4]. This technique is quite reasonable because test length of a circuit is usually determined by the hard-to-test faults.

It is reasonable that testability of a circuit changes when the number of scan chains and the length of each scan segment change. Observability of a fault can be enhanced when the length of scan segments decreases. Controllability of a line also changes when the number of capture cycles increases. This also makes the number of test vectors change if the number of clock cycles is fixed. Controllability of a PPI is a little complex because a biased vector is applied to the PPI during the capture cycle, and a part of the signals shifted to the PPIs should be the test responses captured after applying the test vector of the previous test cycle. Details can be found in Fig. 6 as stated in Section IV. The target for our method is to select the number of scan chains and the number of scan segments for each scan chain in order to minimize the cost function as presented in (3), which does not incur any new signal correlation.

The procedure as shown in Fig. 7 is applied to select the number of scan chains and the length of scan segments. Our method sets a constraint on the number of scan chains, and the length of scan segments. In this paper, the number of scan chains is limited to no more than 20 (MAX-NCHAIN), and the length of scan segments is set too no less than 10 (MIN-SEG) and no more than 30 (MAX-SEG). The minimal number of segments for each scan chain is no less than MIN-NSEG (2). Our method does not use PS. The parameters MAX-NCHAIN, MIN-SEG, and MAX-SEG are determined empirically. It is found that these parameters work well for all circuits used in the experiments. The number of scan flip-flops contained in each scan segment is no more than half of the scan flip-flops in a scan chain. Fig. 7

MIN\_SEG = 10, MAX\_SEG = 30, MIN\_NSEG = 2, MAX\_NCHAIN = 20 optimal\_seg\_length = MIN\_SEG; optimal\_n\_seg = MIN\_NSEG; current\_gain = gain(optimal\_seg\_length, optimal\_n\_seg); for(seg\_length=MIN\_SEG to MAX\_SEG) for (n\_seg = MIN\_NSEG to MAX\_NSEG) n\_chain = nff/(seg\_length \* n\_seg); if(n\_chain < MAX\_NCHAIN) new\_gain = gain(seg\_length,n\_seg); if (new\_gain < current\_gain) optimal\_seg\_length = seg\_length; optimal\_n\_seg = n\_seg; } ł optimal\_n\_chain = nff/(optimal\_seg\_length \* optimal\_n\_seg);

return optimal\_n\_chain, optimal\_n\_seg, optimal\_seg\_length.

Fig. 7. Procedure to select the number of scan chains and the length of scan segments.

presents the detailed procedure to determine the number of scan chains and the length of scan segments.

The parameter MAX-NSEG can be determined as follows. The maximum number of scan flip-flops (represented by  $n_1$ ) that can be included in the same group without generating any new reconvergent fanouts is found first based on [27]. The sizes of the scan flip-flop groups can be regulated easily in order for them to have similar sizes. The number  $n_1$  is obtained for the regulated scan flip-flop groups. The maximum number of scan flip-flops (represented by  $n_2$ ) that can be included in the same group can be obtained similarly, where all scan flip-flops in the same group do not have any common predecessor in the combinational part of the circuit. These scan flip-flop groups can also be regulated in order for them to have similar sizes. The parameter  $n_2$  is the maximum number of scan flip-flops that are contained in a group. The parameter MAX-NSEG is set as  $\min(n_1, n_2)$ .

Two useful conditions as mentioned in Section III-B are adopted to partition scan segments and configure scan chains. These conditions help to avoid lots of useless testability calculation. In our method, the scan flip-flops in the last scan segment of a scan chain construct a separate scan segment if the number is no less than half the number of the regular scan segments; otherwise, scan flip-flops in different scan chains can be reconfigured as regular scan segments. Scan flip-flops in the last scan segments of the scan chains can be merged into a number of regular scan segments, by fulfilling the two conditions stated earlier in Section III-B.

#### VI. TEST POINT INSERTION FOR SCAN-BASED BIST

Many of the previous scan-based BIST methods inserted control test points into the circuit directly, where extra inputs of the control points are driven by one stage of the PS. This technique can cause problems in operational mode. During the operational



Fig. 8. Test point architecture for scan-based BIST.

mode, it may be difficult to set the control inputs of the control test points as sensitization values. Additionally, each stage of the PS drives only one control point, which can produce nontrivial area overhead. Other methods inserted a constant 1 or a constant 0 into the nodes with control test points during testing. This technique can make a large number of testable faults untestable [26], [28]. New test point structures are presented to handle these problems.

Fig. 8 presents the general test point architecture of the proposed method. Extra pins of test points  $l_1, \ldots, l_i$  are connected with the primary inputs, and extra pins of control test points  $l_{i+1}, \ldots, l_c$  are connected with the PPIs. An AND gate is used as a switching logic for each 1-control test point. Similarly, an OR gate is used as the switching logic for a 0-control test point. The extra input of an 1-control point is connected with the AND gate directly, while the extra input of a 0-control point is also connected with the OR gate directly. All switching gates are connected with a single extra input test that is the same one that controls the scan chains. The circuit is set to the test mode when test = 1, while it is set to the operational mode if test = 0. Therefore, only a single gate delay is added to the functional paths for each control point. As shown in Fig. 8, 1-control test points are inserted into  $l_1$  and  $l_{i+1}$ , and 0-control test points are inserted into nodes  $l_i$  and  $l_c$ . Control points can be inserted away from critical paths if necessary. It should be noted that the proposed test point architecture does not need the extra stages of the PS, and thus, saves some extra area overhead compared with the ones that directly drive the control test points by the PS.

After the number of scan chains and the number of scan segments for each scan chain have been determined as stated in Section VI, our method selects the locations to insert test points. In this paper, only control test points are inserted to improve random testability of the circuit. Test points are selected based on the COP testability measure [4], the testability estimation scheme as presented in Section IV and the gain function as given in (3).

The hardest-to-test fault and faults with detection probability no more than 10 times of it fall into the random resistant fault set F. All faults in F and their immediate successors and predecessors are selected as test point candidates. For each test point candidate, two different classes of control test points are considered. Our method estimates testability improvement potential based on (3) for each test point candidate and two types of test points (1-control and 0-control test points). A selective tracing scheme to update testability measure is proposed to estimate testability improvement potential when inserting a test point. Our method selects the node and type of test point with the greatest testability improvement potential to insert a test point in each round. Test points are inserted one by one based on the above scheme. Testability of the circuit and test point candidates are updated after a test point has been inserted. The details of the selective tracing scheme can be stated as follows:

- Testability of the node *l* with a test point inserted is updated. When a 1-control test point is inserted into the node, its 1-controllability measure is set as 1 − 0.5 · (1 − C<sub>1</sub>(*l*)); when a 0-control test point is inserted into the node *l*, its 0-controllability measure is set as 1 − 0.5 · (1 − C<sub>0</sub>(*l*)). Here, C<sub>1</sub>(*l*) and C<sub>0</sub>(*l*) are the original 1-controllability measure and 0-controllability measure of the node *l*, respectively.
- 2) Controllability of the output of a gate (or a flip-flop) is updated if the controllability of its inputs changes.
- 3) Observability of all other inputs of the gate is updated if controllability of one of its inputs changes.
- 4) Observability of a node is updated if observability of one of its successors changes.

Unlike almost all of the previous methods [7], [19], [23], [25] that connect a control test point with one bit of the PRTG or the

outputs of the PS, our method connects extra pins of the control points with the primary inputs or the outputs of the scan flip-flops. Therefore, control test points do not contribute to the test input number and the number of stages of the PS. The test point architecture used in this paper is similar to the one in [26] as presented in Fig. 8. A new test point architecture in [26] and [28] was proposed for nonscan design for testability, which can obtain fault coverage comparable to scan design. This architecture is modified for scan-based BIST. However, the test point architecture in this paper is different from the one in [26], [28] as follows.

- The test point architecture in [26] and [28] was proposed for the nonscan circuit, where extra pins of control test points are only connected with primary inputs, but extra pins of control test points in this paper can be connected with not only the primary inputs, but also with outputs of scan flip-flops.
- 2) The principle utilized in [26] and [28] to connect one or more test points with a primary input is that no equal weight reconvergent fanout should be generated. However, the principle to connect extra pins with primary inputs or PPIs in this paper is not to generate new reconvergent fanouts in the combinational part of the circuit. That is, there exists no common successor between the node where a control test point is inserted and the original primary input (or the PPI) in the combinational part of the circuit. This is quite easy to implement for a fully scanned circuit.
- 3) Scan flip-flops close to inputs of the scan segments are selected to connect the extra pins of the test points, randomness of signals on the PPIs of those scan flip-flops are easy to be met according to the table presented in Fig. 6.
- Any primary input or PPI is connected with only one extra pin of a control test point. The proposed method inserts only a few control test points.

No primary input or PPI is connected with more than one control test point because the number of control test points is usually far smaller than the number of primary inputs and PPIs. It is quite easy to implement test point insertion as stated above. Control test points connected in the above way never cause any new signal correlation because we are considering the single stuck-at fault model. As for other fault models, such as the transition fault model, path delay faults, etc., the situation is similar. It is also similar for the multiple stuck-at fault model because a multiple stuck-at fault is activated if at least one single stuck-at fault is activated.

## VII. EXPERIMENTAL RESULTS

The proposed scan-based BIST scheme has been implemented. The PROOF's fault simulator is adopted to evaluate the proposed method scan chain partitioning (scp) and previous methods. Some circuits may have a large number of primary inputs. A compact PRTG is designed that drives multiple primary inputs by the same stage of the PRTG when the primary inputs do not have any common combinational successor in the combinational part of the circuit [5], [6]. This technique can improve the test effectiveness in most cases. Similarly, a compressed MISR can also be designed. Two primary outputs can be connected with an XOR gate if they do not have any common combinational predecessor in the combinational part of the test circuit. This scheme does not generate any aliasing. Additionally, the above techniques reduce the number of stages of the PRTG and the MISR greatly, and thus reduce area overhead.

Faults of the extra logic are excluded in order to present fair comparison. Performance of the proposed method scp is presented in Tables I and II in comparison with the BIST schemes *STS* based on the original parallel scan architecture with a single capture cycle for each test cycle and the *illinois scan* [9]. The PS proposed by Rajski *et al.* [21], [22] are combined with both the STS test scheme and the *illinois scan*. The number of scan chains used in the *illinois scan* is the same as that of the proposed method. All results are presented when 500 k clock cycles is used. As for the *illinois scan* [9], scan flip-flops are grouped randomly.

As shown in Table I, *nchains*, FC, *segments*, *nff*, #PI s,  $S_{prtg}$  and  $S_{misr}$  represent the number of scan chains, fault coverage, the number of scan segments in each scan chain, the number of scan flip-flops in each scan segment, the number of primary inputs in the circuit, the size of the PRTG, and the size of the MISR, respectively. The compact PRTG is utilized, where the input reduction scheme is combined. The compressed MISR is also utilized as stated earlier this section.

Table I presents results with no test point. The proposed method scp is compared with the *illinois scan* [9] and the STS test scheme with the long scan chains by combining the PS proposed in [21] and [22]. Our method outperforms the *illinois scan* for all circuits except s3384 and s4863. Compared with the STS test scheme, the proposed method works better for all circuits except the circuits s3384, s5378, and b14. It should be noted that the proposed method does not use any regular PS.

The proposed scan chain partitioning scheme can be thought of as a new PS. However, it is different from the regular PSs, which only makes the signals of different stages independent enough. A new test generation scheme is also proposed in our method, where interdependences among different signals driving different scan segments are avoided. As shown in Table II, *ntp* and *nff* represents the number of test points and the number of scan flip-flops in each scan chain for the STS test scheme with the PS [22], respectively. Table II presents comparison of the proposed method with the PS proposed by Rajski *et al.* [21], [22] on the ISCAS benchmark circuits. Area overhead is estimated based on the cell library class.lib of the Synopsys system.

As for the STS test scheme, scan chain length is set as 10 for all circuits. The STS test scheme utilized the PS proposed in [22] and the conventional test-per-scan BIST scheme. The proposed method generates better fault coverage than the STS test scheme for all circuits except s5378. The main reason should be the new test generation method utilized in the proposed method. The scp method works better for larger circuits with more scan flip-flops according to the results presented in Table II. The proposed scp test scheme is also compared with the *STS* test scheme combined with the PS [22] after test points are inserted into the

| circuits |         |          | illinois scan [10] | STS |        |       |        |       |       |
|----------|---------|----------|--------------------|-----|--------|-------|--------|-------|-------|
| circuits | nchains | segments | #PIs               | nff | S prtg | FC(%) | S misr | FC(%) | FC(%) |
| s1423    | 4       | 2        | 17                 | 10  | 21     | 98.67 | 9      | 96.27 | 98.24 |
| s1512    | 1       | 6        | 29                 | 10  | 30     | 99.45 | 15     | 91.40 | 95.96 |
| s3330    | 7       | 2        | 40                 | 10  | 47     | 92.79 | 65     | 87.77 | 90.59 |
| s3384    | 6       | 3        | 43                 | 10  | 49     | 95.92 | 25     | 96.02 | 96.27 |
| s4863    | 2       | 4        | 49                 | 13  | 23     | 97.60 | 18     | 98.13 | 97.14 |
| s5378    | 6       | 3        | 35                 | 10  | 23     | 97.56 | 33     | 95.72 | 98.09 |
| s6669    | 6       | 4        | 83                 | 10  | 34     | 99.75 | 27     | 98.90 | 99.72 |
| s9234.1  | 7       | 3        | 36                 | 11  | 21     | 88.06 | 25     | 81.38 | 87.03 |
| s13207   | 11      | 6        | 31                 | 11  | 31     | 97.85 | 40     | 93.61 | 96.66 |
| s13207.1 | 10      | 6        | 62                 | 11  | 34     | 98.31 | 58     | 93.69 | 97.17 |
| s15850   | 20      | 3        | 14                 | 10  | 25     | 94.12 | 45     | 88.37 | 93.06 |
| s15850.1 | 13      | 4        | 77                 | 11  | 57     | 94.50 | 65     | 90.86 | 92.90 |
| s35932   | 20      | 8        | 35                 | 11  | 34     | 91.97 | 63     | 91.84 | 91.63 |
| s38417   | 20      | 8        | 28                 | 11  | 32     | 96.57 | 52     | 92.92 | 93.05 |
| s38584   | 20      | 4        | 12                 | 19  | 26     | 96.08 | 47     | 94.0  | 95.42 |
| s38584.1 | 20      | 7        | 38                 | 11  | 34     | 96.23 | 39     | 92.67 | 95.60 |
| b14      | 12      | 2        | 32                 | 11  | 44     | 88.67 | 18     | 85.51 | 88.98 |
| b15      | 20      | 2        | 36                 | 12  | 33     | 86.31 | 27     | 80.70 | 83.34 |
| b20      | 20      | 2        | 32                 | 13  | 52     | 92.54 | 39     | 87.55 | 92.02 |
| b21      | 20      | 2        | 32                 | 13  | 52     | 91.81 | 39     | 87.46 | 91.22 |
| b22      | 20      | 3        | 32                 | 13  | 52     | 92.81 | 40     | 89.15 | 92.55 |

 TABLE I
 I

 Comparison of the Proposed Method With Previous Methods Before Inserting Test Points
 I

TABLE IICOMPARISON WITH THE PS IN [22]

|          | scp           |                      |       |      |                   |       | PS [22] + STS |               |                 |       |                   |      |       |  |
|----------|---------------|----------------------|-------|------|-------------------|-------|---------------|---------------|-----------------|-------|-------------------|------|-------|--|
| circuits | no test point | oint with test point |       |      | area overhead (%) |       |               | no test point | with test point |       | area overhead (%) |      |       |  |
|          | FC(%)         | ntp                  | FC(%) | PRTG | MISR              | ao(%) |               | FC(%)         | ntp             | FC(%) | PRTG+PS           | MISR | ao(%) |  |
| s1423    | 98.67         | 0                    | 98.67 | 8.98 | 10.9              | 27.7  | 10            | 98.30         | 0               | 98.30 | 23.9              | 11.8 | 43.5  |  |
| s1512    | 99.45         | 0                    | 99.45 | 13.4 | 11.6              | 31.2  | 10            | 96.28         | 5               | 97.29 | 32.1              | 12.4 | 50.7  |  |
| s3330    | 92.79         | 5                    | 99.08 | 8.82 | 18.2              | 33.4  | 10            | 91.50         | 5               | 97.73 | 19.4              | 19.1 | 44.9  |  |
| s4863    | 97.60         | 3                    | 99.27 | 3.76 | 4.48              | 12.3  | 10            | 97.54         | 5               | 97.54 | 17.1              | 5.88 | 27.0  |  |
| s5378    | 97.56         | 15                   | 98.99 | 3.62 | 6.98              | 16.6  | 10            | 98.18         | 0               | 98.18 | 13.2              | 8.33 | 27.5  |  |
| s13207   | 97.85         | 0                    | 97.85 | 2.52 | 3.75              | 13.7  | 10            | 97.44         | 0               | 97.44 | 7.40              | 5.85 | 20.7  |  |
| s13207.1 | 98.31         | 0                    | 98.31 | 2.57 | 4.78              | 14.6  | 10            | 97.31         | 0               | 97.31 | 9.43              | 6.95 | 23.6  |  |
| s15850   | 94.12         | 15                   | 97.07 | 1.75 | 3.42              | 11.2  | 10            | 93.64         | 15              | 96.19 | 5.31              | 4.23 | 15.6  |  |
| s15850.1 | 94.59         | 15                   | 97.39 | 2.96 | 4.62              | 13.2  | 10            | 93.48         | 15              | 96.28 | 9.09              | 6.12 | 20.8  |  |
| s35932   | 91.97         | 5                    | 99.84 | 1.79 | 2.77              | 12.1  | 10            | 91.80         | 5               | 99.88 | 5.82              | 5.21 | 18.6  |  |
| s38417   | 96.57         | 15                   | 98.57 | 1.64 | 2.33              | 10.7  | 10            | 95.85         | 15              | 98.05 | 5.04              | 4.43 | 16.2  |  |
| s38584   | 96.08         | 15                   | 97.15 | 0.92 | 1.61              | 8.63  | 10            | 95.46         | 15              | 96.83 | 4.31              | 3.99 | 14.4  |  |
| s38584.1 | 96.23         | 15                   | 97.42 | 1.55 | 1.94              | 9.52  | 10            | 95.94         | 15              | 97.15 | 4.92              | 3.76 | 14.7  |  |

circuit. As shown in Table II, the scp test scheme still outperforms the STS test scheme for all circuits except s35932. The STS test scheme utilizes a 24 stage pseudo random test pattern generator for all circuits like [22]. Additionally, all scan-out signals are connected with the MISR directly. However, the scp test scheme adopts a compressed MISR as stated earlier in this section. The scp test scheme utilizes a little larger PRTG than the STS test scheme. The STS test scheme needs an extra PS to combine with the PRTG. Area overhead details for the PRTG, MISR and the BIST scheme are presented in Table II. It is shown that the proposed scp method needs less hardware overhead for all circuits.

#### VIII. CONCLUSION

A scan chain partitioning scheme was presented to improve the test effectiveness of scan-based BIST. The method can insert capture cycles inside the shift cycles compared with the test scheme using long scan chains. Techniques were introduced to select the number of scan chains and the number of the scan chain segments and to group scan flip-flops. Another good feature of the proposed method is that the percentage of at-speed test can be enhanced compared to the test scheme with long scan chains because capture cycles can be inserted more frequently. The proposed scan architecture can be thought of as a new PS with a new test generation scheme unlike the previous PSs [21], [22]. Sufficient experimental results were presented to compare the proposed method with the popular parallel scan chain architecture combined with the PS in [21] and [22]. Experimental results show that the proposed method works better for almost all circuits on the test effectiveness and area overhead.

#### REFERENCES

- V. D. Agrawal, C. R. Kime, and K. K. Saluja, "A tutorial on built-in self-test, part 1: Principles," *IEEE Design Test Comput.*, vol. 10, no. 2, pp. 73–82, Apr. 1993.
- [2] P. H. Bardell and W. H. McAnney, "Self-testing of multichip logic modules," in *Proc. IEEE Int. Test Conf.*, 1982, pp. 200–204.
- [3] P. H. Bardell, W. H. McAnney, and J. Savir, Built-In Test for VLSI: Pseudo-Random Techniques. New York: Wiley, 1987.
- [4] F. Brglez, "On testability of combinational networks," in *Proc. IEEE Int. Symp. Circuits Syst.*, 1984, pp. 221–225.
- [5] K. Chakrabarty and B. T. Murray, "Design of built-in test generator circuits using width compression," *IEEE Trans. Computer-Aided Design Integr. Circuits Syst.*, vol. 17, no. 10, pp. 1044–1051, Oct. 1998.
- [6] C. A. Chen and S. K. Gupta, "Efficient BIST test pattern generator designs and test set compaction via input reduction," *IEEE Trans. Computer-Aided Design Integr. Circuits Syst.*, vol. 17, no. 8, pp. 692–705, Aug. 1998.
- [7] K. T. Cheng and C. J. Lin, "Timing-driven test point insertion for full scan and partial scan design," in *Proc. IEEE Int. Test Conf.*, 1995, pp. 506–514.
- [8] H. Fujiwara, *Logic Testing and Design for Testability*. Cambridge, MA: MIT Press, 1985.
- [9] I. Hamzaoglu and J. H. Patel, "Reducing test application time for full scan embedded cores," in *Proc. IEEE Int. Symp. Fault-Tolerant Comput.*, 1999, pp. 260–267.
- [10] S. Hellebrand, H. G. Liang, and H. J. Wunderlich, "A mixed mode BIST scheme based reseeding of folding counter," in *Proc. IEEE Int. Test Conf.*, 2000, pp. 778–784.
- [11] S. Hellebrand, J. Rajski, S. Tarnick, S. Venkataraman, and B. Courtois, "Built-in test for circuits with scan based on reseeding of multiple polynomial linear feedback shift registers," *IEEE Trans. Comput.*, vol. 44, no. 2, pp. 223–233, Feb. 1995.
- [12] Y. Huang, I. Pomeranz, S. M. Reddy, and J. Rajski, "Improving the property of at-speed tests," in *Proc. IEEE/ACM Int. Conf. Computer-Aided Design*, 2000, pp. 459–463.
- [13] R. Kapur, S. Patil, T. J. Snethen, and T. W. Williams, "A weighted random pattern generation system," *IEEE Trans. Computer-Aided Design Integr. Circuits Syst.*, vol. 15, no. 8, pp. 1020–1025, Aug. 1996.
- [14] G. Kiefer, H. Vranken, E. J. Marinissen, and H. J. Wunderlich, "Application of deterministic logic BIST on industrial circuits," *J. Electron. Testing: Theory Applicat.*, vol. 17, pp. 351–362, 2001.
- [15] K. Kim, D. Ha, and J. Tront, "On using signature registers as pseudorandom pattern generators in built-in self testing," *IEEE Trans. Computer-Aided Design Integr. Circuits Syst.*, vol. 7, no. 7, pp. 919–928, Jul. 1988.

- [16] S. Kim and B. Vinnakota, "Fast test application technique without fast scan clock," in *Proc. IEEE/ACM Int. Conf. Computer-Aided Design*, 2000, pp. 464–467.
- [17] B. Konemann, J. Mucha, and C. Zwiehoff, "Built-in logic block observation technique," in *Proc. IEEE Int. Test Conf.*, 1979, pp. 37–41.
- [18] K. J. Lee, J. J. Chen, and C. H. Huang, "Using a single input to support multiple scan chains," in *Proc. IEEE/ACM Int. Conf. Computer-Aided Design*, 1998, pp. 74–78.
- [19] C. J. Lin, Y. Zorian, and S. Bhawmik, "Integration of partial scan and built-in self-test," *J. Electron. Testing: Theory Applicat.*, vol. 1–2, pp. 125–137, 1995.
- [20] R. Lisanke, F. Brglez, A. J. Degeus, and D. Gregory, "Testability-driven random test pattern generation," *IEEE Trans. Computer-Aided Design Integr. Circuits Syst.*, vol. 6, no. CAD-11, pp. 1082–1087, Nov. 1987.
- [21] J. Rajski, N. Tamarapalli, and J. Tyszer, "Automated synthesis of large phase shifters for built-in self-test," in *Proc. IEEE Int. Test Conf.*, 1998, pp. 1047–1056.
- [22] —, "Automated synthesis of large phase shifters for built-in self-test applications," *IEEE Trans. Computer-Aided Design Integr. Circuits Syst.*, vol. 19, no. 10, pp. 1175–1188, Oct. 2000.
- [23] N. Tamaramalli and J. Rajski, "Constructive multi-phase test point insertion for scan-based BIST," in *Proc. IEEE Int. Test Conf.*, 1996, pp. 649–658.
- [24] N. Touba and E. J. McCluskey, "Test point insertion based on path tracing," in Proc. of IEEE VLSI Test Symp., 1996, pp. 2–8.
- [25] H. C. Tsai, K. T. Cheng, and S. Bhawmik, "On improving test quality of scan-based BIST," *IEEE Trans. Computer-Aided Design Integr. Circuits Syst.*, vol. 19, no. 8, pp. 928–938, Aug. 2000.
- [26] D. Xiang and H. Fujiwara, "Handling the pin overhead problem of DFT's for high-quality and at-speed test," *IEEE Trans. Computer-Aided Design Integr. Circuits Syst.*, vol. 21, no. 9, pp. 1105–1113, Sep. 2002.
- [27] D. Xiang, S. Gu, J. Sun, and Y. L. Wu, "A cost-effective scan architecture for scan testing with nonscan DFT test power and test application cost," in *Proc. ACM/IEEE Design Automation Conf.*, Anaheim, CA, 2003, pp. 744–747.
- [28] D. Xiang, Y. Xu, and H. Fujiwara, "Non-scan design for testability for synchronous sequential circuits based on conflict resolution," *IEEE Trans. Comput.*, vol. 51, no. 8, pp. 1063–1075, Aug. 2003.



**Dong Xiang** (M'01–SM'04) received the B.S. and the M.S. degrees in computer science from Chongqing University, Chongqing, China, in 1987 and 1990, respectively, and the Ph.D. degree in computer engineering from the Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, in 1993.

He visited Concordia University, Montreal, Canada, as a Postdoctoral Researcher from 1994 to 1995 and the University of Illinois, Urbana–Champaign from 1995 to 1996. He also visited the Nara

Institute of Science and Technology as a JSPS Invitation Fellow from April to September in 2003. He was with the Institute of Microelectronics from October 1996 to March 2003 as an Associate Professor. He is currently a Professor in the School of Software, Tsinghua University. He authored *Digital Systems Testing and Design for Testability* (Beijing, China: Science Press, 1997). His research interests include design and test of digital systems, including design for testability, testing, testability analysis, and built-in self-test, fault-tolerant computing, distributed computing, and computer networking.

Dr. Xiang is a Member of the IEEE Computer Society.



**Ming-Jing Chen** received the B.S. degree in electronic engineering in 2002 from Tsinghua University, Beijing, China, where he is currently working toward the M.S. degree.

His research interests include built-in self-test, design for testability, and low-power testing.



**Jia-Guang Sun** received the B.S. degree in computer science and engineering in Tsinghua University, Beijing, China, 1970.

He is now a Professor in the Department of Computer Science and Technology, and the Dean of the School of Software, Tsinghua University. His research interests include computer graphics, computer-aided design, and computer-aided management.

Prof. Sun is a Member of the Chinese Academy of Engineering.



**Hideo Fujiwara** (S'70–M'74–SM'83–F'89) received the B.E., M.E., and Ph.D. degrees in electronic engineering from Osaka University, Osaka, Japan, in 1969, 1971, and 1974, respectively.

He was with Osaka University from 1974 to 1985, with Meiji University from 1985 to 1993, and then joined Nara Institute of Science and Technology, Nara, Japan, in 1993. In 1981, he was a Visiting Research Assistant Professor at the University of Waterloo, Ontario, Canada, and, in 1984, he was a Visiting Associate Professor at McGill University,

Montreal, Canada. Presently, he is a Professor at the Graduate School of Information Science, Nara Institute of Science and Technology. He is the author of *Logic Testing and Design for Testability* (Cambridge, MA: MIT Press, 1985). His research interests are in logic design, digital systems design and test, VLSI CAD and fault-tolerant computing, including high-level/logic synthesis for testability, test synthesis, design for testability, BIST, test-pattern generation, parallel processing, and computational complexity.

Prof. Fujiwara is an Advisory Member of the *IEICE Transactions on Information and Systems* and an Editor of the IEEE TRANSACTIONS ON COMPUTERS, the *Journal on Electronic Testing*, the *Journal Circuits, Systems and Computers*, the *Journal on VLSI Design*, and others. He received the IECE Young Engineer Award in 1977, the IEEE Computer Society Certificate of Appreciation Award in 1991, 2000, and 2001, the Okawa Prize for Publication in 1994, the IEEE Computer Society Meritorious Service Award in 1996, and the IEEE Computer Society Outstanding Contribution Award in 2001. He is a Golden Core Member of the IEEE Computer Society, a Fellow of the Institute of Electronics, Information and Communication Engineers of Japan (IEICE), and a Member of the Information Processing Society of Japan.