# A Reconfigurable Scan Architecture With Weighted Scan-Enable Signals for Deterministic BIST

Dong Xiang, Senior Member, IEEE, Yang Zhao, Krishnendu Chakrabarty, Fellow, IEEE, and Hideo Fujiwara, Fellow, IEEE

Abstract—We present a new scan-based built-in self-test (BIST) technique, which is based on weighted scan-enable signals and a reconfigurable scan-forest architecture. A testability measure is proposed to guide test pattern generation and produce patterns with few care bits. This approach can effectively reduce the amount of test data that needs to be stored on-chip. The proposed BIST method relies on the pseudorandom and deterministic phases. The scan-forest architecture is configured as a single scan tree for deterministic test vector application in the second phase. It is found that a linear feedback shift register, with size equal to the maximum number of the care bits in the deterministic patterns for the random-resistant faults, is sufficient to encode deterministic vectors for the benchmark circuits. Experimental results for benchmark circuits demonstrate the effectiveness of the proposed method for single stuck-at faults. In addition, experimental results show that the patterns applied to the circuit under test provide more n-detection than those applied by a traditional scan-chain architecture with a single test session.

*Index Terms*—Deterministic built-in self-test (BIST), scanbased BIST, scan forest, weighted scan-enable signals.

#### I. INTRODUCTION

T HE TEST length for scan-based built-in self-test (BIST) [1], [2], [9] is usually determined by the random-patternresistant (hard-to-detect) faults. Test data compression [19] and test response compaction [41] are still two important issues for BIST. As for delay testing, it is much more difficult to reach a complete coverage with deterministic BIST [46]. Test length reduction for the hard-to-detect faults is therefore an important practical problem. The most popular techniques include the following: 1) weighted random testing [7], [16], [18], [31], [34]; 2) test point insertion [51]; and 3) deterministic scan-based BIST [22], [23]. The test application time is also an important issue for scan-based BIST.

Weighted random testing refers to the application of test patterns derived by using primary inputs (PIs) with signal probabilities that are different from 0.5 [7], [43], [49]. This technique helps us to reduce the test length and, thereby, the test

Y. Zhao and K. Chakrabarty are with the Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708 USA.

H. Fujiwara is with the Graduate School of Information Science, Nara Institute of Science and Technology, Nara 630-0192, Japan.

Digital Object Identifier 10.1109/TCAD.2008.923260

application time needed to reach high fault coverage. Pomeranz and Reddy [34] proposed a weighted random test generation method by expansion of tests generated by a deterministic test generator. Three weights 0, 1, and 0.5 are assigned to all stages of the linear feedback shift register (LFSR). The scan cells must be modified for weighted random test vector generation in [43].

Weighted random testing schemes typically store multiple sets of weights, and multiple test sessions are necessary [16], [18], [30], [31], [34], [50]. Muradali et al. [31] proposed a weighted random testing method for scan-based BIST to reduce test time. Weighted random signal probabilities 0, 0.25, 0.5, 0.75, and 1 were assigned to the pseudoinputs of the scan cells. A weighted control logic was used to regulate the proportion of weighted inputs. However, this approach is intrusive because it requires a modification of the scan cells and an insertion of additional logic on functional paths. Jas et al. [16] proposed a hybrid weighted pseudorandom testing scheme to get a complete coverage by using an extension of the technique in [34]. The first phase corresponds to traditional pseudorandom testing, and the second phase includes multiple test sessions. Each test session of the second phase assigns different weight sets to the scan chains, and these weight sets are stored onchip. However, weighted random testing schemes with multiple test sessions may require more complex control logic [16], [18]. We therefore focus on a new weighted random test generation method that needs a simple control logic and relies only on a single test session.

It has been shown that the use of parallel scan chains alone may not be sufficient to obtain high fault coverage with low test time [7], [20]. A well-designed phase shifter (PS) has been shown to increase the fault coverage [38]. However, this approach typically requires a large number of (short) scan chains, as well as area overhead for the PS. Typically, the number of stages in the pseudorandom test pattern generator (PRTG) is fixed, and a well-designed PS [38] can be used to avoid interdependence among the outputs of the PRTG. The storage requirement for a deterministic BIST method with PS and multiple scan chains, however, can still be very large [20], [33], [39].

Complete fault coverage can be obtained when the pseudorandom test generator is modified [10]. A combination of a pseudorandom test generator and a combinational mapping logic was used by Chatterjee and Pradhan [10]. Techniques have also been developed to improve the effectiveness of scanbased BIST by using multiple capture cycle test schemes in [15] and [48]. A hierarchical test set structure called STAR-BIST [47] was proposed based on the fault-clustering phenomena,

Manuscript received March 17, 2007; revised July 24, 2007 and December 11, 2007. This work was supported in part by the National Science Foundation of China under Grant 60373009 and Grant 60573055. This paper was recommended by Associate Editor J. Lach.

D. Xiang is with the School of Software, Tsinghua University, Beijing 100084, China (e-mail: dxiang@tsinghua.edu.cn).

based on which a high-quality and low-cost test generator for BIST was introduced. Lai *et al.* [25] proposed a novel BIST scheme with a scan-chain segmentation. A scan chain is partitioned into multiple segments delimited by inversion elements, whereas the scan-in pin is fed by a single bit stream. Multiple segmentation configurations are however required for a complete coverage. Xiang *et al.* [51] proposed a scan-chain partitioning scheme to replace the PS in scan-BIST, which provided better results than the conventional single test session test scheme (i.e., multiple scan chain with a well-designed PS [38]). A well-constructed test response compactor was also presented for zero aliasing.

Deterministic scan-based BIST schemes usually utilize the large number of unspecified bits in deterministic test vectors. As shown by Koenemann [22], the deterministic test vectors can be encoded into LFSR seeds, whose size is equal to the maximum number of care bits  $S_{max}$  in the deterministic test vectors plus 20. The requirement on the size of the LFSR can be reduced to  $S_{\text{max}} + 4$  by using multiple primitive polynomials [13], [14]. Efficient test generation techniques have been presented to reduce the number of care bits in the deterministic test vectors [14], and test vector concatenation has been proposed to encode multiple patterns with a single seed. Rajski et al. [39] improved encoding efficiency by using variant-length seeds and multiple primitive polynomials. Krishna et al. [24] proposed a new test encoding scheme based on the loading of seeds with size less than the LFSR size; this method incrementally modifies the next seed during the test application. A test encoding scheme with multiple primitive polynomials and various ranks was proposed to compress deterministic test data [21].

Seed ordering and seed encoding were used to reduce the storage requirement in [3], where the test vector concatenation technique was improved significantly. The states of the circuit in separate clock cycles, instead of the state after the shift cycles of separate test cycles, can be used to encode the deterministic test vectors. Li and Chakrabarty [29] proposed a reconfigurable scan architecture to improve the encoding efficiency of deterministic BIST, where the interconnections between the outputs of the LFSR and the inputs of the scan chains can be dynamically reconfigured. These works are closely related to the work in this paper.

A single scan-in signal was used to drive multiple scan chains in [12], [28], and [52], which can compress the test data and reduce the test application time greatly. The scan tree or forest architecture has been proposed to reduce test application cost and test data volume [4], [6], [52], [54]. The reconfigured scan forest constructed a zero-aliasing test response compaction technique by connecting only the leaf scan flip-flops to the test response compactor, where the area overhead of the compactor is well controlled. Test data volume and test application cost can be reduced significantly. The methods in [4], [6], and [54] did not consider routing constraints.

Recently, Pomeranz and Reddy [36] proposed transparent scan, according to which the scan-in and scan-out pins and the scan-enable signal are handled as PIs or primary outputs (POs). An advantage of this technique is that it is not necessary to load test stimuli into all the scan flip-flops. However, some degree of sequential automatic test pattern generation (ATPG) is required. Moreover, the scan enables of the scan chains are assigned (unbiased) pseudorandom values when transparent scan is used for scan-based BIST. As shown in Table II, the fault coverage can be improved significantly if the unbiased pseudorandom values are replaced by weighted random values. Most recently, an efficient pseudorandom test generation scheme in [53] is proposed by assigning separate weighted signals to the scan-enable signals of the scan chains, in which almost all circuits can obtain complete or close-to-complete fault coverage with costeffective hardware overhead and a single session test scheme. The method in [53] uses the multiple scan-chain architecture, where the size of the PS must be appropriately chosen.

The main contributions of this paper include the following: 1) a new pseudorandom test generator; 2) a new PS; and 3) the use of a small LFSR to encode all deterministic test vectors. A new PRTG that uses weighted scan-enable signals is proposed; this pattern generator is based on a reconfigurable scan-forest architecture. The reconfigurable scan forest is constructed by using a routing-driven scheme. The proposed method obtains a much better fault coverage in the pseudorandom testing phase, which effectively compresses the test data for the deterministic BIST phase because of the new scan architecture. The size of the PS is reduced greatly because each stage of the PS drives a scan tree with multiple scan chains instead of a single scan chain [53]; therefore, area overhead is reduced a lot. A new testability measure is proposed to guide test generation for deterministic test vectors with fewer care bits. A new primitive or nonprimitive polynomial selection procedure is proposed, according to which the size of the LFSR is equal to the number of the maximum care bits of the deterministic vectors. This technique can also effectively reduce the test data volume that needs to be kept on-chip [13], [14], [20], [22], [24], [33], [39].

The BIST technique based on weighted scan-enable signals, as described in this paper, is markedly different from [53]. The earlier method in [53] uses a multiple scan chains and a PS architecture. In this paper, a reconfigurable scan forest under routing constraints is used. The area overhead of the PS here is much less because each stage of a PS drives a scan tree instead of a scan chain. The proposed method also uses a new polynomial selection scheme in the second (deterministic) phase; this method is used to select a primitive or nonprimitive polynomial such that deterministic test cubes can be encoded with the smallest possible LFSR. Finally, we present a new testability measure to guide test generation such that deterministic test vectors with fewer care bits can be obtained. Weighted scanenable signals are assigned to separate scan chains to improve the test effectiveness of scan-based BIST in the pseudorandom testing phase.

The rest of this paper is organized as follows. The scanbased BIST architecture based on a reconfigurable scan forest is described in Section II. The new scan architecture using a scan forest and weighted scan-enable signals is introduced in Section III. The new testability measure is introduced in Section IV to guide test generation for test vectors with fewer care bits. Details for the seed encoding procedure are presented in Section V. Experimental results are given in Section VI. Section VII concludes this paper.



Fig. 1. Weighted scan-enable signals for scan-forest-based BIST.

# II. RECONFIGURABLE SCAN ARCHITECTURE FOR DETERMINISTIC BIST

In this section, we describe the scan-based BIST architecture for the proposed two-phase BIST method. As shown in Fig. 1, the scan-forest architecture is used for pseudorandom testing in the first phase. Each stage of the PS drives multiple scan chains, where all the scan flip-flops at the same stage for all scan chains are driven by the same stage of the PS. The scanforest architecture of [52] is used, which is in contrast to the multiple scan-chain architecture used in [53]. This technique can greatly reduce the size of the PS compared with the multiple scan-chain architecture where each stage of the PS drives a scan chain. Therefore, the area overhead due to the PS can be reduced significantly.

For any scan tree with scan chains  $(v_{1,1}, v_{1,2}, \ldots, v_{1,n})$ ,  $(v_{2,1}, v_{2,2}, \ldots, v_{2,n}), \ldots, (v_{k,1}, v_{k,2}, \ldots, v_{k,n})$ , scan flip-flops in the groups  $(v_{1,1}, v_{2,1}, \ldots, v_{k,1})$ ,  $(v_{1,2}, v_{2,2}, \ldots, v_{k,2}), \ldots$ ,  $(v_{1,n}, v_{2,n}, \ldots, v_{k,n})$  do not have any common combinational successor. The test response compactor is an XOR gate network. Two scan chains  $(a_1, a_2, \ldots, a_n)$  and  $(b_1, b_2, \ldots, b_n)$  are connected to the same XOR gate if the scan flip-flop pairs  $(a_1, b_1), (a_2, b_2), \ldots$ , and  $(a_n, b_n)$  do not have any common combinational predecessors, respectively. This technique can greatly reduce the size of the multiple input signature analyser (MISR) and, therefore, the area overhead.

Some circuits have many POs. For example, the ISCAS89 benchmark circuits s35932, s38417, s38584, s15850, and s13207 have 320, 106, 278, 87, and 121 POs, respectively. It is expensive if all the POs are connected to the MISR directly, which makes the size of the MISR very large [42], [51]. As shown in Fig. 1, our method does not connect the POs of the circuit to the MISR directly but to the compactor in order to reduce the size of the MISR. Two POs can be connected to the

same XOR gate if they do not have any common combinational predecessor. POs  $PO_1, PO_2, \ldots, PO_k$  can be connected to the same XOR tree if any pair of the POs does not have any common combinational predecessor. The size of the MISR can thus be reduced to some degree.

The proposed BIST scheme uses separate scan-enable signals to drive the scan chains. We use the set of weights  $\{0.5, 0.625, 0.75, 0.875\}$  to control the scan-enable signals. These weights are easy to generate by using on-chip hardware [7]. In the pseudorandom testing phase, the PIs are directly driven by the PS. Each of the PIs is driven by a scan flip-flop of a scan chain in the deterministic BIST phase in order to reduce the number of care bits for the deterministic test vectors. The PIs are also grouped with all the scan flip-flops. A PI can be included into a scan flip-flop group if PI does not have any combinational successor with any of the scan flip-flops in the group. Let the scan flip-flops in the group be at the kth level of the scan tree. The PI in that group can be driven by any scan flip-flop at the (k-1)th level in the scan tree. This can be implemented by using a multiplexer for each PI, as shown in Fig. 1, where the multiplexer is controlled by the signal p. The PS is connected to the PI when p = 0 (pseudorandom testing phase), and the PI is connected to a scan tree when p = 1(deterministic BIST phase).

As shown in Fig. 2, one extra multiplexer is inserted before the scan-in signal of each scan tree. The signal p is used for test phase selection, and it is the same as in Figs. 2 and 3. The pseudorandom testing phase corresponds to p = 0, and the deterministic BIST phase is used when p = 1. The scan trees receive test signals from the PS during the pseudorandom testing phase. The PS drives the scan trees and the PIs only in the pseudorandom testing phase. The scan-forest architecture is reconfigured to a single scan tree during the deterministic BIST phase, where all PIs are driven by some internal scan For a single scan-chain architecture, the number of clock cycles needed to shift out test responses of the previous deterministic test vector is much more, which is equal to the number of scan flip-flops in the circuit. Usually, a small LFSR constructed by a primitive polynomial is sufficient when a well designed PS is adopted in the

mial is sufficient when a well-designed PS is adopted in the pseudorandom testing phase. In our method, a combination of a 24-stage LFSR and the PS from [38] is used to generate test patterns in the pseudorandom testing phase. The size of the LFSR needed for the deterministic BIST depends on the maximum number of care bits for any deterministic test vector. While two different LFSRs can be used for the two phases, a composite LFSR design can be used, as shown in Fig. 3. The two designs are essentially equivalent.

the depth of the scan forest in the pseudorandom testing phase).

For any degree less than 128, it is computationally feasible to generate enough primitive polynomials in reasonable time, out of which one (whose degree is equal to the maximum number of care bits in the deterministic vectors) can be selected to encode all deterministic test vectors. The tool that we used to generate primitive polynomials can only handle polynomials up to 128 degree of the word-length limit of the computer [37]. A nonprimitive polynomial can be selected to construct the LFSR for the deterministic Vectors are too large (greater than 128 in this paper); in this way, all the deterministic vectors can still be encoded. Nevertheless, we attempt to use only primitive polynomial can introduce significant area overhead.

The same signal p as the one shown in Fig. 1 is used to switch between two XOR feedback networks. The XOR network-1 is used for the pseudorandom testing phase, whereas the XOR network-2 is used for the deterministic BIST phase. All seeds stored in the ROM are shifted in from the scan-in signal. The overhead needed to implement the LFSR with two polynomials involves only one additional XOR network and a multiplexer. Only a few two-input XOR gates are needed if a primitive polynomial with very few terms is used to generate pseudorandom test vectors.

Furthermore, in the deterministic BIST phase, instead of using XOR network-2, we can also use multiple primitive polynomials with different degrees to encode the deterministic vectors. A 2-bit identifier for each primitive polynomial can be used to identify the primitive polynomials and stored as extra test data with each loaded seed. A decoder can be used to select the corresponding XOR network (instead of XOR network-2) as per the 2-bit identifier.

An effective seed encoding scheme is used here to reduce the storage requirements for the deterministic test patterns for the random-resistant faults. The test responses of each test vector are shifted out in k clock cycles, where k is the depth of the scan forest. The test responses are captured again when the state of the scanned circuit is compatible with any other deterministic test vector.

Let us consider the construction of the reconfigurable scan forest. Assume that the number of scan flip-flops at each level l and depth d of the reconfigurable scan forest is known. Giving the scan-in pin of a scan tree, l scan flip-flops are selected among all the scan flip-flops that can be driven by





Fig. 3. LFSR design for the two phases.

flip-flops in order to reduce the number of test inputs and the number of care bits in the deterministic test vectors for the hard faults.

In the deterministic BIST phase, the scan tree receives test vectors from the scan-in signal of the rightmost scan tree (see Fig. 2). The scan-in signal of the rightmost scan tree is connected to a stage of the LFSR directly in the deterministic BIST phase. The dashed lines shown in Fig. 2 connect the scan-in signals of the scan trees with output of the last scan flip-flop in one of the scan chains in its right scan tree. The outputs of all scan chains are still connected to the compactor during the deterministic BIST phase; this offers additional flexibility for test encoding. The test responses of the previous test vector can be shifted out with only a few clock cycles (corresponding to





Fig. 4. Scan chain with a weighted scan-enable signal.

the same test signal. This selection is carried out to minimize the total interconnect length. The routing overhead can be easily estimated by using tools such as Astro from Synopsys [5]. Experimental results reported in this paper were obtained by using the Astro tool. All the scan flip-flops driven by the same test signal meet the following condition. Each pair of the scan flip-flops that is driven by the same test signal has no combinational successor in the circuit [54]. Each scan flipflop p in the first level is connected to a scan flip-flop f at the second level that has the minimum distance from p among all the scan flip-flops that can be driven by the same test signal, where all the scan flip-flops at the second level have no common combinational successor. Continue the aforementioned process until the reconfigurable scan forest has been constructed. It is not necessary for the scan flip-flops at the same level of the same scan tree to be in the neighborhood. Experimental results on the interconnect overhead for the scan forest designed by using the aforementioned heuristic are presented in Section VI.

# III. SCAN-BASED BIST USING SCAN FOREST AND WEIGHTED SCAN-ENABLE SIGNALS

The *i* controllability  $C'_i(l)$   $(i \in \{0, 1\})$  of a node *l* is defined as the probability that a randomly selected input vector sets *l* to the value *i*. The observability O'(l) is defined as the probability that a randomly selected input vector propagates the value of *l* to a PO. The signal probability of a node is defined in the same manner as its one-controllability measure.

In the scan-based BIST architecture, as shown in Fig. 1, different weights  $e_1, e_2, \ldots$ , and  $e_k$  are assigned to the scanenable signals of the scan chains  $SC_1, SC_2, \ldots$ , and  $SC_k$ , respectively, where  $e_1, e_2, \ldots, e_k \in \{0.5, 0.625, 0.75, 0.875\}$ . We do not assign weight values less than 0.5 to the scanenable signals because we do not want to insert more capture cycles than scan shift cycles. We have developed an efficient method to select weights for the scan-enable signals of the scan chains. The selection of weights for the scan-enable signals is determined by the following testability gain function:

$$G = \sum_{l/i \in F} \frac{|C_1'(l) - C_0'(l)|}{O'(l)}$$
(1)

where l/i represents the stuck-at i ( $i \in \{0, 1\}$ ) fault at line l. In (1), F is the random-resistant fault set, which is defined as the set of faults whose detection probability is no more than ten times that of the hardest fault [7]. Note that our method does not consider redundant faults according to the COP measure when selecting weights. We attempt to minimize the testability gain function as given in (1).

Fig. 4 shows a scan chain with a weighted scan-enable signal, where  $S_{in}$ ,  $S_{out}$ , and test are the scan-in signal, the scan-out signal, and the scan-enable signal of the scan chain, respectively. Initially, all pseudo-PIs (PPIs) are assigned with a signal probability of 0.5, and the observability of the pseudo-POs (PPOs) is set to 1/n. Let p be the selected weight of the scan-enable signal, as shown in Fig. 4. Then

$$C'_{1}(\text{PPI}_{i}) = p \cdot C'_{1}(a_{i-1}) + (1-p) \cdot C'_{1}(\text{PPO}_{i}).$$
(2)

The observability of  $PPO_i$  can be estimated as follows:

$$O'(\operatorname{PPI}_i) = (1-p) \cdot O'(a_i) \tag{3}$$

$$O'(a_i) = 1 - (1 - O'(b_i)) \cdot (1 - O'(\text{PPI}_i))$$
(4)

$$O'(b_{i-1}) = p \cdot O'(a_i).$$
 (5)

We set the observability of the scan-out signal to one. Even though the output of a scan chain is connected to the test response compactor, we can make the aliasing negligible by carefully connecting the scan chains to the XOR gates, as described in Section II. We also have  $O'(a_n) = 1$  and  $C'_0(S_{in}) =$  $C'_1(S_{in}) = 0.5$ . Testability measures of the internal nodes and the PPIs and the PPOs can be calculated iteratively using the COP measures in (2)–(5). We find that the testability measures for all nodes in the benchmark circuits converge within a few iterations.

As shown in Fig. 4, the BIST control logic assigns weighted signals to the scan-enable signals. In a functional mode, the extra pin test is assigned with a value of zero. The circuit is set to the test mode when test is set to a value of one. In this case, the selected weight is assigned to a scan chain. The scan chain works in the shift mode when the weighted scan-enable signal is one, and it works under the capture mode when the scan-enable signal is set to a value of zero. The weighted signals are produced by a PS [7], the details of which are presented later in this section. Only one extra pin is necessary in the scan-based BIST design.

We assume that each scan chain uses separate scanenable signals. We assign weights from the set  $\{0.5, 0.625, 0.75, 0.875\}$  to the scan-enable signals of the scan chains. In the weight selection procedure, S is the scan-chain set, and SC is a specific scan chain. Initially, S contains all scan chains in the circuit. The procedure to determine weights can be described as follows. First, all scan chains use the regular test-per-scan scan-enable signals. That is, the scan-enable signals are set to



Fig. 5. Generating weighted scan-enable signals for scan-forest-based BIST.

one in scan shift cycles and set to zero in capture cycles. A test cycle for the original scan-enable signals contains k scan shift cycles followed by one capture cycle, where k is the length of the longest scan chain. Our method selects a weight for the first scan-chain scan-enable signal to minimize the gain function as presented in (1). After the best weight has been selected for the first scan chain, we select a weight for the scan-enable signal of the second scan chain that minimizes the cost function in (1). For each scan chain, if no weight can be selected, we leave its scan-enable signal as the one in conventional test-per-scan BIST scheme (the number of shift cycles is equal to the length of the scan chains, and a capture cycle follows). Continue the aforementioned process until appropriate weights have been chosen for all scan-enable signals of the scan chains.

Different weights can be obtained by connecting two or more pseudorandom signals. As shown in Fig. 5, a signal with a signal probability of 0.75 can be obtained from the output of a two-input OR gate, whose inputs are pseudorandom signals with a signal probability of 0.5. A signal with a signal probability of 0.875 can be obtained from the output of a three-input OR gate, whose inputs are pseudorandom signals with a signal probability of 0.5. As shown in Fig. 5, all weights are connected to a multiplexer which is controlled by the signal p. The selected weight is connected to the scan-enable signal of a scan chain when p = 0 in the pseudorandom testing phase. The signal test2 is connected to the scan-enable signals of all scan chains during the deterministic BIST phase, where the scan forest is reconfigured as a scan tree (i.e., d shift cycles followed by a capture cycle, where d is the depth of the scan tree in the deterministic BIST phase).

The weights for the scan-enable signals of the scan chains are determined exactly once. That is, the weights do not need to be updated during the test application. The extra logic to generate the weights for the test scheme consists of only nine gates, as shown in Fig. 5; hence, the overhead is negligible compared with previously published weighted test pattern generators. As shown in Fig. 5, four extra AND gates are connected with different weights, respectively, where the four two-input AND gates are connected to the test and weighted signals. The circuit is in normal functional mode when test is set to zero, and weights are assigned to the scan chains when test is set to one. The circuit is thus in the test mode when test is set to one.

select-weight-for-scan-enables()

- {
  - 1) Assign the same values to the scan-enable signals as the regular test-per-scan BIST scheme to all scan chains.
- 2) While the scan chain set  $S \neq \emptyset$ , do
  - a) Select a scan chain SC from the scan chain set S,  $S = S - \{SC\}.$
  - b) Assign each weight in  $\{0.5, 0.625, 0.75, 0.875\}$  to the scan enable signal of scan chain SC, testability estimation is adopted to evaluate the cost function as presented in Equation (1).
  - c) Select the best weight  $w \in \{0.5, 0.625, 0.75, 0.875\}$  for the scan chain *SC* that makes the cost function as presented in Equation (1) minimum.
- 3) For each scan chain, if no weight can be selected, simply leave its scan-enable signal as the one in the conventional test-per-scan test scheme.

}

Fig. 6. Procedure to select different weights for the scan-enable signals of the scan chains.

Let all scan chains be assigned with separate scan-enable signals. We consider assigning one of the following weights  $\{0.5, 0.625, 0.75, 0.875\}$  to the scan-enable signals of the scan chains. In the weight selection procedure, as shown in Fig. 6, *S* is the scan-chain set, and SC is a specific scan chain. Initially, *S* contains all scan chains in the circuit, and the scan-enable signals of all scan chains are assigned that of the regular testper-scan test scheme. Controllability of the PPI of the *i*th scan flip-flop in a scan chain is set to 0.5, and observability of the PPO of the *i*th scan flip-flop is set to 1/k. Iterative testability estimation is adopted for all nodes based on (2)–(5) and the COP measure. It is found that testability measures for all nodes can be stable after quite a few rounds of testability calculation.

The procedure, as shown in Fig. 6, to determine the weights is as follows. First, all scan chains use the common test-perscan scan-enable signals. That is, the scan-enable signals are set to one in scan shift cycles and set to zero in capture cycles. A test cycle for the original scan-enable signals contains kscan shift cycles followed by one capture cycle, where k is the length of the longest scan chain. Our method selects a weight for the scan-enable signal of the first scan chain to minimize the cost function, as presented in (1). After the best weight has been selected for the first scan chain, our method selects the best weight for the scan-enable signal of the second scan chain that minimizes the cost function in (1). For each scan chain, if no weight can be selected, we leave its scan-enable signal as the one in a conventional test-per-scan BIST scheme. Continue the aforementioned process until appropriate weights have been chosen for all the scan-enable signals of the scan chains.

### IV. TESTABILITY MEASURE TO GUIDE TEST PATTERN GENERATION WITH FEWER SPECIFIED BITS

All PPIs corresponding to the same scan flip-flop group are merged into a single PPI. As shown in Fig. 1, each of the PIs shares the same PPI with the scan flip-flops in the same group. This technique reduces the number of test inputs, and it also reduces the number of care bits in the deterministic test vectors for the hard faults left after the pseudorandom testing phase.

Another simple SCOAP-like testability measure is presented to guide test pattern generation such that we obtain test vectors with even fewer care bits. The new testability measure includes the impact of reconvergent fan-outs. However, the SCOAP measure handles all inputs of a gate independently. The following additional definitions are introduced first. Let icontrol reachability  $RC_i(l)$  be defined as the minimum set of PIs (or PPIs) that have to be specified in order to set line l to value i, where  $i \in \{0, 1\}$ . The controllability  $C_i(l)$  is defined as the minimum number of PIs (or PPIs) that must be specified in order to place a control value  $i \in \{0, 1\}$  on line l. Let l be a PI. Equations (6)–(15) present the controllability calculation of the measure. We have

$$\mathbf{RC}_{1}(l) = \mathbf{RC}_{0}(l) = \{l\}$$
(6)

$$C_1(l) = C_0(l) = 1. (7)$$

For an AND gate *l* with inputs *A* and *B*, we have

$$\mathbf{RC}_1(l) = \mathbf{RC}_1(A) \cup \mathbf{RC}_1(B) \tag{8}$$

$$C_1(l) = |\mathbf{R}\mathbf{C}_1(l)| \tag{9}$$

where  $|\mathbf{RC}_1(l)|$  is the size of the set  $\mathbf{RC}_1(l)$ . The zero-control reachability function can be calculated as follows:

$$\mathbf{RC}_{0}(l) = \begin{cases} \mathbf{RC}_{0}(A), & \text{if } |\mathbf{RC}_{0}(A)| \le |\mathbf{RC}_{0}(B)| \\ \mathbf{RC}_{0}(B), & \text{if } |\mathbf{RC}_{0}(A)| > |\mathbf{RC}_{0}(B)|. \end{cases}$$
(10)

The zero-controllability measure of l can be calculated as follows:

$$C_0(l) = |\mathbf{R}\mathbf{C}_0(l)|.$$
(11)

For an OR gate l with inputs A and B

$$\mathbf{RC}_0(l) = \mathbf{RC}_0(A) \cup \mathbf{RC}_0(B) \tag{12}$$

$$C_0(l) = |\mathbf{R}C_0(l)|.$$
 (13)

The one-control reachability function at the output of the two-input OR gate can be calculated as follows:

$$\operatorname{RC}_{1}(l) = \begin{cases} \operatorname{RC}_{1}(A), & \text{if } |\operatorname{RC}_{1}(A)| \leq |\operatorname{RC}_{1}(B)| \\ \operatorname{RC}_{1}(B), & \text{if } |\operatorname{RC}_{1}(A)| > |\operatorname{RC}_{1}(B)|. \end{cases}$$
(14)

The one-controllability of the output of a two-input OR gate can be obtained as follows:

$$C_1(l) = |\mathbf{R}C_1(l)|.$$
 (15)

For a fan-out s with branches  $B_1, B_2, \ldots, B_k$ , for  $i \in \{0, 1\}$ , and  $j \in \{1, 2, ..., k\}$ , we have

$$C_i(B_j) = C_i(s).$$

The controllability calculation of the testability measure for other gates is similar. The function RO(l) is defined as the smallest set of PIs (or PPIs) that need to be assigned with specified values in order to propagate the fault effect at l either to a PO or to a PPO. The observability O(l) is defined as the minimum number of PIs (or PPIs) that need to be assigned with specified values in order to propagate the fault effect at *l* either to a PO or to a PPO. For a PO (or a PPO) *l*, we have O(l) = 0, and  $RO(l) = \emptyset$ . Equations (16)–(19) present observability calculation of the new testability measure. By considering an AND gate l with inputs A and B, we have

$$\mathbf{RO}(A) = \mathbf{RO}(l) \cup \mathbf{RC}_1(B) \tag{16}$$

$$O(A) = |\mathbf{RO}(A)|. \tag{17}$$

Let l be the output of an OR gate with inputs A and B. We have

$$\operatorname{RO}(A) = \operatorname{RO}(l) \cup \operatorname{RC}_0(B)$$
(18)  
$$O(A) = |\operatorname{RO}(A)|.$$
(19)

$$\mathcal{D}(A) = |\mathrm{RO}(A)|\,.\tag{19}$$

Let s be a fan-out stem with fan-out branches  $B_1, B_2, \ldots, B_k$ . Observability measure of the fan-out stem is calculated as follows:

$$O(F) = \min(O(B_1), O(B_2), \dots, O(B_k))$$

The observability calculation for other gate types is similar. Let l/i be a single stuck-at fault at l for  $i \in \{0, 1\}$ . We define the detectability D(l/i) as the minimum number of PIs (PPIs) that need to be specified in order to generate a test pattern for the fault. Therefore

$$D(l/i) = \left| \operatorname{RO}(l) \cup \operatorname{RC}_{\overline{i}}(l) \right|.$$
(20)

The detectability measure presented in (20) can be used to guide test point insertion in order to reduce the number of maximum care bits in the deterministic test vectors of the hard faults. Test point selection is used to reduce the maximum detectability measure of the hard faults in order to reduce the test data volume. The aforementioned testability measure is used to guide the test generation for test vectors with fewer care bits. For example, any one of the inputs can be specified to zero in order to set the output of a two-input AND gate to a value of zero. Our method selects the input with less zerocontrollability in this case. A fault effect at a fan-out stem can be propagated along any fan-out branch. Our method selects the fan-out branch with the least observability to propagate the fault effect. Test generation guided by the testability measure can thus obtain test vectors with fewer care bits. This is shown in the experimental results presented in Section VI.

## V. SEED ENCODING FOR DETERMINISTIC BIST WITH LOW STORAGE REQUIREMENT

We select either one primitive polynomial for deterministic seed encoding, which encodes all deterministic test vectors of hard-to-test faults. We also present a new technique to encode the deterministic vectors for hard faults. Finally, we present a procedure to select a nonprimitive polynomial that encodes all deterministic vectors when the number of care bits is too large.

#### A. Polynomial Selection for Deterministic BIST

A well-designed LFSR must be constructed in order to encode all deterministic vectors after the pseudorandom testing phase. A new procedure is proposed to select a primitive polynomial (or multiple primitive polynomials) with the minimum degree that can encode all deterministic test vectors for the hard faults as shown in Fig. 7. An efficient procedure is used to generate primitive polynomials of any desired degree (not more than 128). For any  $i \leq 128$ , the primitive polynomials are primitive-polynomial-selection()

{

- 1) Check all deterministic vectors and get the maximum number of specified bits  $S_{max}$ . Let all primitive polynomials with degree *i* be kept in  $Q_i$ ;  $i = S_{max}$ ;
- 2) for each  $p \in Q_i$ , do
  - a) check whether the LFSR constructed by *p* can encode all deterministic vectors;
  - b) continue the above process until finding a primitive polynomial *p* that encode all deterministic test vectors; exit.
  - c) i = i + 1 if no primitive polynomial with degree i has been found.
- 3) Return the selected primitive polynomial p.

```
}
```

Fig. 7. Procedure to select a primitive polynomial that encodes all deterministic test vectors.

stored in the  $Q_i$ . The following procedure returns a primitive polynomial with the minimum degree that encodes all deterministic vectors for the random-resistant (hard) faults.

Let the maximum number of care bits for the deterministic vectors be  $S_{\rm max}$ . The procedure as presented in Fig. 7 first checks all primitive polynomials of degree  $S_{\rm max}$ . Next, all primitive polynomials with degree  $S_{\rm max} + 1$  are checked. Continue the aforementioned process until a primitive polynomial that encodes all deterministic test vectors is found. Experiments show that very little CPU time is needed to check whether a primitive polynomial can encode all deterministic test vectors. For the one-detection criterion, i.e., every stuck-at fault is to be detected at least once, we find that deterministic vectors for the hard faults for all benchmark circuits can be encoded by a primitive polynomial with a degree equal to  $S_{\rm max}$ .

We can also use a nonprimitive polynomial to encode all deterministic vectors when the number of care bits is too large, where a primitive polynomial is unable to produce by using the tool in [37] when the maximum number of care bits for the deterministic test vectors of the hard faults is too large. Recently, Kagaris [17] studied to implement pseudoexhaustive testing with nonprimitive irreducible polynomials. The LFSR, as shown in Fig. 3, is used for two phases of deterministic BIST. In the first phase, the LFSR constructed by a primitive polynomial of 24 degree is used to generate random test vectors. Therefore, it is not necessary to use a primitive polynomial for the deterministic BIST if all deterministic vectors can be encoded by the nonprimitive polynomial. Let the maximum number of care bits of all deterministic test vectors be  $S_{\text{max}}$ . The procedure in Fig. 8 randomly selects a nonprimitive polynomial of degree  $S_{\text{max}}$  and checks whether the selected nonprimitive polynomial encodes all deterministic vectors. If not, select another nonprimitive polynomial. Continue the aforementioned process until a given number C (C is set to 1000 in this paper) of nonprimitive polynomial has been selected. Try a nonprimitive polynomial of degree  $S_{\text{max}} + 1$ , and so on, in a systematic way until a nonprimitive polynomial that encodes all the deterministic test vectors has been found. The number of nonzero terms of the selected nonprimitive polynomial determines the area overhead of the LFSR. Our method can constrain the number of nonzero terms in the selected nonprimitive polynomial. In most cases, a nonprimitive polynomial with degree non-primitive-polynomial-selection()

| 1) Check all deterministic vectors and get the maximum num-                                      |
|--------------------------------------------------------------------------------------------------|
| ber of specified bits $S_{max}$ , $i \leftarrow S_{max}$ ; $f \leftarrow 1$ , $Q \leftarrow 0$ ; |
| 2) while $f = 1$ , do                                                                            |
| a) Randomly select an <i>i</i> degree non-primitive polynomial                                   |
| $p, Q \leftarrow Q+1$ ; check whether the LFSR constructed by                                    |
| p can encode all deterministic vectors, if so, $f \leftarrow 0$ ;                                |
| b) if $f = 1$ and Q is greater than a given constant C,                                          |
| $Q \leftarrow 0,  i \leftarrow i+1.$                                                             |
| 3) Return the selected primitive polynomial $p$ .                                                |
|                                                                                                  |

Fig. 8. Procedure to select a nonprimitive polynomial that encodes all deterministic test vectors with a large number of care bits.

 $S_{\text{max}}$  can be found to encode all the deterministic test vectors after a very few number of trials.

For any circuits with deterministic test vectors that contain more than 128 care bits, a nonprimitive polynomial can be selected to encode all deterministic vectors based on the procedure, as shown in Fig. 8. Usually, a randomly selected polynomial with greater than 128 degree can encode all deterministic test vectors after a very small number of trials according to our experience. The selected polynomial can be a primitive or a nonprimitive polynomial in this case. Therefore, a polynomial that encodes all deterministic vectors can be selected in trivial time in all cases. The two LFSR architectures, as shown in Fig. 3, work well because a 24-stage LFSR constructed by a primitive polynomial generates pseudorandom test patterns in the pseudorandom testing phase. The nonprimitive polynomial constructed LFSR is only used to encode the deterministic test vectors of the random-resistant faults. Experimental results are presented to show in Section VII that a primitive polynomial to encode all deterministic test vectors can be selected in trivial time.

#### B. Seed Encoding Techniques

The test responses for a deterministic test vector must be shifted out by using L clock cycles, where L is the number of scan flip-flops in the longest scan chain. The reconfigurable scan architecture, as shown in Fig. 2, presents significant flexibility for seed encoding for the deterministic vectors. The test responses captured by the scan flip-flops can be shifted out in only  $L_1$  clock cycles, where  $L_1$  is the depth of the scan forest in the pseudorandom testing phase. For a single scan-chain architecture, the number of clock cycles needed to shift out test responses for the previous deterministic test vector is much more-it is equal to the number of scan flipflops in the circuit. The test responses are captured again when the state of the scanned circuit is compatible with any other deterministic test vector. The number of shifted clock cycles between two loaded seeds can be from  $2^{10}$  up to  $2^{15}$ . When we use multiple primitive polynomials with different degrees to encode all deterministic vectors, we need two additional bits of test data for each loaded seed to identify the primitive polynomial encoding this loaded seed.

Some additional test data are necessary to record the encoded seeds between two loaded seeds. These additional test data contain two parts. The first part has  $i_1$  bits, which represent the number of shift test cycles. Each of the  $i_1$  shift test cycles

contains  $L_2$  clock cycles, where  $L_2$  is the length of the scan tree in the deterministic BIST phase. The second part of the extra test data contains  $i_2$  bits, which record the number of shift clock cycles in the last shift test cycle. At this point, the status of the circuit is compatible with the encoded seed. By considering that the length of the scan tree in the deterministic BIST phase is 128,  $i_2 = 7$ , and  $i_1$  is not more than eight for any case.

The seed for the test vector with the largest number of care bits is first loaded into the LFSR. Test responses of the loaded seed are captured in the scan flip-flops. Our method begins to check the status of the scan tree whether it is compatible with any remaining deterministic test vector after the  $L_1$  shift cycles, where  $L_1$  ( $L_1 \ll L_2$ ) is the depth of the scan forest in the first phase. The test responses are captured again after a compatible state is found. The loaded seed can be reloaded in order to cover more remaining deterministic vectors. Our method loads the seed with the most number of care bits among the remaining seeds. Continue the aforementioned process until all seeds have been encoded or loaded. The number of shift clock cycles between the loaded and the encoded seeds is not more than  $2^{15}$  in all cases.

#### VI. EXPERIMENTAL RESULTS

The proposed method has been implemented and run on a Blade2000 workstation. The pseudorandom testing phase is used with the scan-forest scan architecture, and separate weighted scan-enable signals are assigned to the scan chains. The scan-forest architecture is reconfigured as a single scan tree in the deterministic BIST phase. The results for one detection, i.e., a stuck-at fault is detected at least once, are presented in Section VI-A. Results for *n*-detection evaluation are presented in Section VI-B, where each stuck-at fault is detected either by at least *n* different test vectors or by up to *n* different deterministic test vectors if it is not covered by the pseudorandom test vectors.

#### A. Results for One Detection

In this section, the ATALANTA [26] test generator is used to generate deterministic test vectors for the hard faults. The ATALANTA is modified from the FAN algorithm [11]. Table I presents results for the proposed method using the PROOFS fault simulator [32] on the ISCAS-89 [8] and ISCAS-93 circuits. The encoding method proposed in Section V is used to encode deterministic vectors. The column "LFSR" in Table I presents the size of the LFSR for the deterministic BIST phase. The size of the LFSR in the pseudorandom testing phase is set to 24 for all circuits. Column 2 (FC) presents fault coverage of the proposed new pseudorandom test scheme after 500k clock cycles. The columns NDTV and  $S_{\max}$  denote the number of deterministic test vectors and the maximum number of care bits in the deterministic test patterns, respectively. Note that the fault coverage reported in this paper is obtained by dividing the number of detected single stuck-at faults by the total number of faults (including redundant faults) in the circuit.

The columns corresponding to sts present the pseudorandom testing results obtained by using multiple scan chains (scanchain length is set to 10) with the PS in [38] after 500k clock cycles. The fault coverage for the proposed method is also

 TABLE I

 PERFORMANCE EVALUATION OF THE PROPOSED TECHNIQUE

| [       |          | Prop      | osed me | thod |       | sts       |      |
|---------|----------|-----------|---------|------|-------|-----------|------|
| Circuit | FC       | $S_{max}$ | NDTV    | LFSR | FC    | $S_{max}$ | NDTV |
|         | (phase1) |           |         |      |       |           |      |
| s1423   | 99.22    | 22        | 2       | 22   | 98.57 | 23        | 3    |
| s5378   | 99.3     | 16        | 2       | 16   | 98.65 | 17        | 18   |
| s9234   | 91.1     | 36        | 123     | 36   | 88.3  | 51        | 390  |
| s13207  | 98.48    | 18        | 14      | 18   | 97.74 | 17        | 38   |
| s15850  | 95.29    | 33        | 154     | 33   | 93.77 | 38        | 209  |
| s38417  | 98.79    | 44        | 163     | 44   | 95.93 | 86        | 763  |
| s38584  | 96.45    | 54        | 44      | 54   | 96.01 | 54        | 80   |
| s1512   | 96.58    | 30        | 23      | 30   | 96.22 | 25        | 31   |
| s3271   | 99.98    | 18        | 1       | 18   | 99.66 | 19        | 5    |
| s3330   | 97.33    | 55        | 54      | 55   | 91.69 | 55        | 125  |
| s3384   | 97.87    | 48        | 43      | 48   | 97.4  | 48        | 50   |
| s4863   | 100      | -         | -       | -    | 97.23 | 20        | 19   |

 TABLE
 II

 CPU TIME AND ROM STORAGE FOR THE PROPOSED TECHNIQUE

|         |           |      | Pro  | oposed : | method    |        | sts  | RSE   | [14]  |
|---------|-----------|------|------|----------|-----------|--------|------|-------|-------|
| Circuit | FC        | CPU3 | CPU1 | CPU2     | Number of | ROM    | CPU1 | FC    | FC    |
|         | (Phase 1) | (s)  | (s)  | (s)      | seeds     | (bits) | (s)  |       |       |
| s1423   | 99.22     | 0.3  | 85.7 | 0.02     | 2         | 37     | 82.1 | 99.05 | 98.34 |
| s5378   | 99.3      | 2.0  | 267  | 0        | 2         | 31     | 266  | 98.15 | 98.46 |
| s9234   | 91.1      | 14   | 1156 | 1.2      | 123       | 3589   | 987  | 89.68 | 90.47 |
| s13207  | 98.48     | 57   | 1867 | 0.4      | 14        | 213    | 1637 | 98.02 | 98.21 |
| s15850  | 95.29     | 93   | 2306 | 0.6      | 154       | 3423   | 1894 | 93.73 | 94.04 |
| s38417  | 98.79     | 668  | 2440 | 3.2      | 154       | 6174   | 3938 | 97.03 | 96.85 |
| s38584  | 96.45     | 410  | 3737 | 0.2      | 43        | 1396   | 3413 | 95.98 | 95.57 |
| s1512   | 96.58     | 0.2  | 89   | 0.03     | 23        | 673    | 88.7 | 93.83 | 95.78 |
| s3271   | 99.98     | 1.2  | 194  | 0        | 1         | 18     | 183  | 99.95 | 99.78 |
| s3330   | 97.33     | 1.1  | 236  | 0.63     | 54        | 2918   | 282  | 96.05 | 96.34 |
| s3384   | 97.87     | 1.4  | 249  | 0.13     | 43        | 1923   | 243  | 97.82 | 97.65 |
| s4863   | 100       | 1.2  | 275  | 0        | -         | -      | 275  | 99.98 | 98.73 |

obtained after 500k clock cycles for the pseudorandom testing phase. Compared with the sts test scheme, the size of the PS for the proposed method should be much smaller, whereas the test response compactor and the MISR are almost the same. Therefore, the area overhead for the proposed method is much less. The results are presented in Table IV.

Additional experimental results are presented in Table II. The parameter CPU1 denotes the CPU time (in seconds) for fault simulation by the weighted scan-enable test scheme and the conventional test-per-scan test scheme with a single test session. The column RSE refers to the test scheme where the scan-enable signals of the scan chains are assigned with values randomly. The proposed method and the sts test scheme need a similar fault simulation time. The parameters CPU2 and CPU3 denote the CPU time (in seconds) needed to select a primitive polynomial that encodes all deterministic test vectors for the hard faults and the CPU time (in seconds) needed to select weights for the scan chains, respectively. The results are reported for 500k clock cycles, as all in the other methods presented in Table II. Results are presented for some ISCAS-89 circuits, namely, s1423, s5378, s9234, s13207, s15850, s38417, and s38584, and some ISCAS-93 circuits, namely, s1512, s3271, s3330, and s4863. As for the RSE test scheme [36] that we implemented, the PS size should be much smaller (just like the sts test scheme to the proposed method); the test response compactor and the MISR are almost the same. No PS is used in the work of Jas et al. [16]; however, the area overhead for the test response compactor and the MISR should be close. Therefore, the area overhead for the method in [16] should be smaller, which is mainly because no PS was used.

Experimental results show that the scan-forest architecture and the PI merging scheme can reduce the maximum number of care bits for most benchmark circuits. The only exceptions are s38417 and s38584. The proposed new testability measure can further reduce the maximum number of care bits for the test vectors to detect the hard faults for circuits s5378, s15850, and s38417. It is particularly striking that the new testability measure can reduce the maximum number of care bits of the deterministic test vectors from 86 to 44 for circuit s38417. The proposed pseudorandom testing scheme with weighted scan-enable signals can increase the fault coverage of the first phase significantly for nearly all circuits. Additional pseudorandom test patterns do not increase the fault coverage for the benchmark circuits of the sts test scheme. The increase in fault coverage with weighted scan-enable signals is particularly noteworthy for circuits s9234, s3330, s15850, and s38417. Compared with the baseline pseudorandom test scheme based on the multiple scan-chain architecture [53], the new pseudorandom test scheme provides higher fault coverage for almost all circuits. Therefore, the proposed scheme can reduce the deterministic test data volume drastically.

Table II presents results of the weighted random pattern test generation scheme presented in [16], which presented a technique for compression of the data for the weight sets on the PIs and scan-in pins. The weighted random test pattern generation method is similar to the one presented by Pomeranz and Reddy [34]; however, the method in [34] does not handle weighted random pattern generation for scan-based circuits. The results in column [16] present the fault simulation results for 500k clock cycles based on the PROOFS fault simulator [32] (i.e., 50 000 random patterns). The scan-chain length is set to the same one presented in [16] for circuits s13207, s15850, s38417, and s38584. The scan-chain length is set to ten for all the remaining circuits. The first 20000 random patterns are generated in the same way as that of the conventional test-per-scan test scheme sts. The remaining 30000 patterns are partitioned into three separate test sessions, where every 10000 weighted random test patterns are generated based on the corresponding weight set.

Just like the test schemes with multiple capture cycles in [15] and [48], the weighted random pattern generation method in [16] receives test responses only at the capture cycles, which is a multiple test session test scheme with a single capture cycle for each test cycle. It is shown in Table II that results of the weighted random test pattern generation scheme in [16] are quite close to those of the RSE scheme, which are still apparently worse than the proposed weighted scan-enable-signal-based test scheme for all circuits. As mentioned earlier, the weighted scan-enable-based test scheme allows one to capture test responses in every clock cycle.

It is particularly noteworthy that for all benchmark circuits, our method can encode all deterministic test vectors of the hard faults successfully, with an LFSR of size equal to the maximum number of care bits for these test vectors. This can be attributed to two reasons: 1) The pseudorandom test generator is very efficient, and most hard faults that need more care bits are covered; and 2) selection of an appropriate primitive polynomial. The selection of an appropriate primitive polynomial can be completed very fast for all circuits.

TABLE III Comparison of the Proposed Method With Previously Published Methods

| Π       | SO [3] |      | FAST [28] |       | partial [19] |       | koe. [18] |       | Prop | oosed |
|---------|--------|------|-----------|-------|--------------|-------|-----------|-------|------|-------|
| Circuit | size   | ROM  | size      | ROM   | size         | ROM   | size      | ROM   | size | ROM   |
| s1423   | 50     | 450  | -         | -     | -            | -     | -         | -     | 22   | 37    |
| s5378   | 61     | 427  | 29        | 493   | 38           | 502   | 38        | 1140  | 16   | 31    |
| s9234   | 80     | 4560 | 33        | 4674  | 81           | 5013  | 81        | 11178 | 36   | 3589  |
| s13207  | 45     | 540  | 53        | 2824  | 44           | 3008  | 44        | 6908  | 18   | 213   |
| s15850  | 150    | 5250 | 49        | 5092  | 58           | 5204  | 58        | 9686  | 34   | 3423  |
| s38417  | -      | -    | 81        | 23984 | 105          | 24513 | 105       | 35700 | 44   | 6174  |
| s38584  | 200    | 3200 | 77        | 2848  | 75           | 2942  | 75        | 4650  | 54   | 1396  |

We compare our method with an early, but very influential, deterministic BIST method, i.e., [22]. We also carry out comparisons with more recent methods, e.g., [3], [12], [24], and [33]. The comparison is based on the test data volume needed to reach a complete coverage after pseudorandom testing. Table III presents the sizes of the LFSRs (size) needed and test data volumes for all these methods. It is shown that the proposed method leads to the lowest test data volume for all circuits using LFSRs of the smallest size.

The hardware overhead for a deterministic BIST scheme includes two parts: 1) the hardware overhead of the pseudorandom test generator, the PS, the test response, the compactor, and the MISR; and 2) the hardware overhead to store seeds of the deterministic vectors for the random-resistant faults. The FAST method [33] does not use any PS; therefore, area of the PS can be saved compared with the proposed method. The size of the PS used by the partial dynamic LFSR reseeding method [24] is expected to be similar to that used by the sts test scheme if the scan-chain length is set to the same value. The early deterministic BIST method [22] used a single scan chain, which needs the least area overhead. The PS and the MISR used the seed ordering method SO in [3] are similar to those of the sts test scheme. The size of the LFSR for the proposed test scheme can be reduced greatly compared with all previous methods; therefore, the area overhead for the LFSR is expected to be smaller. In most cases, the LFSR and the MISR contribute the most to the area overhead of a BIST scheme. Note that it is hard to present accurate area overhead comparisons with previously deterministic BIST methods because many implementation details of earlier work are not available.

In Table IV, we compare the hardware overhead for the proposed test scheme with the conventional sts scheme. The routing overhead comparison between the sts and the test schemes is also presented. It is shown that the area overhead of the new test scheme is much less than that of the sts scheme, and the routing overhead of the proposed test scheme is close to that of the sts scheme. Experiments were completed to demonstrate the effectiveness of the heuristic proposed for reconfigurable scan-forest construction with routing constraint consideration. Let wl(rsf) denote the total interconnect length for the reconfigurable scan forest. Let wl(org.) denote the interconnect length for a single scan-chain architecture. Next, we use wl'(rsf) to represent the interconnect length for the reconfigurable scan forest designed under routing constraints. Finally, let wl(sts)denote the total interconnect length for a multiple scan-chain architecture. In the following equations, WO represents the interconnect overhead of the reconfigurable scan forest relative to a single scan-chain architecture. The parameter WO(imp.)

TABLE IV Percentage Area Overhead of the Reconfigurable Scan Forest and Other Techniques Under Routing Constraints

| Circuit | AO(sts) | AO(rsf) | AO'(sts) | AO'(rsf) | WO    | WO(imp.) | WO(sts) | RO    | AO'(PS) |
|---------|---------|---------|----------|----------|-------|----------|---------|-------|---------|
| s9234   | 14.85   | 10.27   | 13.11    | 9.87     | 17.71 | 3.32     | 1.60    | 12.22 | 1.22    |
| s13207  | 24.09   | 17.58   | 24.09    | 17.58    | 28.04 | 11.89    | 3.36    | 12.61 | 1.31    |
| s15850  | 17.55   | 12.37   | 15.94    | 10.76    | 32.59 | 11.50    | 5.87    | 15.90 | 0.75    |
| s35932  | 25.90   | 17.40   | 25.90    | 17.40    | 22.26 | -0.89    | 1.71    | 18.93 | 0.47    |
| s38417  | 15.28   | 8.50    | 14.09    | 7.76     | 30.40 | 1.59     | 4.39    | 22.10 | 0.36    |
| s38584  | 19.04   | 12.96   | 18.33    | 12.72    | 33.94 | -3.77    | -3.70   | 28.15 | 0.26    |

#### TABLE V

COMPARISON OF THE PROPOSED TEST SCHEME WITH THE STS TEST SCHEME FOR *n*-DETECTION OF SINGLE STUCK-AT FAULTS

|         |     | n     | = 1    | n     | n = 2  |       | <i>n</i> = 4 |       | <i>n</i> = 5 |       | = 10    |
|---------|-----|-------|--------|-------|--------|-------|--------------|-------|--------------|-------|---------|
| Circuit |     | FC    | ROM    | FC    | ROM    | FC    | ROM          | FC    | ROM          | FC    | ROM     |
|         |     |       | (bits) |       | (bits) |       | (bits)       |       | (bits)       |       | (bits)  |
| s5378   | wse | 99.2  | 93     | 99.2  | 244    | 98.9  | 1487         | 98.7  | 2479         | 98.5  | 7770    |
|         | sts | 98.62 | 419    | 98.58 | 2194   | 98.03 | 6588         | 97.77 | 8705         | 97.7  | 25051   |
| s9234   | wse | 90.69 | 3595   | 88.92 | 12651  | 87.46 | 30071        | 86.94 | 34890        | 85.27 | 139380  |
|         | sts | 88.17 | 11266  | 86.07 | 36678  | 83.87 | 83694        | 82.93 | 106798       | 79.53 | 352882  |
| s13207  | wse | 98.5  | 417    | 98.2  | 1775   | 97.8  | 5585         | 97.7  | 7854         | 96.8  | 33000   |
|         | sts | 97.96 | 674    | 97.43 | 4427   | 96.39 | 15274        | 95.96 | 21972        | 93.04 | 145383  |
| s15850  | wse | 95.1  | 3648   | 94.3  | 12447  | 93.6  | 28186        | 93.4  | 35826        | 92.9  | 93415   |
|         | sts | 93.72 | 7260   | 92.71 | 27277  | 91.99 | 60524        | 91.75 | 76574        | 90.72 | 215409  |
| s38417  | wse | 98.7  | 6570   | 98.1  | 24458  | 97.3  | 77602        | 97.0  | 112518       | 95.8  | 314134  |
|         | sts | 95.49 | 79328  | 93.71 | 207712 | 91.73 | 531202       | 91.18 | 714283       | 89.71 | 1432673 |
| s38584  | wse | 96.3  | 906    | 96.2  | 2363   | 96.0  | 10034        | 95.9  | 14131        | 95.3  | 108030  |
|         | sts | 95.94 | 2718   | 95.71 | 6463   | 95.12 | 30178        | 94.74 | 42816        | 93.11 | 395180  |
| s3271   | wse | 99.9  | 19     | 99.9  | 80     | 99.9  | 220          | 99.8  | 280          | 99.6  | 1180    |
|         | sts | 99.65 | 133    | 99.5  | 388    | 99.24 | 1408         | 99.18 | 1908         | 98.94 | 6140    |
| s3330   | wse | 97.1  | 2858   | 95.9  | 7316   | 94.8  | 18913        | 94.3  | 25065        | 92.6  | 64980   |
|         | sts | 91.02 | 9553   | 90.09 | 20760  | 88.66 | 44672        | 88.25 | 56771        | 86.56 | 135250  |
| s3384   | wse | 97.7  | 2033   | 97.5  | 4470   | 97.3  | 9566         | 97.2  | 11928        | 97.1  | 24960   |
|         | sts | 97.37 | 3047   | 97.21 | 6831   | 97.08 | 13669        | 97.03 | 17401        | 96.9  | 38299   |
| s4863   | wse | 99.9  | 36     | 99.7  | 152    | 99.3  | 532          | 98.7  | 760          | 97.9  | 2940    |
|         | sts | 97.01 | 508    | 96.9  | 1493   | 96.75 | 2912         | 96.72 | 3665         | 96.64 | 7168    |

is the interconnect overhead of the reconfigurable scan forest under routing constraints relative to a single scan-chain architecture. The parameters WO(sts) and RO are defined in a similar manner as follows:

$$\begin{split} \text{WO} &= \frac{wl(\text{rsf}) - wl(\text{org.})}{wl(\text{org.})} \times 100\% \\ \text{WO}(imp.) &= \frac{wl'(\text{rsf}) - wl(\text{org.})}{wl(\text{org.})} \times 100\% \\ \text{WO}(sts) &= \frac{wl(\text{sts}) - wl(\text{org.})}{wl(\text{sts})} \times 100\% \\ \text{RO} &= \frac{wl(\text{rsf}) - wl'(\text{rsf})}{wl(\text{org.})} \times 100\%. \end{split}$$

Table IV presents results for the routing-constrained reconfigurable scan-forest design technique. The results show that the incorporation of routing constraints reduces the interconnect overhead for almost all circuits. In fact, the interconnect overhead is now comparable to that for a single scan-chain architecture. For two circuits, namely, s35932 and s38584, the interconnect overhead is even less than that for the single scanchain architecture.

In comparing the area overhead of the proposed method with the conventional sts test scheme, we note that two XOR networks are used to construct the LFSR in the former case, as a result of which a small number of additional XOR gates are necessary. The size of the LFSR needed here is less than that for sts because of the polynomial selection scheme proposed in this paper. In the reconfigurable scan-forest architecture, a single stage of the PS drives a scan tree instead of a scan chain (as is the case for sts). For example, for circuit s38417, a scanin signal of scan tree drives 20 scan chains simultaneously. This can reduce the size of the PS greatly and, therefore, reduces the area overhead of the proposed test scheme. The compactors and the MISRs for both schemes are similar.

Table IV shows the area overheads for the proposed method and for the conventional test-per-scan test scheme sts. It is shown that the area overhead for the proposed test scheme is much less than that of the sts scheme for all circuits. This reduction can be attributed to the smaller PS. The scan-chain length is set to ten for all experiments reported in Table IV. The area of a circuit is estimated based on the class.lib library of the Synopsys tool. The area overheads for the reconfigurable scan-forest-based test scheme are calculated by using (21). The area overhead of the sts test scheme can be estimated similarly

$$AO(rsf) = \frac{area \text{ of } rsf \text{ cir.} - area \text{ of } orig. \text{ cir.}}{area \text{ of } orig. \text{ cir.}} \times 100\%.$$
 (21)

In Table IV, AO'(sts), AO'(rsf), and AO'(PS) represent the area overheads of the sts, the proposed test scheme, and the PS for the proposed test scheme after compacting the POs with the technique presented in Section II.

#### B. Results for n-Detection

Although test vectors for full single stuck-at fault coverage may not detect all defects in a chip, experimental results



Fig. 9. Comparison of test quality of the proposed scheme with sts in terms of n-detection of single stuck-at faults.

show that high defect coverage can be achieved by applying n-detection test sets [35]. In our experiment, 500k clock cycles are used for n-detection based on the proposed weighted scanenable test scheme. That is, a fault is not dropped by a fault simulator until it is detected n times. We use n-detection ATPG to generate deterministic vectors that detect all hard faults at least n times. The HOPE fault simulator [27] is modified for n-detection fault simulation, whereas the ATALANTA test generator [26] is modified for n-detection ATPG.

Table V presents *n*-detection capability of the proposed scheme with that of the baseline sts test scheme. We report the fault coverage and the ROM storage needed (in bits). For the weighted scan-enable scheme, the proposed testability measure presented in Section IV is used to generate deterministic test vectors with fewer care bits. The seed encoding techniques presented in Section V are adopted to encode all deterministic vectors. For the baseline sts test scheme, the proposed new testability measure is not used to guide ATPG. An LFSR of size  $S_{\text{max}} + 20$  [22], [33], [40] is used to compress the deterministic vectors further into a set of seeds. The number of shift clock cycles between two loaded seeds for the scan-enable scheme is the same as that of the sts test scheme. As shown in Table V, the proposed scheme outperforms sts in all cases. The fault coverage difference between two methods and the difference in the required storage become more apparent when n increases. We use 500k clock cycles for all experimental results reported in Table V.

Experimental results show that the weighted scan-enable scheme achieves much higher fault coverage than the sts for all circuits for n = 1 to n = 5. Fig. 9 shows the comparison for circuits s38417, s15850, s13207, and s3330 from n = 1 to n = 5 and n = 10. It is shown that the weighted scan-enable signals lead to a much better fault coverage in all cases for the four circuits, particularly for s38417 and s3330. The proposed scheme needs much less ROM to store the deterministic seeds on-chip. The advantage of the proposed scheme is more significant for larger n.

#### VII. CONCLUSION

We have presented a new deterministic BIST method, which uses a reconfigurable scan forest with separate weighted scan-enable signals for all scan chains. The scan forest contains multiple scan trees, where the scan-in signal of each scan tree drives a number of scan chains without any aliasing. Test application is carried out in two phases: a pseudorandom phase with weighted scan-enable signals and a deterministic phase. The scan-forest architecture is configured as a single scan tree in the deterministic phase. Different LFSRs are used for both testing phases using a very simple control logic. A new testability measure has been proposed to generate deterministic test vectors with fewer care bits for the hard faults. We have presented experimental results and comparison with several previous methods. We have shown that the size of the LFSR used to encode all deterministic vectors is equal to the maximum number of care bits in the deterministic vectors. The proposed deterministic BIST scheme is also evaluated for the *n*-detection of single stuck-at faults. Experimental results show

that the proposed method performs better for the n-detection of single stuck-at faults than the conventional scan-based BIST scheme for n-detection testing. As part of ongoing work, we are investigating the use of multiple seeds per scan vector [45] in order to reduce the size of the LFSR.

#### REFERENCES

- [1] M. Abramovici, A. Breuer, and A. D. Friedman, *Digital Systems Testing* and Testable Design. Piscataway, NJ: IEEE Press, 1995.
- [2] V. D. Agrawal, C. R. Kime, and K. K. Saluja, "A tutorial on built-in self-test—Part 1: Principles," *IEEE Des. Test Comput.*, vol. 10, no. 1, pp. 73–82, Mar. 1993.
- [3] A. A. Al-Yamani, S. Mitra, and E. J. McCluskey, "Optimized reseeding by seed ordering and encoding," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 24, no. 2, pp. 264–270, Feb. 2005.
- [4] A. A. Al-Yamani, N. Nevta-Prasanna, E. Chmelar, M. Grinchuk, and A. Gunda, "Scan test cost and power reduction through systematic scan reconfiguration," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 26, no. 5, pp. 907–918, May 2007.
- [5] Astro: Advanced Place-and-Route Solution for SoC Design, Synopsys. [Online]. Available: http://www.synopsys.com/products /astro.html
- [6] S. Banerjee, D. R. Chowdhury, and B. B. Bhattatcharya, "An efficient scan tree design for compact test pattern set," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 26, no. 7, pp. 1331–1339, Jul. 2007.
- [7] P. H. Bardell, W. H. McAnney, and J. Savir, Built-in Test for VLSI: Pseudo-Random Techniques. New York: Wiley, 1987.
- [8] F. Brglez, D. Bryan, and K. Kozminski, "Combinational profiles of sequential benchmark circuits," in *Proc. IEEE Int. Symp. Circuits and Syst.*, 1989, pp. 1929–1934.
- [9] M. Bushnell and V. D. Agrawal, *Essentials of Electronic Testing*. Norwell, MA: Kluwer, 2000.
- [10] M. Chatterjee and D. K. Pradhan, "A BIST pattern generator design for near-perfect fault coverage," *IEEE Trans. Comput.*, vol. 52, no. 12, pp. 1543–1557, Dec. 2003.
- [11] H. Fujiwara and T. Shimono, "On acceleration of test generation algorithms," *IEEE Trans. Comput.*, vol. C-32, no. 12, pp. 1137–1144, Dec. 1983.
- [12] I. Hamzaoglu and J. H. Patel, "Reducing test application time for full scan embedded cores," in *Proc. IEEE Int. Symp. Fault-Tolerant Comput.*, 1999, pp. 260–267.
- [13] S. Hellebrand, J. Rajski, S. Tarnick, S. Venkataraman, and B. Courtois, "Built-in test for circuits with scan based on reseeding of multiple polynomial linear feedback shift registers," *IEEE Trans. Comput.*, vol. 44, no. 2, pp. 223–233, Feb. 1995.
- [14] S. Hellebrand, B. Reed, S. Tarnick, and H. J. Wunderlich, "Pattern generation for a deterministic BIST scheme," in *Proc. IEEE/ACM Int. Conf. Comput.-Aided Design*, 1995, pp. 88–94.
- [15] Y. Huang, I. Pomeranz, S. M. Reddy, and J. Rajski, "Improving the property of at-speed tests," in *Proc. IEEE/ACM Int. Conf. Comput.-Aided Design*, 2000, pp. 459–463.
- [16] A. Jas, C. V. Krishna, and N. A. Touba, "Weighted pseudo-random hybrid BIST," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 12, no. 12, pp. 1277–1283, Dec. 2004.
- [17] D. Kagaris, "Multiple-seed TPG structures," *IEEE Trans. Comput.*, vol. 52, no. 12, pp. 1633–1639, Dec. 2003.
- [18] R. Kapur, S. Patil, T. J. Snethen, and T. W. Williams, "A weighted random pattern test generation system," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 15, no. 8, pp. 1020–1025, Aug. 1996.
- [19] X. Kavousianos, E. Kalligerous, and D. Nikolos, "Multilevel Huffman coding: An efficient test data compression method for IP cores," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 26, no. 6, pp. 1070–1083, Jun. 2007.
- [20] G. Kiefer and H. J. Wunderlich, "Deterministic BIST with multiple scan chains," in *Proc. IEEE Int. Test Conf.*, 1998, pp. 1057–1064.
- [21] H. S. Kim, Y. Kim, and S. Kang, "Test-decompression mechanism using a variable-length multiple-polynomial LFSR," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 11, no. 4, pp. 687–690, Apr. 2003.
- [22] B. Koenemann, "LFSR-coded test patterns for scan designs," in *Proc. Eur. Test Conf.*, 1991, pp. 237–242.
- [23] B. Koenemann, C. Barnhart, B. Keller, T. Snethen, O. Farnsworth, and D. Wheater, "A smart BIST variant with guaranteed encoding," in *Proc. IEEE Asian Test Symp.*, 2001, pp. 325–330.

- [24] C. V. Krishna, A. Jas, and N. Touba, "Achieving high encoding efficiency with partial dynamic LFSR reseeding," ACM Trans. Des. Autom. Electron. Syst., vol. 9, no. 4, pp. 500–516, Oct. 2004.
- [25] L. Lai, J. H. Patel, T. Rinderknecht, and W. T. Cheng, "Logic BIST with scan chain segmentation," in *Proc. IEEE Int. Test Conf.*, 2004, pp. 57–66.
- [26] H. K. Lee and D. S. Ha, "On the Generation of Test Patterns for Combinational Circuits," Dept. Elect. Eng., Virginia Polytechnic Inst. State Univ., Blacksburg, VA, Tech. Rep. 12-93.
- [27] H. K. Lee and D. S. Ha, "HOPE: An efficient parallel fault simulator for synchronous sequential circuits," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 15, no. 9, pp. 1048–1058, Sep. 1996.
- [28] K. J. Lee, J. J. Chen, and C. H. Huang, "Using a single input to support multiple scan chains," in *Proc. IEEE/ACM Int. Conf. Comput.-Aided De*sign, 1998, pp. 74–78.
- [29] L. Li and K. Chakrabarty, "Test set embedding for deterministic BIST using a reconfigurable interconnection network," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 23, no. 9, pp. 1289–1305, Sep. 2004.
- [30] A. Majumdar, "On evaluating and optimizing weights for weighted random pattern testing," *IEEE Trans. Comput.*, vol. 45, no. 8, pp. 906–916, Aug. 1996.
- [31] F. Muradali, V. K. Agrawal, and B. Nadeau-Dostie, "A new procedure for weighted random built-in self-test," in *Proc. IEEE Int. Test Conf.*, 1990, pp. 660–668.
- [32] T. M. Niermann, W. T. Cheng, and J. H. Patel, "PROOFS: A fast, memory-efficient sequential circuit fault simulator," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 11, no. 2, pp. 198–207, Feb. 1992.
- [33] N. Oh, R. Kapur, and T. W. Williams, "Fast seed computation for reseeding shift register in test pattern compression," in *Proc. IEEE/ACM Int. Conf. Comput.-Aided Design*, 2002, pp. 76–81.
- [34] I. Pomeranz and S. M. Reddy, "3-weight pseudo-random test generation based on a deterministic test set for combinational and sequential circuits," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 13, no. 7, pp. 1050–1058, Jul. 1993.
- [35] I. Pomeranz and S. M. Reddy, "Improved *n*-detection test sequences under transparent scan," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 25, no. 11, pp. 2492–2501, Nov. 2006.
- [36] I. Pomeranz and S. M. Reddy, "Transparent scan: A new approach to test generation and test compaction for scan circuits that incorporates limited scan operation," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 22, no. 12, pp. 1663–1670, Dec. 2003.
- [37] Primitive Polynomial Generation Tool. [Online]. Available: http:// fchabaud.free.fr/English/default.phpq?COUNT=4&FILE0=&FILE1= Poly&FILE2=GF(2)&FILE3=Tables
- [38] J. Rajski, N. Tamarapalli, and J. Tyszer, "Automated synthesis of large phase shifters for built-in self-test," in *Proc. IEEE Int. Test Conf.*, 1998, pp. 1047–1056.
- [39] J. Rajski, J. Tyszer, and N. Zacharia, "Test data decompression for multiple scan designs with boundary scan," *IEEE Trans. Comput.*, vol. 47, no. 11, pp. 1188–1200, Nov. 1998.
- [40] J. Rajski, J. Tyszer, M. Kassab, and N. Mukherjee, "Embedded deterministic test," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 23, no. 5, pp. 776–792, May 2004.
- [41] J. Rajski, J. Tyszer, G. Mrugalski, W. T. Cheng, N. Mukherjee, and M. Kassab, "X-Press: Two-stage X-tolerant compactor with programmable selector," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 27, no. 1, pp. 147–159, Jan. 2008.
- [42] J. Savir, "Reducing the MISR size," *IEEE Trans. Comput.*, vol. 45, no. 8, pp. 930–938, Aug. 1996.
- [43] J. Savir, "Distributed generation of weighted random patterns," *IEEE Trans. Comput.*, vol. 48, no. 12, pp. 1364–1368, Dec. 1999.
- [44] M. A. Shah and J. H. Patel, "Enhancement of the Illinois scan architecture for use with multiple scan inputs," in *Proc. Int. Symp. VLSI*, 2004, pp. 167–172.
- [45] N. A. Touba, "Survey of test vector compression techniques," *IEEE Des. Test Comput.*, vol. 23, no. 4, pp. 294–303, Apr. 2006.
- [46] S. Tragoudas and V. Nagarandal, "On-chip embedding mechanisms for large sets of vectors for delay test," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 24, no. 3, pp. 488–497, Mar. 2005.
- [47] K. H. Tsai, J. Rajski, and M. Marek-Sadowska, "Star test: The theory and its applications," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 19, no. 9, pp. 1052–1064, Sep. 2000.
- [48] H. C. Tsai, K. T. Cheng, and S. Bhawmik, "On improving test quality of scan-based BIST," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 19, no. 8, pp. 928–938, Aug. 2000.

- [49] J. A. Waicukauski, E. Lindbloom, E. B. Eichelberger, and O. P. Forenza, "A method for generating weighted random test pattern," *IBM J. Res. Develop.*, vol. 33, no. 2, pp. 149–161, Mar. 1989.
- [50] S. Wang, "A BIST TPG for low power dissipation and high fault coverage," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 15, no. 7, pp. 777–789, Jul. 2007.
- [51] D. Xiang, M. J. Chen, J. G. Sun, and H. Fujiwara, "Improving test effectiveness of scan-based BIST using scan chain partitioning," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 24, no. 6, pp. 916–927, Jun. 2005.
- [52] D. Xiang, S. Gu, J. Sun, and Y. Wu, "A cost-effective scan architecture for scan testing with non-scan test power and test application cost," in *Proc. ACM/IEEE Des. Autom. Conf.*, 2003, pp. 744–747.
- [53] D. Xiang, M. J. Chen, and H. Fujiwara, "Using weighted scan enable signals to improve test effectiveness of scan-based BIST," *IEEE Trans. Comput.*, vol. 56, no. 12, pp. 1619–1628, Dec. 2007.
- [54] D. Xiang, K. Li, J. Sun, and H. Fujiwara, "Reconfigured scan forest for test application cost, test data volume, and test power reduction," *IEEE Trans. Comput.*, vol. 56, no. 4, pp. 557–562, Apr. 2007.



**Dong Xiang** (SM'04) received the B.S. and M.S. degrees in computer science from Chongqing University, Chongqing, China, in 1987 and 1990, respectively, and the Ph.D. degree in computer engineering from the Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, in 1993.

He visited Concordia University, Montreal, QC, Canada, as a Postdoctoral Researcher, from 1994 to 1995, and the University of Illinois, Urbana– Champaign, Urbana, from 1995 to 1996. He also

visited Nara Institute of Science and Technology, Nara, Japan, as a Japan Society for the Promotion of Science Invitation Fellow from April to September 2003. He was an Associate Professor with the Institute of Microelectronics, Beijing, from October 1996 to March 2003. He is currently a Professor with the School of Software, Tsinghua University, Beijing. His research interests include design and test of digital systems, including design for testability, testing, test pattern generation, testability analysis, and built-in self-test, fault-tolerant computing, distributed computing, and computer networking. He authored *Digital Systems Testing and Design for Testability* (Science Press, 1997).

Dr. Xiang was the recipient of the National Outstanding Young Scientist Award from the National Science Foundation of China in 2004.



Yang Zhao received the B.E. degree in electronic engineering and the M.S. degree in computer science from Tsinghua University, Beijing, China, in 2004 and 2007, respectively. He is currently working toward the Ph.D. degree in electrical and computer engineering in the Department of Electrical and Computer Engineering, Duke University, Durham, NC.

His research interests include the design and testing of mixed-technology microsystems, electronic design automation, mixed-signal very large scale in-

tegration design, and microelectromechanical-system modeling and simulation.



Krishnendu Chakrabarty (S'92–M'96–SM'00– F'08) received the B.Tech. degree in computer science and engineering from the Indian Institute of Technology, Kharagpur, India, in 1990, and the M.S.E. and Ph.D. degrees in computer science and engineering from the University of Michigan, Ann Arbor, in 1992 and 1995, respectively.

He is currently a Professor with the Department of Electrical and Computer Engineering, Duke University, Durham, NC. His current research projects include testing, and design-for-testability of system-

on-chip integrated circuits; digital microfluidic biochips; logic circuits based on DNA self-assembly; and delay-tolerant wireless networks. He has authored five books, namely, *Microelectrofluidic Systems: Modeling and Simulation* (CRC, 2002), *Test Resource Partitioning for System-on-a-Chip* (Kluwer, 2002), *Scalable Infrastructure for Distributed Sensor Networks* (Springer, 2005), Digital Microfluidics Biochips: Synthesis, Testing, and Reconfigutaion Techniques (CRC, 2006), and Adaptive Cooling of Integrated Circuits using Digital Microfluidics (Artech House, 2007), and he has edited the book volumes SOC (System-on-a-Chip) Testing for Plug and Play Test Automation (Kluwer, 2002) and Design Automation Methods and Tools for Microfluidics-Based Biochips (Springer, 2006). He has contributed 15 invited chapters to book volumes, published 270 papers in archival journals and refereed conference proceedings, and delivered over 100 keynote, plenary, and invited talks. He is the holder of a U.S. patent in built-in self-test and is a Coinventor of a pending U.S. patent on sensor networks.

Prof. Chakrabarty is a Senior Member of the Association for Computing Machinery (ACM) and a member of Sigma Xi. He served as a Distinguished Visitor of the IEEE Computer Society from 2005 to 2007 and as a Distinguished Lecturer of the IEEE Circuits and Systems Society from 2006 to 2007. Beginning 2008, he serves as an ACM Distinguished Speaker. He is an Associate Editor of the IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, the IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS, and the ACM Journal on Emerging Technologies in Computing Systems. He is also an Editor of the IEEE DESIGN AND TEST OF COMPUTERS and the Journal of Electronic Testing: Theory and Applications. He recently completed his term as an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEM I (2006-2007). In the recent past, he has also served as an Associate Editor for the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEM II. He served as the Program Chair for the IEEE Asian Test Symposium in 2005 and the CAD, Design, and Test Conference for the 2007 IEEE Symposium on Design, Integration, Test, and Packaging of MEOMS, and he is the designated Program Vice Chair for the 2008 International Mixed-Signals, Sensors, and Systems Test Workshop. He was the recipient of the National Science Foundation Early Faculty (CAREER) Award and the Office of Naval Research Young Investigator Award. He was the recipient of Best Paper Awards at the 2007 IEEE International Conference on VLSI Design, the 2005 IEEE International Conference on Computer Design, and the 2001 IEEE Design, Automation and Test in Europe Conference. He received the IEEE Meritorious Service Award in 2008. He was also the recipient of the Humboldt Research Fellowship Award from the Alexander von Humboldt Foundation, Germany, and the Mercator Visiting Professorship Award from the Deutsche Forschungsgemeinschaft, Germany.



Hideo Fujiwara (S'70–M'74–SM'83–F'89) received the B.E., M.E., and Ph.D. degrees in electronic engineering from Osaka University, Osaka, Japan, in 1969, 1971, and 1974, respectively.

He was with Osaka University from 1974 to 1985 and with Meiji University, Tokyo, Japan, from 1985 to 1993. In 1981, he was a Visiting Assistant Professor at the University of Waterloo, Waterloo, ON, Canada, and in 1984, he was a Visiting Associate Professor at McGill University, Montreal, QC, Canada. Since 1993, he has been with the Nara

Institute of Science and Technology, Nara, Japan, where he is currently a Professor with the Graduate School of Information Science. His research interests include logic design, digital system design and test, very large scale integration (VLSI) computer-aided design, and fault-tolerant computing, including high-level/logic synthesis for testability, test synthesis, design for testability, built-in self-test, test pattern generation, parallel processing, and computational complexity. He has published over 300 papers in refereed journals and conferences and nine books, including the book *Logic Testing and Design for Testability* (MIT Press, 1985).

Dr. Fujiwara is a Golden Core member of the IEEE Computer Society. a Fellow of the Institute of Electronics, Information and Communication Engineers (IEICE) of Japan, and a fellow of the Information Processing Society of Japan. He served as an Editor of the IEEE TRANSACTIONS ON COMPUTERS from 1998 to 2002, the Journal of Electronic Testing: Theory and Application from 1989 to 2004, the Journal of Circuits, Systems and Computers from 1989 to 2004, and the VLSI Design: An Application Journal of Custom-Chip Design, Simulation, and Testing from 1992 to 2005. He also served as a Guest Editor for several special issues of the IEICE Transactions of Information and Systems. He is currently an advisory member of the IEICE Transactions on Information and Systems. He was the recipient of the IECE Young Engineer Award in 1977, the IEEE Computer Society Certificate of Appreciation Awards in 1991, 2000, and 2001, the Okawa Prize for Publication in 1994, the IEEE Computer Society Meritorious Service Awards in 1996 and 2005, the IEEE Computer Society Continuing Service Award in 2005, and the IEEE Computer Society Outstanding Contribution Award in 2001.