# Clock Tree Synthesis and Optimization of SoCs under Low Voltage

He Xin<sup>1,a</sup>, Huang Xu<sup>1,b</sup>, Yang Wu<sup>1,c</sup> and Li Yujing<sup>1,d</sup>

<sup>1</sup>Sichuan Institute of Solid State Circuits, Chongqing, P.R. China

<sup>a</sup>letter1988@163.com, <sup>b</sup>hxtt103@163.com, <sup>c</sup>yangwu@cetc.cn, <sup>d</sup>liyujing@cetc.com

**Keywords:** Clock Tree Synthesis, Low Voltage, PVT variation

**Abstract:** As integrated circuits have developed, power consumption has become a key design problem. Reducing the operating voltage is an effective way to reduce power consumption. However, a lower operating voltage brings more challenges to chip design, mainly because process, voltage and temperature (PVT) variation makes chip performance unstable. Addressing the problems of the clock tree network structure under low voltage, this paper proposes a clock tree design method that resists PVT variation under low voltage, improving the stability of the chip and ensuring the effectiveness of the low-voltage design.

#### 1. INTRODUCTION

As digital integrated circuits are applied more widely and their performance improves, chip power consumption has become the key problem in their development. For battery-powered electronic equipment in particular, battery life depends directly on power consumption. Because battery technology develops far more slowly than chip manufacturing technology, reducing device size and extending operating time requires the energy consumption of the circuit to be very low. Low-energy design can therefore be achieved with low supply voltage techniques.
Reducing the operating voltage of the chip brings more challenges to chip design, mainly because process, voltage and temperature (PVT) variation makes chip performance unstable. This paper therefore identifies the problems of the clock tree structure under low voltage and gives a clock tree optimization method that resists PVT deviation, so that the threshold-voltage change caused by PVT deviation under low voltage does not greatly change the clock skew. This improves the stability of the chip and ensures the effectiveness of the low-voltage design.

#### 2. CLOCK TREE SYNTHESIS

#### 2.1. CLOCK NETWORK

The clock signal is the reference for data transmission in digital integrated circuit design and plays a decisive role in the function, performance and stability of a synchronous digital system [1], so the characteristics of the clock signal and its distribution network are a very important part of chip design. The clock signal generally has the largest fan-out, travels the longest distance and runs at the highest speed in the entire chip. In standard-cell-based SoC design, the clock network is usually designed in the physical design stage. The common method is clock tree synthesis (CTS), as shown in Figure 1, which makes it one of the key steps in SoC physical design.

Clock latency, also known as insertion delay, includes the clock source insertion delay and the clock network insertion delay. In clock tree synthesis, the clock latency value is used directly to calculate and fix the skew. The essential cause of clock network delay is the large load capacitance of the clock network. If the clock source were connected directly to every clock endpoint, the clock signal transition time would be too long to meet the system's operating frequency requirement.
Figure 1 Clock tree synthesis

Clock skew is the difference between the delays of the clock signal arriving at different sequential cells in the same clock domain. Skew is an important parameter for measuring clock tree performance, and the purpose of clock network design is to reduce it. In the physical design of the chip, sequential cells sit at different physical locations, so the clock wire lengths differ; that is, the time from the clock root node to the various leaf nodes differs. In large-scale integrated circuit design, tens of thousands of leaf nodes distributed over a large area can produce a very large skew, resulting in a loss of chip performance.

Clock signal transition time, also known as clock slew, is the time the clock signal takes to change from one voltage level to the other.

#### 2.2. CLOCK TREE UNDER LOW VOLTAGE

For a low-power system to achieve good functional stability under low voltage, the design of the clock network becomes even more important. As the most frequently switching network in the chip, the clock network's share of power consumption has risen to 40% [2]. Low-power clock trees are therefore widely used in low-power design.

As the voltage decreases, the circuit becomes more sensitive to process, voltage and temperature variation. During fabrication, because doping is randomly distributed, the number of dopant atoms differs from one transistor channel to another. Especially as processes advance to the deep-submicron level, each transistor channel contains very few impurity atoms, so a difference of hundreds or even dozens of atoms leads to a large deviation in threshold voltage.
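The growing sensitivity to threshold-voltage deviation at low supply voltage can be illustrated with a small numerical sketch. It assumes the alpha-power-law delay model (delay proportional to VDD / (VDD - Vth)^alpha), which this paper does not use explicitly; the nominal threshold (0.30 V), the deviation (30 mV) and the exponent (1.3) are hypothetical values chosen only for illustration, and the model is only a rough guide close to the threshold voltage.

```python
def gate_delay(vdd, vth, alpha=1.3, k=1.0):
    """Relative gate delay from the alpha-power law: k * vdd / (vdd - vth)**alpha.

    The model and its parameters are illustrative assumptions, not data
    from the paper.
    """
    if vdd <= vth:
        raise ValueError("model is only valid for vdd > vth")
    return k * vdd / (vdd - vth) ** alpha


def delay_spread(vdd, vth_nominal, dvth, alpha=1.3):
    """Delay difference between the slow corner (vth + dvth) and the fast
    corner (vth - dvth), normalized to the nominal delay."""
    slow = gate_delay(vdd, vth_nominal + dvth, alpha)
    fast = gate_delay(vdd, vth_nominal - dvth, alpha)
    nominal = gate_delay(vdd, vth_nominal, alpha)
    return (slow - fast) / nominal


# Hypothetical device: Vth = 0.30 V with a +/-30 mV deviation.
for vdd in (1.1, 0.8, 0.5):
    spread = delay_spread(vdd, vth_nominal=0.30, dvth=0.03)
    print(f"VDD = {vdd:.1f} V: relative delay spread = {spread:.1%}")
```

Under these assumptions the same threshold deviation produces a markedly larger relative delay spread at 0.5 V than at 1.1 V, which is the qualitative effect the clock tree has to tolerate.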
When the operating voltage is reduced, the threshold deviation leads to a significant increase in the equivalent resistance of the transistor, in the delay of the clock network and in the skew of the clock network. At the same time, the instability of the clock network can cause the chip to malfunction. Because device propagation delay increases, the share of interconnect delay in the circuit delay is reduced, and the slew and skew on the clock tree are determined mainly by the transition time and the device delay. Consequently, when a clock tree whose skew meets the requirement is operated under low voltage, its skew increases as the delay increases.

Figure 2 Synchronous circuit data path

Figure 2 shows the schematic diagram of the synchronous circuit data path. The shortest clock period $T_{min}$ and the maximum frequency $f_{max}$ of the circuit are:

$$T_{min} = T_{CLK\text{-}Q} + T_{Logic\text{-}max} + T_{setup} + skew$$ (1)

$$f_{max} = 1/T_{min}$$ (2)

$T_{Logic\text{-}max}$ is the maximum delay of the combinational logic. First, an increase in clock skew increases the minimum period, so the operating frequency decreases. Second, the slew at the register clock pin directly affects the register propagation delay ($T_{CLK\text{-}Q}$) and the setup time ($T_{setup}$), so an increase in slew also lowers the frequency. Slew also affects the hold time and the internal power consumption.

Under normal voltage, long interconnect lines contribute large parasitic resistance and capacitance, and the delay is usually optimized by inserting buffers. For a typical tree RC network, the Elmore model is used to calculate the network delay at node $i$:

$$\tau_{Di} = \sum_{k} R_{ik} C_k = R_1 C_1 + R_1 C_2 + (R_1 + R_3) C_3 + (R_1 + R_3) C_4 + (R_1 + R_3 + R_i) C_i$$ (3)

where $R_{ik}$ is the resistance shared by the paths from the source to node $i$ and from the source to node $k$. It can be seen that, under low voltage, the wire resistance can be neglected and most of the clock network latency is device switching delay, so inserting multiple levels of buffers does not optimize the delay.
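The Elmore calculation for a tree RC network can be sketched in a few lines. The topology below is an assumption: a commonly used five-node example in which node 1 is fed from the source through R1, nodes 2 and 3 branch from node 1 through R2 and R3, and nodes 4 and i branch from node 3 through R4 and Ri. All element values and the names `TREE`, `path_to_source` and `elmore_delay` are hypothetical.

```python
# Hypothetical RC tree: node -> (parent, resistance of the branch feeding the
# node in ohms, node capacitance in farads). "src" is the driving point.
TREE = {
    "n1": ("src", 100.0, 10e-15),
    "n2": ("n1", 200.0, 20e-15),
    "n3": ("n1", 150.0, 10e-15),
    "n4": ("n3", 300.0, 30e-15),
    "ni": ("n3", 250.0, 25e-15),
}


def path_to_source(tree, node):
    """Return the nodes on the path from `node` up to (excluding) the source."""
    path = []
    while node != "src":
        path.append(node)
        node = tree[node][0]
    return path


def elmore_delay(tree, sink):
    """Elmore delay at `sink`: sum over every node k of R_ik * C_k, where R_ik
    is the resistance shared by the source-to-sink and source-to-k paths."""
    sink_path = set(path_to_source(tree, sink))
    delay = 0.0
    for k, (_parent, _res, cap) in tree.items():
        shared = sink_path.intersection(path_to_source(tree, k))
        shared_resistance = sum(tree[n][1] for n in shared)
        delay += shared_resistance * cap
    return delay


print(f"Elmore delay at node i: {elmore_delay(TREE, 'ni') * 1e12:.2f} ps")
```

Each branch resistance is stored with the node it feeds, so the shared path resistance is simply the sum over the nodes common to both paths. Setting the resistances close to zero in this sketch shows why, once wire resistance is negligible, the remaining delay has to be attributed to the driving devices rather than to the interconnect.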
Under low voltage, a buffer's role is only to optimize the slew at the sinks, transferring the slew to the buffer's own input.

#### 2.3. RESISTANCE TO PVT VARIATION OF CTS

In a PVT-variation-resistant clock structure under low voltage, the share of wire delay is reduced and the delay is mainly device propagation delay. Under normal voltage, the interconnect delay caused by interconnect resistance grows with wire length and accounts for a larger share than the propagation delay of the buffer cells. Thus, once the slew constraints are satisfied, changing the size of the buffer cells on the clock tree to optimize their propagation delay cannot significantly improve the clock delay. When the chip works under low voltage, however, device propagation delay accounts for more than 80% of the whole clock network delay, so increasing the device size reduces the device propagation delay linearly and greatly reduces the clock network delay. Therefore, in the clock tree structure under low voltage, the cells with the maximum drive strength are selected. Under the same slew constraints, such a cell can drive more load cells, which effectively reduces the number of buffer cells and the number of clock tree levels.

The propagation delay of each buffer cell is affected by PVT variation, and the delay of each level propagates to the next level. At a given operating point, considering the load capacitance and the pin capacitance, the skew of a clock tree driven by cells of the same size usually arises between the most heavily loaded and the most lightly loaded branches. Once process deviation is considered, the threshold-voltage deviation causes the clock skew to propagate to the next level.
If the branches are merged and driven by a single large-size driver instead, the effect of threshold-voltage fluctuation on clock skew is reduced by one level, and the influence of threshold voltage on the large driver appears as clock network delay rather than as skew.

In clock tree synthesis, the delay that can be optimized is the insertion delay of the clock network. In the PVT-variation-resistant clock tree structure under low voltage, the clock network insertion delay is caused mainly by buffer cells driving large capacitive loads. The optimization of clock delay is therefore already taken into account in pre-CTS placement optimization. Clustering optimization of the placement keeps the clock cells connected to the same clock net as close together as possible, so that the clock wires do not become too long and the interconnect load capacitance is reduced. To some extent, this optimizes the clock delay.

#### 3. DESIGN AND ANALYSIS

The experimental circuit was implemented with mainstream Synopsys EDA tools in a 40 nm technology at a low voltage of 0.5 V. IC Compiler was used to generate and route the clock tree structure, and StarRC and HSPICE were used to extract the clock network parasitic parameters and the clock network SPICE netlist and to build the HSPICE simulation environment. Under low voltage, the clock tree structure is built by bottom-up clock tree synthesis; in addition to its performance in the ideal state, its ability to resist PVT variation also has to be verified.
The performance of the PVT-deviation-resistant clock tree under low voltage was verified by comparing three structures: the clock tree generated automatically by the EDA tools, the clock tree using the LVT library, and the clock tree with resistance to PVT deviation.

| | EDA auto | with LVT cells | Resistance to PVT variation |
|-----------------|----------|----------------|-----------------------------|
| register number | 3785 | | |
| CGC number | 163 | | |
| clock level | 13 | 12 | 10 |
| inserted buffer | 272 | 195 | 123 |

Table 1 Comparison of clock tree structure

After simulating the clock network, the delay data were fitted in MATLAB to a Gaussian distribution, giving the mean and standard deviation (SD) of the distribution.

| | EDA auto (mean) | EDA auto (SD) | with LVT cells (mean) | with LVT cells (SD) | Resistance to PVT variation (mean) | Resistance to PVT variation (SD) |
|-----------|---------|-------|---------|-------|---------|-------|
| max delay | 3.356 ns | 721 ps | 3.125 ns | 689 ps | 2.832 ns | 657 ps |
| min delay | 2.981 ns | 630 ps | 2.802 ns | 601 ps | 2.584 ns | 574 ps |
| skew | 1.245 ns | 292 ps | 1.037 ns | 266 ps | 0.755 ns | 216 ps |

Table 2 Comparison of clock performance under low voltage

The experimental data show that, using LVT cells under low voltage, the mean clock skew is reduced from 1.245 ns to 1.037 ns and the standard deviation of the distribution is reduced from 292 ps to 266 ps, an improvement of about 10%. This shows that using the LVT cell library instead of the RVT cells used in the automatic EDA flow enhances the ability of the clock tree structure to resist PVT variation. With the PVT-variation-resistant clock tree structure, the mean clock skew is reduced from 1.245 ns to 0.755 ns and the standard deviation of the distribution is reduced from 292 ps to 216 ps, an improvement of about 27%.
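The relative improvements can be checked directly from the skew row of Table 2. The following is a minimal sketch; the dictionary layout and the helper name `reduction` are illustrative choices, while the numbers are the skew values from Table 2 converted to picoseconds.

```python
# Skew statistics from Table 2, in picoseconds: (mean, standard deviation).
SKEW_PS = {
    "EDA auto": (1245.0, 292.0),
    "with LVT cells": (1037.0, 266.0),
    "Resistance to PVT variation": (755.0, 216.0),
}


def reduction(baseline, value):
    """Relative reduction of `value` with respect to `baseline`, as a fraction."""
    return (baseline - value) / baseline


base_mean, base_sd = SKEW_PS["EDA auto"]
for name, (mean, sd) in SKEW_PS.items():
    if name == "EDA auto":
        continue  # the baseline is not compared with itself
    print(f"{name}: mean skew reduced by {reduction(base_mean, mean):.1%}, "
          f"SD reduced by {reduction(base_sd, sd):.1%}")
```

With these values, the reductions in standard deviation come out close to the percentages quoted in the text, while the reductions in mean skew are larger.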
The clock tree structure designed in this paper therefore further enhances the ability of the clock network to resist PVT deviation.

#### 4. CONCLUSIONS

By comparing the clock tree structure generated by the EDA tools, the clock tree structure using the LVT library and the clock tree structure with resistance to PVT deviation, this paper verified the performance of the PVT-deviation-resistant clock tree under low voltage. The results show that, under low voltage, clustering optimization during placement, increasing the device size and merging clock tree branches significantly optimize the clock skew.

#### References

- [1] Sung-Mo Kang, Yusuf Leblebici. CMOS Digital Integrated Circuits: Analysis and Design. Third Edition[M]. 2003.
- [2] B. Calhoun, et al. Characterizing and Modeling Minimum Energy Operation for Subthreshold Circuits[C]. IEEE International Symposium on Low Power Electronics and Design. 2004.
- [3] N. Magen, et al. Interconnect Power Dissipation in a Microprocessor[C]. Int. Workshop on SLIP. 2004.
- [4] Xin Zhao, Tolbert J. R., Mukhopadhyay S., Sung Kyu Lim. Variation-Aware Clock Design Methodology for Ultra-low Voltage (ULV) Circuits[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 2012.
- [5] Seok Mingoo, Blaauw D., Sylvester D. Clock Network Design for Ultra-low Power Applications[C]. ACM/IEEE International Symposium on Low-Power Electronics and Design. 2010.