# Design of High-Speed Image Transmission Board Based on PCI-Express

Yongxian Yu<sup>1,a</sup>, Xuwen Li<sup>2,b</sup> and Qiang Wu<sup>3,c</sup>

<sup>1,3</sup>College of Information and Communication Engineering, Faculty of Information Technology,

Beijing University of Technology, Beijing, 100124, China

<sup>2</sup>College of Life Science and Bioengineering, Beijing University of Technology

Beijing, 100124, China

<sup>a</sup>email: yuyongxian911@163.com, <sup>b</sup>email: lixuwen@bjut.edu.cn, <sup>c</sup>email: wuqiang@bjut.edu.cn

Keywords: FPGA, Kintex7, PCI-Express, DDR3 SDRAM, LVDS

**Abstract.** This paper presents the design and implementation of a high-speed image transmission board, based on PCI-Express and buffered by external DDR3 SDRAM, that meets the high frame rate requirements of high-speed image data transmission systems. The board uses a XILINX Kintex7 series XC7K70T as the main chip, exchanges data with the host computer through PCI-Express DMA logic, and uses an external DDR3 cache to sustain a high frame rate and real-time behavior when transmitting large amounts of data. The external interface is an LVDS high-speed differential interface, for which a dedicated interface timing is designed. In the experiment, grayscale visible-light images of 1024 × 1025 bytes are transmitted and tested. The results verify a transmission rate of 50 frames per second with stable and reliable data transmission, which may serve as a reference for other high-speed data transmission systems.

# Introduction

In recent years, with the rapid development of electronic technology, image information processors have been widely used in aerospace, military, medical and other fields. The corresponding image transmission systems have likewise been widely adopted and have developed rapidly. The image transmission system is an important component of the evaluation and test system for an image information processor, and its transmission rate, accuracy and stability have a significant impact on the assessment results. With the emergence of high-resolution visible-light images, a complete image transmission system must not only send high-resolution images but also meet high frame rate requirements.

This paper designs a high-speed image transmission board based on PCI-Express. The board adopts an LVDS interface as its external interface and transmits image sequences at high speed in real time, so that it can act as the image source when testing equipment that consumes images. Because image transmission involves a large amount of data at a high rate, the board uses the PCI-E bus as the interface to the PC; it is simple to use and offers ten times the transmission speed of the PCI bus[1]. Without external memory, the internal storage of the image sending system is not large enough to hold one frame, and transmitting the image in FIFO-sized packets would increase the time the image spends on the link, so the frame rate and real-time requirements could not be met. To solve this problem, DDR3 is used as an image buffer: each frame is first stored in DDR3 and then read out for transmission.

# System Design

The image transmission board designed in this paper consists mainly of the LVDS interface, the PCI-E bus interface, the external DDR3 memory and the core FPGA logic. The LVDS interface section is composed of an isolation chip, the LVDS transmitter chip MAX9247 and related circuits, which ensure that the image data sent by the FPGA is finally output through the LVDS interface[2]. The PCI-E section uses a gold-finger edge connector so that the board can be inserted into the integrated chassis; a PCI-E x1 link provides 250 MB/s, which supports high-speed data transmission between the PC and the FPGA. Using DDR3 as external memory both reduces the use of FPGA RAM resources and meets the requirements of high-speed data transmission. The main function of the FPGA on the board is to cache the visible-light images received from the PC via PCI-E and then send them out through the LVDS interface.

The LVDS image transmission board uses a XILINX Kintex7 series XC7K70T FPGA, which provides sufficient design resources and supports the high-speed PCI-E bus interface. High-speed image data and instructions are exchanged between the FPGA and the CPU system board over the PCI-E bus. The internal logic design of the board is shown in Fig 1 and includes the PCIE interface module, the LVDS interface module, the DDR3 controller module, the write-FIFO module and the read-FIFO module. The PCIE interface module implements the PCIE interface and also exchanges instructions with the host computer through a register module. The LVDS interface module fetches image data from the read-FIFO and outputs it together with synchronization signals, such as frame synchronization, field synchronization and line synchronization, according to the timing requirements. The DDR3 controller module, write-FIFO module and read-FIFO module together form the data cache: the DDR3 controller handles the DDR3 timing and access sequence, while the write-FIFO and read-FIFO serve as the cache medium between the DDR3 module and the PCIE interface module.



Fig 1. LVDS image transmission board internal logic block diagram

## **PCI-E Interface Module Design**

The PCI-E module follows XILINX's official reference design: the Endpoint Block IP core generates the PCI-Express interface logic, and the DMA transfer design is based on the official xapp1052 project. The Kintex-7 family of FPGAs provides an integrated PCIE block, which is a scalable, high-bandwidth serial interconnect module. Like earlier PCIE IP cores, it supports four lane widths (1-lane, 2-lane, 4-lane and 8-lane) in Endpoint and Root Port configurations, but it supports line rates of up to 5 Gb/s, the second-generation interface speed [3].

The PCI-E sending logic consists of the 7 Series Endpoint core, an AXI-TRN bus conversion bridge, the DMA control logic module and the register operation logic module. Its structure is shown in Fig 2. Unlike earlier PCIE IP cores, the Kintex-7 core uses the AXI bus interface instead of the TRN bus interface, whereas the project provided by xapp1052 uses the TRN bus interface. An AXI-TRN conversion bridge is therefore designed to convert between the TRN interface of the user logic and the AXI interface of the core, so that the Kintex-7 PCIE IP core can be driven normally by the user interface logic.





Fig 2. Endpoint block logic interface

The Endpoint core is equivalent to a PCI bridge chip; its main role is interface conversion, mapping the IO space and the configuration space. Operations on the Endpoint core are divided into register reads and writes and DMA transfers: the register operation logic implements the state machine for register access, and the DMA control logic implements the state machine for DMA transfers[4].

The register operation logic provides a register area at offset addresses 0x00 to 0x7f, in which each address corresponds to a 32-bit register. The registers are mainly used to configure DMA transfers and to select the interface type. Before a DMA transfer begins, the TLP packet length, the number of packets and the start address of the DMA buffer must be configured. Once the data is ready, the DMA transfer is started through the control register.
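As a rough illustration of the arithmetic behind the packet length and packet count settings, the sketch below computes how many TLPs one test frame would need. The 128-byte payload size and the helper name `tlp_count` are assumptions made for illustration; the paper does not state the payload size it uses.

```python
FRAME_BYTES = 1024 * 1025  # frame size used in the paper's test


def tlp_count(frame_bytes: int, payload_bytes: int) -> int:
    """Number of TLPs needed to move frame_bytes, rounding up."""
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    return -(-frame_bytes // payload_bytes)  # ceiling division


# ASSUMPTION: 128-byte payload per TLP (not stated in the paper).
ASSUMED_PAYLOAD_BYTES = 128
tlps_per_frame = tlp_count(FRAME_BYTES, ASSUMED_PAYLOAD_BYTES)
print(tlps_per_frame)
```

With a different payload size, only `ASSUMED_PAYLOAD_BYTES` needs to change; the ceiling division also covers frames that are not an exact multiple of the payload.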

DMA transfers use the data interface of the Endpoint IP core. The system only needs to set the start address of the buffer, the transfer size and other parameters; the image data is then moved at high speed under the control of the DMA controller. In DMA mode, the FPGA sends a bulk read request to the PC through the s\_axis\_tx bus interface, and the PC side then returns the data through the m\_axis\_rx bus interface. The DMA transmission timing is shown in Fig 3. The data is not transmitted continuously because PCI-E transfers data in packets. The theoretical bandwidth of the PCI-Express link is 0.5 GB/s, but the system bandwidth is 250 MB/s because of the gaps between data packets, which still meets the needs of high-speed image data transfer.



Fig 3. PCI-Express DMA transmission timing
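The claim that 250 MB/s is sufficient can be checked with a short calculation. The sketch below uses the frame size and frame rate from the abstract and assumes that 1 MB means 10^6 bytes; the variable names are illustrative only.

```python
FRAME_BYTES = 1024 * 1025           # one grayscale test frame
FRAME_RATE_HZ = 50                  # target frame rate from the abstract
EFFECTIVE_LINK_BYTES_PER_S = 250e6  # 250 MB/s, ASSUMING 1 MB = 10**6 bytes

# Sustained data rate the link must carry.
required_bytes_per_s = FRAME_BYTES * FRAME_RATE_HZ

# Fraction of the effective link bandwidth consumed by the image stream.
utilization = required_bytes_per_s / EFFECTIVE_LINK_BYTES_PER_S

# Time to move one frame over the link, in milliseconds.
transfer_ms = FRAME_BYTES / EFFECTIVE_LINK_BYTES_PER_S * 1e3

print(required_bytes_per_s, round(utilization, 3), round(transfer_ms, 2))
```

Under these assumptions the image stream needs about 52.5 MB/s, roughly a fifth of the effective bandwidth, and one frame crosses the link in about 4.2 ms of the 20 ms frame period.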

## **DDR3 Control Module Design**

The DDR3 control unit uses the XILINX MIG IP core to generate the DDR3 controller. The core provides a simple user interface whose signals are sufficient to perform DDR3 read and write operations, which shortens the development cycle of the user logic for DDR3 [5].

The write timing of the DDR3 controller is shown in Fig 4. The app\_rdy signal remains high, app\_cmd carries the write command, and the data on app\_wdf\_data is written to the address on app\_addr when app\_en, app\_wdf\_rdy and app\_wdf\_wren are all high. The read timing of the DDR3 controller is shown in Fig 5. The app\_rdy signal remains high, app\_cmd carries the read command, and data is read from the address on app\_addr when app\_en is high. Read operations have a certain latency, so app\_rd\_data\_valid must be used to determine when the returned data is valid.





Fig 5. Read timing of DDR3 controller
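The handshake described above can be illustrated with a small behavioral model in Python. This is a sketch for reasoning about the signal conditions, not the MIG core itself: the class name, the fixed read latency of three cycles and the dictionary-backed memory are all assumptions, and app\_rdy and app\_wdf\_rdy are simply held high as in the text.

```python
from collections import deque

CMD_WRITE = 0  # app_cmd value used here for a write command
CMD_READ = 1   # app_cmd value used here for a read command


class MigUiModel:
    """Toy cycle-level model of the DDR3 user interface handshake."""

    def __init__(self, read_latency: int = 3):
        self.mem = {}
        self.read_latency = read_latency  # ASSUMPTION: fixed latency
        self.pending = deque()            # entries are [cycles_left, addr]
        self.app_rdy = True               # held high, as in the text
        self.app_wdf_rdy = True           # held high, as in the text

    def tick(self, app_en=False, app_cmd=CMD_WRITE, app_addr=0,
             app_wdf_wren=False, app_wdf_data=0):
        """Advance one clock; return (app_rd_data_valid, app_rd_data)."""
        # Age outstanding reads before accepting a new command.
        for entry in self.pending:
            entry[0] -= 1
        if app_en and self.app_rdy:
            if app_cmd == CMD_WRITE and app_wdf_wren and self.app_wdf_rdy:
                self.mem[app_addr] = app_wdf_data
            elif app_cmd == CMD_READ:
                self.pending.append([self.read_latency, app_addr])
        # Return read data once its latency has elapsed.
        if self.pending and self.pending[0][0] <= 0:
            _, addr = self.pending.popleft()
            return True, self.mem.get(addr, 0)
        return False, None


ddr = MigUiModel(read_latency=3)
ddr.tick(app_en=True, app_cmd=CMD_WRITE, app_addr=0x10,
         app_wdf_wren=True, app_wdf_data=0xABCD)
first = ddr.tick(app_en=True, app_cmd=CMD_READ, app_addr=0x10)
results = [ddr.tick() for _ in range(3)]
```

In this model the read data appears three cycles after the read command is accepted, which mirrors why the user logic has to watch app\_rd\_data\_valid rather than assume the data is available immediately.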

To simplify the connection to the PCIE interface and the LVDS interface, a write FIFO and a read FIFO are used as cache units around the DDR3 controller, as shown in Fig 6. In the read and write control flow, the PC first sends one frame of data, which is written to DDR3 through WR\_FIFO; once a full frame has been stored, the data is read out through RD\_FIFO and sent to the LVDS interface module. WR\_FIFO acts as the cache between the PCI-E interface and the DDR3 interface and is read whenever it is not empty. The DDR3 read operation starts when the write address reaches the end of a frame and finishes when the read address reaches the end of a frame. The DDR3 read address addr\_rd is incremented according to the ddr\_rd\_valid signal. When the DDR3 read is complete, an interrupt request is sent to the PC so that it begins sending the next frame.



Fig 6. DDR3 read and write control module
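The write-then-read control flow can be sketched as a small Python model. It is a simplification under stated assumptions: the class and method names are hypothetical, the FIFOs are reduced to direct calls, and the interrupt to the PC is represented by a counter.

```python
class FrameBufferControl:
    """Toy model of the DDR3 write-then-read frame control."""

    def __init__(self, frame_words: int):
        self.frame_words = frame_words
        self.ddr = [0] * frame_words  # stands in for the DDR3 frame buffer
        self.addr_wr = 0
        self.addr_rd = 0
        self.reading = False
        self.interrupts = 0           # interrupt requests sent to the PC

    def write_word(self, word: int) -> None:
        """Store one word from WR_FIFO; start reading once a frame is full."""
        if self.reading:
            raise RuntimeError("frame is being read out; wait for interrupt")
        self.ddr[self.addr_wr] = word
        self.addr_wr += 1
        if self.addr_wr == self.frame_words:  # write address at end of frame
            self.addr_wr = 0
            self.reading = True

    def read_word(self):
        """Return the next word for RD_FIFO, or None if no frame is ready."""
        if not self.reading:
            return None
        word = self.ddr[self.addr_rd]
        self.addr_rd += 1
        if self.addr_rd == self.frame_words:  # read address at end of frame
            self.addr_rd = 0
            self.reading = False
            self.interrupts += 1              # request the next frame
        return word


fb = FrameBufferControl(frame_words=4)
for w in [10, 11, 12, 13]:
    fb.write_word(w)
out = [fb.read_word() for _ in range(4)]
```

After the four reads, `fb.interrupts` is 1 and further reads return `None` until the next frame has been written, which corresponds to the PC waiting for the interrupt before sending another frame.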

To verify the DDR3 read and write control module, one frame of image data is written to DDR3 and the interface signals are captured with the ChipScope tool in the XILINX ISE software. The captured DDR3 interface timing is shown in Fig 7. The DDR3 controller interface signals meet the timing requirements for both read and write operations. Comparing the data read back with the data written confirms that the DDR3 read and write control module works correctly.



Fig 7. Hardware simulation timing diagram of the DDR3 controller interface





## **LVDS Interface Module Design**

The LVDS interface uses the MAX9247 as the transmitter. The image signal to be transmitted consists of 16-bit parallel data D[15:0], the synchronization signal DE, the frame synchronization signal VSYN and the line synchronization signal HSYN, with a 35 MHz synchronous clock. The LVDS interface timing for visible-light images is shown in Fig 8. A frame contains 1025 lines of 512 valid data words each, and the data is valid when DE, VSYN and HSYN are all high.



Fig 8. Visible light image LVDS interface timing
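The timing figures above imply a per-frame time budget that can be worked out directly. The sketch below assumes one 16-bit word is sent per clock while data is valid and takes the 20 ms frame period from the 50 frames per second target; the variable names are illustrative.

```python
PIXEL_CLOCK_HZ = 35e6     # synchronous clock of the LVDS interface
LINES_PER_FRAME = 1025
WORDS_PER_LINE = 512      # valid 16-bit words per line
BYTES_PER_WORD = 2
FRAME_PERIOD_MS = 20.0    # 50 frames per second target

# Payload of one frame in bytes; matches the 1024 x 1025 byte test image.
frame_bytes = LINES_PER_FRAME * WORDS_PER_LINE * BYTES_PER_WORD

# ASSUMPTION: one word is transferred per clock while data is valid.
active_clocks = LINES_PER_FRAME * WORDS_PER_LINE
active_ms = active_clocks / PIXEL_CLOCK_HZ * 1e3

# Time left in each frame period for blanking intervals.
blanking_ms = FRAME_PERIOD_MS - active_ms

print(frame_bytes, round(active_ms, 3), round(blanking_ms, 3))
```

Under these assumptions the valid data occupies about 15 ms of each 20 ms frame period, leaving about 5 ms for line and frame blanking.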

Based on the LVDS interface timing, a state machine is designed to generate the DE, VSYN and HSYN signals so that they meet the timing requirements. It uses eight states to describe the whole sequence and moves between states according to the timing relationship among DE, VSYN and HSYN. The state machine is shown in Fig 9.



Fig 9. LVDS interface state machine
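The text does not list the eight states, so the sketch below is not the paper's state machine. It is a simplified generator built from nested counters that produces DE, VSYN and HSYN with the property stated earlier, namely that data is valid only when all three signals are high. The blanking lengths `h_blank` and `v_blank` and the exact ordering of the signal edges are hypothetical.

```python
from typing import Iterator, Tuple


def sync_signals(lines: int, words_per_line: int,
                 h_blank: int, v_blank: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (VSYN, HSYN, DE) for each clock of one frame."""
    for _ in range(v_blank):              # frame blanking: all signals low
        yield (0, 0, 0)
    for _ in range(lines):
        for _ in range(h_blank):          # line blanking: only VSYN high
            yield (1, 0, 0)
        for _ in range(words_per_line):   # active data: all three high
            yield (1, 1, 1)


# Small hypothetical frame so the sequence is easy to inspect.
frame = list(sync_signals(lines=3, words_per_line=4, h_blank=2, v_blank=5))
valid_clocks = sum(1 for vsyn, hsyn, de in frame if vsyn and hsyn and de)
```

For the real interface the same structure would be called with `lines=1025` and `words_per_line=512`, and the blanking lengths would be chosen to fit the frame period.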

# System Test

An integrated chassis is used as the carrier for the LVDS image transmission board so that the system is easy to carry and test; the entire system is shown in Fig 10. The test uses 1024 × 1025 byte grayscale visible-light images. The receiving device receives 100 frames, which matches the number of frames sent, and after the received data is parsed, no abnormal image data is found.



Fig 10. High-speed image transmission system

The test results are shown in Table 1. In repeated tests, the frame interval of the visible-light image reaches 20 ms, so the design meets its expected target.



| Image type    | Number of frames | Time(ms) | Frame interval(ms) |
|---------------|------------------|----------|--------------------|
| Visible Light | 100              | 2000     | 20                 |
|               | 1000             | 20000    | 20                 |
|               | 5000             | 100000   | 20                 |
|               | 10000            | 200000   | 20                 |
|               | 50000            | 1000000  | 20                 |
|               | 100000           | 2000000  | 20                 |

Table 1. LVDS Interface Rate Test
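The frame interval column follows from the other two columns, and the interval in turn gives the frame rate. The short check below recomputes both from the rows of Table 1; the variable names are illustrative.

```python
# (number of frames, total time in ms) for each row of Table 1
rows = [
    (100, 2000),
    (1000, 20000),
    (5000, 100000),
    (10000, 200000),
    (50000, 1000000),
    (100000, 2000000),
]

# Frame interval in ms is total time divided by frame count.
intervals_ms = [time_ms / frames for frames, time_ms in rows]

# Frame rate in frames per second is 1000 ms divided by the interval.
frame_rates = [1000 / interval for interval in intervals_ms]

print(intervals_ms)
print(frame_rates)
```

Every row gives a 20 ms interval, which corresponds to the 50 frames per second reported for the board.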

# Conclusion

In this paper, a high-speed image transmission board is designed and implemented. The LVDS high-speed differential interface serves as the image data channel. The PCIE DMA logic provides high-speed bulk transfer of image data over the PCIE bus. The user interface logic designed for DDR3 makes DDR3 a cache unit in the system, which solves the problem of poor real-time performance. Testing shows that the board meets the requirement of 50 frames per second for visible-light images with stable and reliable transmission, so it can meet the demand for high-speed image data transmission and has application prospects in image transmission.

# References

- [1] S.K. Dhawan: Nuclear Science Symposium Conference Record, IEEE, Vol. 2 (2005), p. 687-691.
- [2] M. Chen, J. Silva-Martinez, M. Nix, et al: IEEE Journal of Solid-State Circuits, Vol. 40(2) (2005), p. 472-479.
- [3] Xilinx: 7 Series FPGAs Integrated Block for PCI Express Product Guide, PG054 (2014).
- [4] L. Rota, M. Caselle, S. Chilingaryan, et al: IEEE Transactions on Nuclear Science, Vol. 62(3) (2015), p. 972-976.
- [5] Y. Wenhua and H. Li: FPGA Based DDR3 Applications in a Multichannel Channelization Data Cache, 2016 9th International Symposium on Computational Intelligence and Design (ISCID), IEEE, Vol. 1 (2016), p. 54-57.