【正文】
y available from 11 In order to make meaningful FPGA architecture parisons, it is essential that the CAD tools used to map circuits into each architecture are of high quality. The routing phase of VPR outperforms all previously published FPGA routers for which standard benchmarks results are available, and that the bination of VPR’s placer and router outperforms all published binations of FPGA placement and routing The organization of this paper is as follows. In Section 2 we describe some of the features of VPR and the range of FPGA architectures with which it may be used. In Sections 3 and 4 we describe the placement and routing algorithms. In Section 5, we pare the number of tracks required by VPR to successfully route circuits with that required by other published tools. In Section 6 we conclude and outline some future enhancements which will be made to VPR. 2 Overview of VPR Figure 1 outlines the VPR CAD flow. The inputs to VPR consist of a technologymapped list and a text file describing the FPGA architecture. VPR can place the circuit, or a preexisting placement can be read in. VPR can then perform either a global route or a bined global/detailed route of the placement. VPR’s output consists of the placement and routing, as well as statistics useful in assessing the utility of an FPGA architecture, such as routed wirelength, track count, and maximum of the architectural parameters that can be specified in the architecture description file are: ? the number of logic block inputs and outputs, ? the side(s) of the logic block from which each input and output is accessible, ? the logical equivalence between various input and output pins (. all LUT inputs are functionally equivalent), ? the number of I/O pads that fit into one row or one column of the FPGA, and ? the dimensions of the logic block array (. 23 x 30 logic blocks). In addition, if global routing is to be performed, one can also specify: ? the relative widths of horizontal and vertical channels, and ? the relative widths of the channels in different regions of the FPGA. Finally, if bined global and detailed routing is to be performed, one also specifies: ? the switch block [1] architecture (. how the routing tracks are interconnected), ? the number of tracks to which each logic block input pin connects (Fc [1]), ? the Fc value for logic block outputs, and ? the Fc value for I/O pads. The current architecture description format does not allow segments that span more 12 than one logic block to be included in the routing architecture, but we are presently adding this feature. Adding new routing architecture features to VPR is relatively easy, since VPR uses the architecture description to create a routing resource routing track and every pin in the architecture bees a node in this graph, and the graph edges represent the allowable connections. The router, graphics visualization and statistics putation routines all work only with this routing resource graph, so adding new routing architecture features only involves changing the subroutines that build this VPR was initially developed for islandstyle FPGAs [2, 3], it can also be used with rowbased FPGAs [4]. VPR is not currently capable of targeting hierarchical FPGAs [5], although adding an appropriate placement cost function and the required routing resource graph building routines would allow it to target , VPR’s builtin graphics allow interactive visualization of the placement, the routing, the available routing resources and the possible ways of interconnecting the routing resources. The VPACK Logic Block Packer / Netlist Translator VPACK reads in a blif format list of a circuit that has been technologymapped to LUTs and flipflops, packs the LUTs and flip flops into the desired FPGA logic block, and outputs a list in VPR’s list format. VPACK can target a logic block consisting of one 13 LUT and one FF, as shown in Figure 2, as this is a mon FPGA logic element. VPACK is also capable of targeting logic blocks that contain several LUTs and several flip flops, with or without shared LUT inputs [6]. These “clusterbased”logic blocks are similar to those employed in recent FPGAs by Altera, Xilinx and Lucent Technologies. 2 Placement Algorithm VPR uses the simulated annealing algorithm [7] for placement. We have experimented with several different cost functions, and found that what we call a linear congestion cost function provides the best results in a reasonable putation time [8].The functional form of this cost function is where the summation is over all the s in the circuit. For each , bbx and bby denote the horizontal and vertical spans of its bounding box, respectively. The q(n)factor pensates for the fact that the bounding box wire length model underestimates the wiring necessary to connect s with more than three terminals, as suggested in [10]. 14 Its value depends on the number of terminals of n。 Dlimit 163。 if a circuit has not successfully routed in a given number of tracks in 45 iterations it is assumed to be unroutable with channels of that width. To avoid overly circuitous routes and to save CPU time, we allow the routing of a to go at most 3 channels outside the bounding box of the important implementation detail deserves mention. Both the original Pathfinder algorithm and VPR’s router use Dijkstra’s algorithm (. a maze router [15]) to connect each . For a k terminal , the maze router is invoked k1 times to perform all the required connections. In the first invocation, the maze routing wavefront expands out from the source until it reaches any one of the k1 sinks. The path from source to sink is now the first part of this ’s routing. The maze routing wavefront is emptied, and a new wavefront expansion is started from the entire