If the connection box between an input pin and the tracks to which it connects consists of Fc independent pass transistors controlled by Fc SRAM bits, it is possible to turn on two of these switches in order to electrically connect two tracks via the input pin. We will refer to this as an input pin dogleg. Commercial FPGAs, however, implement the connection box from an input pin to a channel via a multiplexer, so only one track may be connected to the input pin. Using a multiplexer rather than independent pass transistors saves considerable area in the FPGA layout. In addition, there is normally a buffer between a track and the connection block multiplexers to which it connects in order to improve speed.

In the linear congestion cost function, q is 1 for nets with 3 or fewer terminals, and slowly increases for nets with up to 50 terminals. Cav,x(n) and Cav,y(n) are the average channel capacities (in tracks) in the x and y directions, respectively, over the bounding box of net n. This cost function penalizes placements which require more routing in areas of the FPGA that have narrower channels. All the results in this paper, however, are obtained with FPGAs in which all channels have the same capacity. In this case Cav is a constant and the linear congestion cost function reduces to a bounding box cost function.

A good annealing schedule is essential to obtain high-quality solutions in a reasonable computation time with simulated annealing. We have developed a new annealing schedule which leads to very high-quality placements, and in which the annealing parameters automatically adjust to different cost functions and circuit sizes. We compute the initial temperature in a manner similar to [11]. Let Nblocks be the total number of logic blocks plus the number of I/O pads in a circuit. We first create a random placement of the circuit. Next we perform Nblocks moves (pairwise swaps) of logic blocks or I/O pads, and compute the standard deviation of the cost of these Nblocks different configurations.
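As a rough illustration of the cost function and the sampling step described above, the following Python sketch evaluates a bounding-box style cost and measures its standard deviation over Nblocks random swaps. The function names, the dictionary-based placement, the `+ 1` span convention, the use of the population standard deviation, and the default `q` (which returns 1 for every net and is therefore only appropriate for nets with 3 or fewer terminals) are all assumptions made for this example, not details taken from the paper.

```python
import random
import statistics
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[int, int]   # (x, y) location of a block on the FPGA grid
Net = Sequence[str]       # names of the blocks that one net connects


def linear_congestion_cost(
    nets: Sequence[Net],
    placement: Dict[str, Point],
    cav_x: float = 1.0,                        # assumed uniform channel capacity in x
    cav_y: float = 1.0,                        # assumed uniform channel capacity in y
    q: Callable[[int], float] = lambda k: 1.0  # assumed stand-in, valid for <= 3 terminals
) -> float:
    """Sum over nets of q(n) * (bb_x / Cav,x + bb_y / Cav,y)."""
    total = 0.0
    for net in nets:
        xs = [placement[block][0] for block in net]
        ys = [placement[block][1] for block in net]
        bb_x = max(xs) - min(xs) + 1   # horizontal span (assumed convention)
        bb_y = max(ys) - min(ys) + 1   # vertical span (assumed convention)
        total += q(len(net)) * (bb_x / cav_x + bb_y / cav_y)
    return total


def cost_std_after_random_swaps(
    nets: Sequence[Net],
    placement: Dict[str, Point],
    rng: random.Random,
) -> float:
    """Perform Nblocks random pairwise swaps and return the cost spread."""
    working = dict(placement)          # leave the caller's placement untouched
    blocks: List[str] = list(working)
    costs: List[float] = []
    for _ in range(len(blocks)):       # Nblocks moves, every one of them kept
        a, b = rng.sample(blocks, 2)
        working[a], working[b] = working[b], working[a]
        costs.append(linear_congestion_cost(nets, working))
    return statistics.pstdev(costs)    # population std. dev. (an assumption)


placement = {"a": (0, 0), "b": (2, 1), "c": (1, 4), "d": (3, 2)}
nets = [["a", "b"], ["b", "c", "d"]]
cost = linear_congestion_cost(nets, placement)
spread = cost_std_after_random_swaps(nets, placement, random.Random(0))
```

With equal capacities in both directions the channel terms become a constant scale factor, which is the sense in which the cost reduces to a plain bounding box cost.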
The initial temperature is set to 20 times this standard deviation, ensuring that virtually any move is accepted at the start of the anneal. As in [12], a default number of moves is evaluated at each temperature. This default number can be overridden on the command line, however, to allow different CPU time / placement quality tradeoffs. Reducing the number of moves per temperature by a factor of 10, for example, speeds up placement by a factor of 10 and reduces final placement quality by only about 10%.

When the temperature is so high that almost any move is accepted, we are essentially moving randomly from one placement to another and little improvement in cost is obtained. Conversely, if very few moves are being accepted (due to the temperature being low and the current placement being of fairly high quality), there is also little improvement in cost. With this motivation in mind, we propose a new temperature update schedule which increases the amount of time spent at temperatures where a significant fraction of, but not all, moves are being accepted. A new temperature is computed as Tnew = α Told, where the value of α depends on the fraction of attempted moves that were accepted (Raccept) at Told, as shown in Table 1.

Finally, it was shown in [12, 13] that it is desirable to keep Raccept near a fixed target value for as long as possible. We accomplish this by using the value of Raccept to control a range limiter: only interchanges of blocks that are less than or equal to Dlimit units apart in the x and y directions are attempted. A small value of Dlimit increases Raccept by ensuring that only blocks which are close together are considered for swapping. These "local swaps" tend to result in relatively small changes in the placement cost, increasing their likelihood of acceptance. Initially, Dlimit is set to the entire chip.
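The schedule described so far can be sketched in Python as follows. The factor of 20 and the rule Tnew = α Told come from the text. Everything else is an assumption made for illustration: the function names, the standard Metropolis acceptance test, the (threshold, α) pairs in `ASSUMED_ALPHA_TABLE`, and the multiplicative range-limit update, whose target acceptance rate is left as a parameter because neither the table of α values nor the update formula survives in this text.

```python
import math
import random
from typing import List, Tuple

# ASSUMED (Raccept threshold, alpha) pairs, illustrative only. They cool fast
# when nearly all moves are accepted and slowly in the productive middle range.
ASSUMED_ALPHA_TABLE: List[Tuple[float, float]] = [(0.96, 0.5), (0.80, 0.9), (0.15, 0.95)]
ASSUMED_ALPHA_FLOOR = 0.8   # assumed alpha once very few moves are accepted


def initial_temperature(cost_std: float, scale: float = 20.0) -> float:
    """Initial temperature: 20 times the sampled cost standard deviation."""
    return scale * cost_std


def accept_move(delta_cost: float, temperature: float, rng: random.Random) -> bool:
    """Standard Metropolis test (assumed here); temperature must be > 0."""
    if delta_cost <= 0:
        return True
    return rng.random() < math.exp(-delta_cost / temperature)


def next_temperature(t_old: float, r_accept: float) -> float:
    """Tnew = alpha * Told, with alpha chosen from the fraction of accepted moves."""
    for threshold, alpha in ASSUMED_ALPHA_TABLE:
        if r_accept > threshold:
            return alpha * t_old
    return ASSUMED_ALPHA_FLOOR * t_old


def within_range_limit(a: Tuple[int, int], b: Tuple[int, int], d_limit: float) -> bool:
    """Only blocks at most Dlimit apart in both x and y may be swapped."""
    return abs(a[0] - b[0]) <= d_limit and abs(a[1] - b[1]) <= d_limit


def update_range_limit(d_old: float, r_accept: float, r_target: float, max_dim: int) -> float:
    """ASSUMED update: grow Dlimit above the target rate, shrink it below, then clamp."""
    d_new = d_old * (1.0 - r_target + r_accept)
    return min(max(d_new, 1.0), float(max_dim))


t0 = initial_temperature(2.5)
t1 = next_temperature(t0, r_accept=0.99)
d1 = update_range_limit(10.0, r_accept=0.94, r_target=0.44, max_dim=20)
```

Keeping the target rate and the α table as explicit inputs makes it easy to substitute the published values once they are at hand.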
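Whenever the temperature is reduced, the value of Dlimit is updated according to Raccept, and then clamped to the range from 1 to the maximum dimension of the FPGA. This results in Dlimit being the size of the entire chip for the first part of the anneal, shrinking gradually during the middle stages of the anneal, and being 1 for the low-temperature part of the anneal. Finally, the anneal is terminated when T falls below a threshold proportional to Cost / Nnets.

One of the main design goals of VPR was to keep the tool flexible enough to be used in many FPGA architecture studies.

6 Conclusions and Future Work

We have presented a new FPGA placement and routing tool that outperforms all such tools with which we can make direct comparisons.

In order to encourage other FPGA researchers to publish routing results using these larger benchmarks, we issue the following "FPGA challenge." The routing could not be completed with SEGA, because it ran out of memory. Each circuit is placed and routed in the smallest square FPGA that can contain it, and input pin doglegs are not allowed.

Large circuits

The benchmark circuits used in the earlier comparison range in size from 54 to 358 logic blocks, clearly too small to be representative of today's FPGAs.

We compare the performance of VPR with that of the SPLACE / SROUTE tool combination when input pin doglegs are not allowed. The number of logic blocks in each circuit is listed. In this section we compare the minimum number of tracks per channel required for a successful routing by various CAD tools. Each logic block input or output can connect to any track in the adjacent channel(s) (i.e. Fc = W).

5 Experimental Results

The various FPGA parameters used in this section were always chosen to match those used in the earlier work against which we compare.

When a net sink is reached, we add all the routing resource segments required to connect the sink and the current partial routing to the wavefront (i.e. the expansion list) with a cost of 0. Unfortunately, this approach requires considerable CPU time for high-fanout nets. In the first invocation, the maze router wavefront expands out from the net source until it reaches any one of the k − 1 net sinks. For the experimental results in this paper, we set the maximum number of router iterations to 45; if a circuit has not been routed successfully with a given number of tracks in 45 iterations, it is assumed to be unroutable at that channel width. Basically, this algorithm initially routes each net by the shortest path it can find, regardless of any overuse of wiring segments or logic block pins that may result.

Its value depends on the number of terminals of net n. We have experimented with several different cost functions, and found that what we call a linear congestion cost function provides the best results in a reasonable computation time [8].

VPACK can target a logic block consisting of one LUT, as shown in Fig. 2, as this is a common FPGA logic element. Although VPR was initially developed for island-style FPGAs [2, 3], it can also be used with row-based FPGAs [4]. The current architecture description format does not allow segments that span more than one logic block to be included in the routing architecture, but we are presently adding this feature. The output of VPR consists of the placement and routing, as well as statistics useful in assessing the utility of an FPGA architecture, such as routed wirelength, track count and maximum net length.

In Section 6 we present our conclusions and discuss some future enhancements of VPR. The routing phase of VPR outperforms all previously published FPGA routers for which standard benchmark results are available, and the combination of VPR's placer and router outperforms all published combinations of FPGA placement and routing tools. Accordingly, there is considerable need for flexible CAD tools that can target a wide variety of FPGA architectures efficiently, and hence allow fair comparisons of the architectures.

VPR is capable of targeting a broad range of FPGA architectures, and the source code is publicly available.

VPR: A New Packing, Placement and Routing Tool for FPGA Research

Vaughn Betz and Jonathan Rose
Department of Electrical and Computer Engineering, University of Toronto
Toronto, ON, Canada M5S 3G4
{vaughn, jayar}

Abstract

We describe the capabilities of, and the algorithms used in, a new FPGA CAD tool, Versatile Place and Route (VPR). We present placement and routing results on a new set of large circuits to allow future benchmark comparisons of FPGA place and route tools on circuit sizes more typical of today's industrial designs.

That is, benchmark circuits are technology-mapped, placed and routed onto the FPGA architectures of interest, and measures of architecture quality, such as speed or area, can then easily be extracted. In order to make FPGA architecture comparisons meaningful, it is essential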