<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE article SYSTEM "http://www.adv-radio-sci.net/inc/ars/copernicus.dtd">
<article language="en">
	<journal>
		<journal_title>Advances in Radio Science</journal_title>
		<journal_url>www.adv-radio-sci.net</journal_url>
		<issn>1684-9965</issn>
		<eissn>1684-9973</eissn>
		<volume_number>7</volume_number>
		<volume_title>Kleinheubacher Berichte 2008</volume_title>
		<publication_year>2009</publication_year>
	</journal>
	<doi>10.5194/ars-7-95-2009</doi>
	<article_url>http://www.adv-radio-sci.net/7/95/2009/</article_url>
	<abstract_html>http://www.adv-radio-sci.net/7/95/2009/ars-7-95-2009.html</abstract_html>
	<fulltext_pdf>http://www.adv-radio-sci.net/7/95/2009/ars-7-95-2009.pdf</fulltext_pdf>
	<start_page>95</start_page>
	<end_page>100</end_page>
	<publication_date>2009-05-18</publication_date>
	<article_title content_type="html">A VLSI design concept for parallel iterative algorithms</article_title>
	<authors>
		<author numeration="1" affiliations="1">
			<name>C. C. Sun</name>
			<email>chichia.sun@tu-dortmund.de</email>
		</author>
		<author numeration="2" affiliations="1">
			<name>J. Götze</name>
		</author>
	</authors>
	<affiliations>
		<affiliation numeration="1" content_type="html">Dortmund University of Technology, Information Processing Lab, Otto-Hahn-Str. 4, 44227 Dortmund, Germany</affiliation>
	</affiliations>
	<abstract content_type="html">Modern VLSI manufacturing technology has scaled down rapidly to the nanoscale
level. Integration at this scale now makes it possible to realize advanced
parallel iterative algorithms directly, which was almost impossible 10 years
ago. In this paper, we discuss the influence of evolving VLSI technologies on
iterative algorithms and present design strategies from an algorithmic and
architectural point of view. When an iterative algorithm is implemented on a
multiprocessor array, there is a trade-off between the
performance/complexity of the processors and the load/throughput of the
interconnects. This trade-off is due to the behavior of iterative
algorithms: the parallel implementation of the iterative algorithm (i.e.,
the processor elements of the multiprocessor array) can be simplified in any
way as long as convergence is guaranteed. However, modifying the algorithm
(processors) usually increases the number of required iterations, which in
turn increases the switching activity of the interconnects. As an example,
we show that a 25&amp;times;25 full Jacobi EVD array can be realized in a
single FPGA device with the simplified μ-rotation CORDIC architecture.</abstract>
	<references>
		<reference numeration="1" content_type="text"> Ahmedsaid, A., Amira, A., and Bouridane, A.: Improved SVD systolic array and implementation on FPGA, in: IEEE International Conference on Field-Programmable Technology (FPT), pp. 3–42, 2003. </reference>
		<reference numeration="2" content_type="text"> Brent, R. P. and Luk, F. T.: The Solution of Singular-Value and Symmetric Eigenvalue Problems on Multiprocessor Arrays, SIAM Journal on Scientific and Statistical Computing, 6, 69–84, 1985. </reference>
		<reference numeration="3" content_type="text"> Gelsinger, P.: Moore&apos;s Law: &quot;We See No End in Sight&quot;, Tech. rep., Intel Chief Technology Officer, http://websphere.sys-con.com/node/557154, 2008. </reference>
		<reference numeration="4" content_type="text"> Goetze, J. and Hekstra, G.: An Algorithm and Architecture Based on Orthonormal Micro-Rotations for Computing the Symmetric EVD, INTEGRATION – The VLSI Journal, 20, 21–39, 1995. </reference>
		<reference numeration="5" content_type="text"> Gotze, J., Paul, S., and Sauer, M.: An Efficient Jacobi-Like Algorithm for Parallel Eigenvalue Computation, IEEE Transactions on Computers, 42, 1058–1065, 1993. </reference>
		<reference numeration="6" content_type="text"> Klauke, S. and Goetze, J.: Low Power Enhancements for Parallel Algorithms, in: IEEE International Symposium on Circuits and Systems, 2001. </reference>
		<reference numeration="7" content_type="text"> Parhi, K. K. and Nishitani, T.: Digital Signal Processing for Multimedia Systems, Marcel Dekker, New York, 1999. </reference>
		<reference numeration="8" content_type="text"> Sainarayanan, K. S., Raghunandan, C., and Srinivas, M.: Delay and Power Minimization in VLSI Interconnects with Spatio-Temporal Bus-Encoding Scheme, in: IEEE Computer Society Annual Symposium on VLSI, pp. 401–408, 2007. </reference>
		<reference numeration="9" content_type="text"> Stine, J. E., Castellanos, I., Wood, M., Henson, J., Love, F., Davis, W. R., Franzon, P. D., Bucher, M., and Basavarajaiah, S.: FreePDK: An Open-Source Variation-Aware Design Kit, in: IEEE International Conference on Microelectronic Systems Education, pp. 173–174, 2007. </reference>
		<reference numeration="10" content_type="text"> Vangal, S., Howard, J., Ruhl, G., Dighe, S., Wilson, H., Tschanz, J., Finan, D., Iyer, P., Singh, A., Jacob, T., Jain, S., Venkataraman, S., Hoskote, Y., and Borkar, N.: An 80-Tile 1.28TFLOPS Network-on-Chip in 65 nm CMOS, in: IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, pp. 98–589, 2007. </reference>
		<reference numeration="11" content_type="text"> Vitullo, F., L&apos;Insalata, N. E., Petri, E., Saponara, S., Fanucci, L., Casula, M., Locatelli, R., and Coppola, M.: Low-Complexity Link Microarchitecture for Mesochronous Communication in Networks-on-Chip, IEEE Transactions on Computers, 57, 1196–1201, 2008. </reference>
		<reference numeration="12" content_type="text"> Volder, J.: The CORDIC trigonometric computing technique, IRE Trans. Electron. Comput., EC-8, 330–334, 1959. </reference>
		<reference numeration="13" content_type="text"> Walther, J.: A unified algorithm for elementary functions, in: Proc. Spring Joint Comput. Conf., vol. 38, pp. 379–385, 1971. </reference>
		<reference numeration="14" content_type="text"> Wolf, W.: The future of multiprocessor systems-on-chips, in: Annual ACM IEEE Design Automation Conference, pp. 681–685, 2004. </reference>
	</references>
</article>

