Annual Research Review 2007

Abstracts for
Circuits and Embedded Systems Sessions
     
CES Session 1A
Power-Area-Throughput Optimization of VLSI Signal Processing Algorithms
Rashmi Nanda, Victoria Wang, Dejan Markovic

As electronic systems grow more complex, the demand for implementations that minimize power and area costs while maintaining the desired throughput is ever increasing. This work explores various architectural techniques for optimization in the energy-delay-performance space. Techniques like retiming, pipelining, time multiplexing and parallelism provide varying levels of flexibility essential for tuning the design so that it meets the power and area constraints. The final objective is to achieve the desired throughput, functionality and latency requirements of the algorithm at minimum power and area cost, taking into account unique features of scaled technology such as leakage and variability.
CES Session 1B
A Hardware Friendly 4x4 Linear MIMO Detector for MIMO-OFDM Based Systems
Hun-Seok Kim, Weijun Zhu, Jatin Bhatia, Karim Mohammed, Anish Shah, Babak Daneshrad
Designing a hardware-efficient MIMO detector for broadband systems is challenging because of high algorithm complexity, throughput/latency requirements, and numerical stability issues. This paper studies hardware-friendly linear MMSE MIMO detection algorithms in terms of algorithm complexity and numerical stability. The modified Gram-Schmidt QR decomposition algorithm combined with square-root linear MMSE detection was selected for FPGA implementation based on complexity analysis and performance simulations on the IEEE 802.11n MIMO channel model. The paper proposes a dynamic scaling technique that significantly enhances the numerical stability of the square-root MMSE detection algorithm. A hardware-friendly linear MMSE MIMO detector was successfully implemented on FPGAs with a multiplier-sharing architecture. The implemented MIMO detector is fully compliant with the IEEE 802.11n draft proposal, with a maximum throughput of 480 Mbps. The proposed decoder has been implemented on a Virtex 2 -8000 device and has been integrated into a complete 802.11n based research testbed. Compared with previously published results, this MIMO decoder achieves a 2x throughput improvement with 1/3 the hardware complexity.
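One common way to combine a QR decomposition with square-root linear MMSE detection is sketched below in Python. This is an illustrative floating-point sketch, not the authors' fixed-point FPGA design: the function names `mgs_qr` and `sqrt_mmse_detect` and the augmented-channel formulation are assumptions made for the example, and the dynamic scaling technique mentioned in the abstract is not modeled.

```python
import math


def mgs_qr(a):
    """Modified Gram-Schmidt QR of a matrix given as a list of rows.

    Returns (q, r) such that a = q r, where q has orthonormal columns
    and r is upper triangular.
    """
    m, n = len(a), len(a[0])
    q = [[complex(x) for x in row] for row in a]
    r = [[0j] * n for _ in range(n)]
    for k in range(n):
        norm = math.sqrt(sum(abs(q[i][k]) ** 2 for i in range(m)))
        r[k][k] = complex(norm)
        for i in range(m):
            q[i][k] /= norm
        # Remove the q_k component from every remaining column immediately;
        # this in-place update is what distinguishes modified from
        # classical Gram-Schmidt.
        for j in range(k + 1, n):
            r[k][j] = sum(q[i][k].conjugate() * q[i][j] for i in range(m))
            for i in range(m):
                q[i][j] -= r[k][j] * q[i][k]
    return q, r


def sqrt_mmse_detect(h, y, noise_var):
    """Linear MMSE estimate of x from y = h x + noise, without forming h^H h.

    Assumes unit-power transmit symbols and noise_var > 0.
    """
    n_rx, n_tx = len(h), len(h[0])
    sigma = math.sqrt(noise_var)
    # Augmented channel: h_aug = [h; sigma * I] = [Q1; Q2] R.
    h_aug = [list(row) for row in h]
    for i in range(n_tx):
        h_aug.append([sigma if i == j else 0.0 for j in range(n_tx)])
    q, _ = mgs_qr(h_aug)
    q1, q2 = q[:n_rx], q[n_rx:]
    # z = Q1^H y
    z = [sum(q1[i][k].conjugate() * complex(y[i]) for i in range(n_rx))
         for k in range(n_tx)]
    # sigma * I = Q2 R implies R^{-1} = Q2 / sigma, so the MMSE estimate
    # R^{-1} Q1^H y becomes (Q2 / sigma) z with no back-substitution.
    return [sum(q2[i][k] * z[k] for k in range(n_tx)) / sigma
            for i in range(n_tx)]
```

Calling `sqrt_mmse_detect(h, y, noise_var)` with a 4x4 channel matrix `h` (list of rows) returns the four symbol estimates. The appeal of this form for hardware is that the inverse of R falls out of the lower block of Q, so no explicit matrix inversion is needed.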
CES Session 1C
An Energy-Efficient Reconfigurable Multiprocessor IC for DSP Applications
Guichang Zhong and Alan Willson
A reconfigurable multi-processor system having eight VLIW processors arranged in a dual ring topology on a single integrated circuit has been fabricated in TSMC 0.18-µm CMOS. It provides excellent flexibility for a variety of DSP applications while minimizing power consumption for a given performance level. Various energy-efficient DSP applications are implemented and presented.

     
CES Session 2A
An Adaptive Low-Jitter LC-Based Clock Distribution
Li-Min Lee and Chih-Kong Ken Yang
A low-jitter LC-based clock distribution has been demonstrated in 0.13µm CMOS. A frequency tuning technique based on a voltage-swing digitizer is demonstrated. Optimum jitter performance can be achieved, according to the relative level of the input noise and the inherent noise of the clock buffer, by adaptively adjusting the ratio between an injection-locked oscillator and an LC-resonant buffer. The jitter optimization efficiency of this technique is better than a 25% increase in power.
     
CES Session 2B
A Spur Elimination Technique for Fractional-N Phase Locked Loops Based on VCO Phase Interpolation
Siamak Delshadpour and Sudhakar Pamarti
A fractional spur elimination technique that enables wide bandwidth phase interpolation based fractional-N phase locked loops (PLLs) is proposed. The technique uses specially filtered dither to eliminate the spurious tones otherwise caused by inaccuracies in phase interpolation. The design of a wide bandwidth fractional-N PLL based on the spur elimination technique is also presented.
     
CES Session 2C
Minimal Skew Clock Embedding Considering Time Variant Temperature Variation
Hao Yu, Yu Hu, Chuenchen Liu, and Lei He
Existing temperature-aware clock embedding assumes a time-invariant worst-case temperature map and determines the merging point based on a geometrical search along the thermal gradient. However, how to find the temperature map that leads to the worst-case skew remains unsolved. In this paper, we develop a PErturbation based Clock Optimization (PECO) that considers the time-variant temperature gradient with automatic correlation extraction. For a given clock topology, we minimize the worst-case skew without requiring the worst-case temperature map. We decide the merging point level by level based on the sensitivity of the skew with respect to changes in the merging point. This sensitivity is calculated using a parameterized model, which is compressed by singular-value decomposition (SVD) and K-means based clustering that takes the temperature correlation into account. The experimental results show that our algorithm reduces worst-case skew by up to 5X compared to the existing bounded-skew based ZST/DME method with small (up to 1%) wirelength overhead.

    
CES Session 3A
A 2.2 Gb/s CMOS Differential QPSK Direct-Conversion Analog Baseband Receiver for 60-GHz Links
Minghui Chen and Frank M.C. Chang
This paper describes a CMOS DQPSK direct-conversion baseband receiver that features a 2.2 Gb/s data rate to support 1920x1080 60 Hz interlaced HDTV transmission in the 60 GHz unlicensed band. First, the architecture and demodulation scheme for the direct-conversion receiver are provided. As no ADC is required in the receiver chain, low cost and power saving are its significant advantages. Next, building blocks such as a 42 dB gain, 1.4 GHz bandwidth VGA with feedback DC offset cancellation, a pipeline sample-and-hold, a multi-level bang-bang PLL clock recovery, CMOS four-quadrant multipliers, etc. are detailed. The receiver is fabricated in a 90-nm CMOS digital process and achieves a maximal bit rate of 2.4 Gb/s. The BER is measured to be 10^-9. The DC supply voltage is 1 V and total power consumption is only 85 mW.
    

CES Session 3B
Modeling Op Amp Nonlinearity in Switched-Capacitor Sigma-Delta Modulator

Khaled Mahmoud Abdelfattah and Behzad Razavi

A system-level methodology for the inclusion of op amp nonlinearity in discrete-time integrators and Sigma-Delta modulators is proposed that consists of a hyperbolic tangent model for the input/output characteristic of op amps and a recursive solution of nonlinear integrators. Simulations at different levels of abstraction indicate that the methodology incurs an error of no more than 1.1 dB in the magnitude of harmonics while providing a 50x advantage in the simulation speed with respect to transistor-level implementations.
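The idea of pairing a hyperbolic tangent op amp characteristic with a sample-by-sample solution of the integrator can be illustrated with the Python sketch below. Everything specific in it is an assumption made for illustration rather than the authors' formulation: the inverting switched-capacitor topology, the characteristic v_out = -v_max * tanh(gain * v_x / v_max), the charge-conservation equation at the summing node, the bisection solver, and all parameter names and default values.

```python
import math


def summing_node_voltage(v_out, v_max, gain):
    """Invert v_out = -v_max * tanh(gain * v_x / v_max) to obtain v_x."""
    return -(v_max / gain) * math.atanh(v_out / v_max)


def nonlinear_integrator(v_in, cs=1.0, cf=2.0, v_max=1.0, gain=1000.0,
                         iters=200):
    """Simulate an inverting SC integrator with a tanh-shaped op amp.

    For every input sample, charge conservation at the summing node gives
        (cs + cf) * v_x[n] - cf * v_out[n]
            = cs * v_in[n] + cf * (v_x[n-1] - v_out[n-1]),
    which is solved for v_out[n] by bisection. With an ideal op amp
    (v_x = 0) this reduces to v_out[n] = v_out[n-1] - (cs / cf) * v_in[n].
    """
    limit = v_max * (1.0 - 1e-12)  # stay strictly inside the tanh range

    def node_charge(v):
        # Monotonically decreasing in v on (-v_max, v_max).
        return (cs + cf) * summing_node_voltage(v, v_max, gain) - cf * v

    outputs = []
    prev = 0.0
    for sample in v_in:
        target = cs * sample + cf * (
            summing_node_voltage(prev, v_max, gain) - prev)
        lo, hi = -limit, limit
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if node_charge(mid) > target:
                lo = mid  # root lies at a larger output voltage
            else:
                hi = mid
        prev = 0.5 * (lo + hi)
        outputs.append(prev)
    return outputs
```

For example, `nonlinear_integrator([0.1, 0.1, -0.05], gain=1e9)` closely tracks the ideal integrator, while a small `gain` or large inputs expose gain error and soft saturation. A behavioral model of this kind evaluates a few scalar expressions per sample, which is why it can run much faster than a transistor-level simulation.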
    

CES Session 3C
60GHz CMOS Differential Power Amplifier Using On-Chip Transformers for Compact Design
Tim Larocca and Frank M.C. Chang
A differential CMOS PA capable of 60GHz performance utilizing a non-traditional MMIC architecture is presented. The core of the design relies on 2 turn on-chip transformers with 60mA current capability for power coupling. Power gains of 15dB with associated saturated powers above 14dBm and greater than 15% PAE are realized at 60GHz, over a 3dB bandwidth of 6GHz. The supply voltage is 1.2V with 140mA total drain current. Total gate periphery per side of each differential power pair is 115.2µm in 90nm commercial CMOS.
    
CES Session 4A
A System for Coarse Grained Memory Protection in Tiny Embedded Processors

Ram Kumar, Akhilesh Singhania, Andrew Castner, and Mani B. Srivastava
Many embedded systems contain resource constrained microcontrollers where applications, operating system components and device drivers reside within a single address space with no form of memory protection. Programming errors in one application can easily corrupt the state of the operating system and other applications on the microcontroller. In this talk we propose a system that provides memory protection in tiny embedded processors (8, 16 and 32-bit microcontrollers with limited resources). Our system consists of a software run-time working with minimal low-cost architectural extensions to the processor core that prevents corruption of state by buggy applications.
     We restrict memory accesses and control flow of applications to protection domains within the address space. The software run-time consists of a Memory map: a flexible and efficient data structure that records ownership and layout information of the entire address space. Memory map checks are done for store instructions by hardware accelerators that significantly improve the performance of our system. We preserve control flow integrity by maintaining a Safe stack that stores return addresses in a protected memory region. Domain switches within a single address space are done with the help of a cross domain linker tool that generates a software jump table.
Enhancements to the microcontroller call and return instructions use the jump table to track the current active domain. We have implemented our scheme on a VHDL model of the ATMEGA103 microcontroller. Our evaluations show that embedded applications can enjoy the benefits of memory protection through a modest increase in cost and area of the microcontroller.
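The memory map check on store instructions can be pictured with a small software model. The Python sketch below is only an illustration under assumed details: the block size, the integer domain IDs, the trusted domain 0, and the names `MemoryMap`, `may_store` and `checked_store` are hypothetical and do not come from the described system, which performs the check in hardware.

```python
class ProtectionFault(Exception):
    """Raised when a domain tries to write memory it does not own."""


class MemoryMap:
    """Block-granular ownership table covering the whole address space."""

    TRUSTED = 0  # hypothetical ID for the trusted (kernel) domain

    def __init__(self, mem_size, block_size=8):
        self.block_size = block_size
        n_blocks = (mem_size + block_size - 1) // block_size
        # Every block starts out owned by the trusted domain.
        self.owner = [self.TRUSTED] * n_blocks

    def assign(self, start, length, domain):
        """Record `domain` as owner of every block touched by the range."""
        first = start // self.block_size
        last = (start + length - 1) // self.block_size
        for block in range(first, last + 1):
            self.owner[block] = domain

    def may_store(self, addr, domain):
        """Return True if `domain` is allowed to write to `addr`."""
        if domain == self.TRUSTED:
            return True
        return self.owner[addr // self.block_size] == domain


def checked_store(memory, mem_map, addr, value, domain):
    """Perform a store only after the memory map check succeeds."""
    if not mem_map.may_store(addr, domain):
        raise ProtectionFault(
            "domain %d may not write address %d" % (domain, addr))
    memory[addr] = value
```

A store by domain 1 into a block assigned to domain 2 raises `ProtectionFault` in this model, while the trusted domain can write anywhere. Keeping ownership per block rather than per byte keeps the table small, which matches the coarse-grained protection the abstract describes.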
    
CES Session 4B
A New Energy-Aware Embedded Platform for Networked Sensing
Dustin McIntire and William J. Kaiser
A broad range of embedded networked sensor (ENS) systems for critical environmental monitoring applications now require complex, high peak power dissipating sensor devices, as well as on demand high performance computing and high bandwidth communication. Embedded computing demands for these new platforms include support for computationally intensive image and signal processing as well as optimization and statistical computing. To meet these new requirements while maintaining critical support for low energy operation, a new multiprocessor node hardware and software architecture, Low Power Energy Aware Processing (LEAP), has been developed. The LEAP architecture integrates fine-grained, high time resolution, energy dissipation monitoring and power control scheduling for all subsystems including multiple processors, memory, storage, network interface, and sensor systems.
CES Session 4C
Software Radio Implementation of Short-range Wireless Standards for Sensor Networking
Thomas Schmid and Mani B. Srivastava
Software Defined Radios can provide significant benefits as backend gateways or base stations for sensor networks, which do not face the stringent resource constraints of in-network nodes. We extended GNU Radio with two physical layer implementations of IEEE 802.15.4 and an FSK modulation, and use the Universal Software Radio Peripheral (USRP) to interoperate with the Chipcon CC1000 and CC2420 radios found on the popular Mica2, MicaZ and Telos B motes. The wideband nature of the USRP makes it feasible for a single SDR base station to simultaneously communicate on multiple independent channels, and provide network bridging across incompatible radio standards.
     Cognitive Radios and Software Defined Radios are part of a larger class of cognitive systems, to which sensors also belong. In this talk I will briefly discuss the possibilities of using GNU Radio not only for SDR systems, but also for data processing in sensor networks, where it can act as a signal processing toolbox.