Minimize power consumption with FPGAs in high-speed, DSP-intensive system designs

Authors: Govind Krishnan, Hichem Belhadj, Madhubabu Anumukonda

Reducing power consumption is increasingly important for high-speed, DSP-intensive system designs. In communications systems, for example, links are often enabled only in periodic bursts so that the amplifier and the rest of the system do not draw power continuously. In a sensor network, active sensors (e.g., traffic imaging or weather sensors) must be turned off periodically, or turned on in response to an event such as an earthquake; the device wakes, uploads its information in a burst, and returns to sleep mode. Medical monitoring devices, which typically have relatively low sampling rates, need low-power features for periodic operation as a means of minimizing power consumption, as do hand-held portable products.

For power-sensitive, DSP-intensive system designs, the designer must look beyond the lowest static power consumption and focus on the lowest possible total power consumption, especially at high frequencies and high temperatures. Field Programmable Gate Arrays (FPGAs) achieve this through a comprehensive approach to reducing power consumption. This approach spans process technology, architecture, and configurable logic design, as well as embedded functions including SERDES, DDR2/3 and DSP blocks, and it includes special power modes that reduce power consumption even below the static level. This article focuses on how FPGA technology has developed to address the challenges of low-power, DSP-intensive system design.

FPGA Evolution

Over the past two decades, many advanced CPUs and MCUs have built in various power-saving modes to address the power challenges posed by higher frequencies and higher levels of integration in DSP-intensive designs. Only the most advanced FPGAs can provide similar low-power features while supporting higher operating frequencies. Only recently have FPGAs emerged that address the leakage issues of earlier SRAM-based solutions while also providing low-power modes for additional power savings.

In general, three components dominate the FPGA power budget: static power, dynamic power, and inrush power. All three must be managed efficiently to achieve the lowest total power consumption.

Managing these components requires inherently low leakage current, an important attribute if the FPGA is to support the power requirements of DSP-intensive designs. Flash-based FPGAs have an advantage here over FPGAs that use SRAM cells: a flash configuration cell is built with a single transistor (instead of six), and both configuration power and inrush power (during power-up) are zero. An SRAM FPGA powers up in an unconfigured state and must complete an initial power-up and reset sequence. The configuration bits start in an unknown state and must be initialized on every power-up. The result is a current surge that can produce spikes of up to several amps within a few hundred microseconds, causing a power inrush (see Figure 1).


Figure 1: A flash-based FPGA eliminates hundreds of milliwatts (mW) of power during device startup and configuration. To avoid large current spikes, SRAM FPGAs require complex power sequencing, which increases component cost and board space.

To mitigate this current spike, many SRAM FPGAs impose complex power sequencing requirements on the system. Non-volatile flash-based FPGAs, on the other hand, do not require external configuration devices for reprogrammability, which eliminates hundreds of milliwatts (mW) at device startup along with the need for external mitigation components. In some cases, flash-based FPGAs can provide up to 1,000 times lower leakage per cell than SRAM-based solutions, while offering ultra-low quiescent current and requiring no external components for mitigation.

In addition to the inherently low power requirements of flash-based FPGAs, other features can be leveraged to further reduce power consumption. Today’s flash-based FPGAs combine hard IP blocks and an FPGA on a single chip, integrating a full-featured microcontroller system, enhanced FPGA fabric, and high-speed serial and memory interfaces. Other power-conscious features include:

SERDES Enhancements: The latest FPGAs reduce power per Gbps per SERDES channel to as low as 13 mW, a reduction of up to 5x compared to other cost-optimized FPGA solutions with similar capabilities.

Integration of many different hard IP blocks and other resources in smaller devices: By including more I/O, transceivers, PCI Express endpoints, and high-performance memory subsystems, a smaller device can provide more functionality.

Embedded RAM and Math Blocks: Flash FPGAs include built-in hard RAM blocks and math blocks for intensive DSP applications. Additionally, these modules offer low power consumption at high performance levels.

Embedded processor subsystems with inherently low power consumption: Some subsystems offer multiple low-power modes, including sleep mode and deep sleep mode. Use low-power modes to quickly stop and start the FPGA fabric and associated I/O, while preserving the state of the FPGA fabric and significantly reducing power consumption. It takes about 100 microseconds to enter sleep mode and about 100 microseconds to exit. During this time, the state of the FPGA is maintained so that upon exit, the device continues to run from where it left off.

Minimize power consumption with additional tools: Users can further optimize their designs for lower power consumption by using tools that calculate power consumption curves and that perform intelligent floorplanning and power-optimized placement and routing.

All of these power-reducing features and capabilities are especially important in high-speed, DSP-intensive system designs.
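The sleep-mode figures quoted above (about 100 microseconds to enter and about 100 microseconds to exit) make it possible to estimate the average power of a bursty design. The following is a minimal sketch of that arithmetic, not a vendor model: the function name, the assumption that transitions cost as much as active operation, and all power values in the example are hypothetical.

```python
def average_power_mw(active_mw, sleep_mw, active_s, sleep_s,
                     enter_s=100e-6, exit_s=100e-6, transition_mw=None):
    """Average power over one cycle: active -> enter sleep -> sleep -> exit sleep.

    enter_s and exit_s default to the ~100 microsecond figures from the text.
    """
    if transition_mw is None:
        # Conservative assumption: transitions draw as much as active operation.
        transition_mw = active_mw
    period_s = active_s + enter_s + sleep_s + exit_s
    energy_mj = (active_mw * active_s
                 + transition_mw * (enter_s + exit_s)
                 + sleep_mw * sleep_s)  # mW * s = mJ
    return energy_mj / period_s


# Hypothetical numbers for illustration only (not vendor data):
# 100 mW while active, 1 mW asleep, one 10 ms burst per second.
avg = average_power_mw(active_mw=100.0, sleep_mw=1.0,
                       active_s=10e-3, sleep_s=989.8e-3)
print(f"average power: {avg:.2f} mW")  # about 2.01 mW
```

With these placeholder values the transition overhead contributes only 0.02 mJ per cycle, so the average is dominated by how low the sleep power is and how short the active burst can be made.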

DSP Design Challenges

DSP-intensive system designs require complex arithmetic computation, high memory bandwidth, and high-speed serial transfers with dynamic reconfiguration. These requirements consume a lot of power at high performance levels. Next-generation FPGAs must be able to meet them with the lowest possible power consumption without compromising performance. DSP system designers use many different building blocks (multipliers, memories, transceivers, etc.) in their designs, and depending on the FPGA used, the power consumption achieved by different system architectures can vary significantly.

FPGAs use hard multipliers as their basic computational unit, and these multipliers play a critical role in total power within the overall system power budget. To examine this, Microsemi studied finite impulse response (FIR) filters with different architectures and analyzed the power consumption of each filter in terms of the number of multipliers versus operating frequency.

An FIR filter is a DSP block that is often used to remove unwanted noise and improve signal quality, or to shape the signal spectrum in various applications. There are several FIR filter architectures, including transposed and systolic (each with or without symmetry). Each of these two architectures has its own characteristics in terms of total initial latency, number of DSP blocks, throughput or performance, and number of pipeline registers. The differences between the two architectures are shown in Figure 4, which shows the symmetric versions of the transposed and systolic 16-tap FIR.
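To make the symmetric variants concrete, here is a minimal behavioral sketch (plain Python, not HDL and not the implementation used in the study). It assumes an even number of taps with symmetric coefficients, h[k] == h[N-1-k], and shows how pre-adding paired samples halves the number of multiplies per output, which is why symmetry reduces the number of DSP blocks. The function names and coefficient values are illustrative.

```python
def fir_direct(x, h):
    """Reference FIR: y[n] = sum_k h[k] * x[n-k], with zero initial state."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k in range(len(h)):
            if n - k >= 0:
                acc += h[k] * x[n - k]
        y.append(acc)  # N multiplies per output sample
    return y


def fir_symmetric(x, h):
    """Same filter for symmetric h and even N, using N/2 multiplies per output."""
    n_taps = len(h)
    if n_taps % 2 or any(h[k] != h[n_taps - 1 - k] for k in range(n_taps)):
        raise ValueError("expects an even number of symmetric coefficients")

    def sample(i):
        return x[i] if i >= 0 else 0  # samples before t=0 are zero

    y = []
    for n in range(len(x)):
        acc = 0
        for k in range(n_taps // 2):
            # Pre-add the two samples that share coefficient h[k].
            acc += h[k] * (sample(n - k) + sample(n - (n_taps - 1 - k)))
        y.append(acc)
    return y


# Illustrative 16-tap symmetric coefficients (integers keep the check exact).
h16 = [1, 2, 3, 4, 5, 6, 7, 8, 8, 7, 6, 5, 4, 3, 2, 1]
x = list(range(1, 41))
assert fir_symmetric(x, h16) == fir_direct(x, h16)
print("multiplies per output:", len(h16), "direct vs", len(h16) // 2, "symmetric")
```

For the 16-tap case this means 8 multipliers instead of 16, at the cost of 8 pre-adders.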


Figure 4: Comparison of the symmetric transposed and systolic architectures for a 16-tap FIR.

To summarize the difference between the two architectures, the systolic architecture uses pipeline stages that reduce the input fan-out and so increase the operating frequency; at the same time, the initial latency of an N-tap systolic FIR is (2N - 2) cycles. In contrast, although transposed architectures operate at lower frequencies, they have better initial latency (N - 1 cycles) and use fewer sequential resources. There are other issues to consider with these architectures. One of the most important is the stability of the filter, especially when there are a large number of taps and the weighting of the taps must be considered. For example, in speech processing applications requiring echo cancellation, the weights need to be higher at the near end, where most of the echoes are present, and lower on later filter taps, where there is less echo.
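The latency figures above can be tabulated directly, and the transposed structure can be modeled in a few lines. This is a hedged behavioral sketch: the function names are illustrative, the latency function simply encodes the two formulas quoted in the text, and the simulation models only the arithmetic of a transposed FIR (one list element per register), not its clock-level timing.

```python
def initial_latency_cycles(n_taps, architecture):
    """Initial latency in cycles, using the formulas quoted in the text."""
    if architecture == "systolic":
        return 2 * n_taps - 2
    if architecture == "transposed":
        return n_taps - 1
    raise ValueError(f"unknown architecture: {architecture!r}")


def fir_transposed(x, h):
    """Behavioral model of a transposed-form FIR with zero initial state."""
    regs = [0] * (len(h) - 1)  # one register between adjacent adders
    y = []
    for sample in x:
        # The input sample is broadcast to every multiplier at once;
        # this is the high input fan-out that the systolic form pipelines away.
        products = [c * sample for c in h]
        y.append(products[0] + (regs[0] if regs else 0))
        # Each register takes its tap product plus the next register's old value.
        regs = [products[i + 1] + (regs[i + 1] if i + 1 < len(regs) else 0)
                for i in range(len(regs))]
    return y


for n in (16, 32, 64, 128):
    print(n, "taps:",
          initial_latency_cycles(n, "systolic"), "cycles systolic,",
          initial_latency_cycles(n, "transposed"), "cycles transposed")

# An impulse at the input reproduces the coefficients at the output.
assert fir_transposed([1, 0, 0, 0], [1, 2, 3]) == [1, 2, 3, 0]
```

For a 16-tap filter the formulas give 30 cycles for the systolic form and 15 for the transposed form, so the latency gap grows in proportion to the number of taps.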

FPGA power consumption can vary widely depending on the architecture used. In one study, a power estimation tool was used and actual silicon measurements were performed at various temperatures on FPGA development kits implementing 32-, 64-, and 128-tap transposed FIRs. The study showed that, when properly designed and implemented, FPGAs can significantly reduce power consumption. Furthermore, these savings are more pronounced at lower frequencies and higher temperatures. Another key finding is that for the best-performing FPGAs, power consumption scales linearly with the number of taps. By contrast, some lower-performing FPGAs show worse power consumption figures when the number of taps is low, while others show worse figures when the number of taps is high. This may be due to architectural issues.
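One simple way to check the linear-scaling observation against a set of measurements is to compare the power-per-tap slope between consecutive tap counts. The sketch below is only an illustration of that check: the function names, the 10% tolerance, and above all the power values are made-up placeholders, not results from the study.

```python
def slopes_mw_per_tap(taps, power_mw):
    """Slope of total power between consecutive tap counts, in mW per tap."""
    points = list(zip(taps, power_mw))
    return [(p1 - p0) / (t1 - t0)
            for (t0, p0), (t1, p1) in zip(points, points[1:])]


def scales_linearly(taps, power_mw, rel_tol=0.10):
    """True if every segment slope is within rel_tol of the mean slope."""
    slopes = slopes_mw_per_tap(taps, power_mw)
    mean = sum(slopes) / len(slopes)
    return all(abs(s - mean) <= rel_tol * abs(mean) for s in slopes)


taps = [32, 64, 128]
# Placeholder values for illustration only, NOT measurements from the study.
device_a_mw = [40.0, 70.0, 130.0]   # same slope on both segments
device_b_mw = [60.0, 75.0, 160.0]   # slope changes sharply at higher tap counts

print("device A linear:", scales_linearly(taps, device_a_mw))
print("device B linear:", scales_linearly(taps, device_b_mw))
```

A device that fails this check at one end of the range matches the pattern described above, where power figures degrade at either low or high tap counts.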


Figure 5: Total power values for 32-, 64-, and 128-tap FIRs on FPGAs from different vendors.

Conclusion

Today’s DSP-centric system designs are under increasing pressure to minimize power consumption in a variety of applications. By reducing total power consumption, not just static power, today’s flash-based FPGA technology plays a key role in enabling next-generation high-speed, DSP-intensive system designs that must deliver high algorithmic performance at the lowest possible power consumption in ever-shrinking form factors.
