“A Configurable Radiation Tolerant Dual-Ported Static RAM macro, designed in a 0.25 µm CMOS technology for applications in the LHC environment.”

8th Workshop on Electronics for LHC Experiments
9-13 Sept. 2002, Colmar, France

K. Kloukinas, G. Magazzu, A. Marchioro
CERN EP division, 1211 Geneva 23, Switzerland
Overview

- Motive of Work
- Description of the macro-cell design
- Experimental Results
- Conclusions
Motive of Work

Several Front-End ASICs for the LHC detectors are using the CERN DSM Design Kit in 0.25 μm commercial CMOS technology.

Many ASICs require the use of rather large memories in Readout Pipelines, Readout Buffers and FIFOs.

CERN DSM Design Kit lacks design automation tools for generating customized SRAM blocks.
Proposed Design

Build an SRAM macro-cell whose *word count* and *bit organization* can be configured by means of simple floorplanning procedures.

Initially designed for the needs of the “Kchip” Front-End ASIC used in the CMS ECAL Preshower detector.
CERN-SRAM specifications

**Scalable Design**
- Configurable Bit organization (n x 9-bit).
- Configurable Memory Size (128 – 4Kwords).

**Synchronous Dual-Port Operation**
- Permits Read/Write operations on the same clock cycle.
- Typical Operating Frequency: 40 MHz.

**Low Power Design**
- Full Static Operation.
- Divided Wordline Decoding.

**Radiation Tolerant Design**
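The configuration limits in the specifications above can be captured in a small check. This is only an illustrative sketch: the class name `SramConfig` and the rule that the word count grows in whole 128-word steps are assumptions, not part of the design kit.

```python
from dataclasses import dataclass

SLICE_BITS = 9     # from the specifications: bit organization is n x 9-bit
MIN_WORDS = 128    # from the specifications: smallest memory size
MAX_WORDS = 4096   # from the specifications: largest memory size (4 Kwords)
WORD_STEP = 128    # assumption: sizes grow in whole 128-word columns


@dataclass(frozen=True)
class SramConfig:
    """Hypothetical description of one macro-cell configuration."""
    words: int
    bits: int

    def __post_init__(self):
        # Word width must be a positive multiple of the 9-bit slice.
        if self.bits <= 0 or self.bits % SLICE_BITS != 0:
            raise ValueError("word width must be a multiple of 9 bits")
        # Word count must lie inside the specified range.
        if not MIN_WORDS <= self.words <= MAX_WORDS:
            raise ValueError("word count must be between 128 and 4096")
        # Assumed granularity of the word count (see WORD_STEP above).
        if self.words % WORD_STEP != 0:
            raise ValueError("word count must be a multiple of 128")

    @property
    def slices(self) -> int:
        """Number of 9-bit slices (the n in n x 9-bit)."""
        return self.bits // SLICE_BITS

    @property
    def total_bits(self) -> int:
        return self.words * self.bits
```

For example, `SramConfig(words=4096, bits=9)` describes the larger of the two test chips, while a 10-bit word width is rejected.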
Memory Cell

To minimize the macro-cell area a Single Port memory cell is used based on a conventional cross-coupled inverter scheme.

Gain in Memory Cell Layout Area = 18%
Memory Cell Design

Single Port memory cell
Interconnect: 3 metal layers
- 1st for local interconnects
- 2nd for vertical bitlines and power lines
- 3rd for horizontal wordlines
Memory Cell Area: 47.152 µm²
Dual-port functionality is realized with a time sharing access mechanism.

- Registered Inputs
- Latched Outputs
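A behavioural sketch of the time-sharing idea: a single-port array serves both ports within one clock cycle by performing two accesses back to back. The class name, the method signature, and in particular the read-before-write ordering are assumptions made for illustration; the slides do not state which access comes first.

```python
class TimeSharedDualPortSram:
    """Behavioural model only; not a description of the actual circuit."""

    def __init__(self, words: int, bits: int = 9):
        self.words = words
        self.mask = (1 << bits) - 1
        self.cells = [0] * words   # single-port cell array
        self.dout = 0              # latched output, holds its value between reads

    def _check(self, address: int) -> None:
        if not 0 <= address < self.words:
            raise IndexError("address out of range")

    def clock(self, *, r=False, ra=0, w=False, wa=0, din=0) -> int:
        """One clock cycle; all inputs are sampled together (registered inputs)."""
        # Phase 1 (assumed order): read access through the single port.
        if r:
            self._check(ra)
            self.dout = self.cells[ra]
        # Phase 2 (assumed order): write access through the same port.
        if w:
            self._check(wa)
            self.cells[wa] = din & self.mask
        return self.dout
```

Under the assumed ordering, a simultaneous read and write to the same address returns the old contents; the opposite ordering would return the new data.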
SRAM Interface Timing

[Timing diagram: WRITE, READ and READ/WRITE cycles. Signals: Clk, WA, RA, W, R, Din, Dout. Annotated timings: \( t_s \), \( t_H \), \( t_{acc} \).]
SRAM macro-cell Design

Address Decoding
Address Mux Register

- Leaf cell is based on the D-F/F and the 2-input Mux standard cells found in the CERN DSM Design Kit.
- True & Complementary output with balanced timing.
- Easily sizeable by abutting the necessary number of leaf cells.
Row Decoder

- Decoder: 7 to 128
  - Hardwire-configured.
  - Pre-routed layout block.

- Dynamic NAND-type.
  - Speed, Area, Power advantages over the static NAND-type.
  - Latched output.
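As a purely functional illustration of the 7-to-128 decode (it says nothing about the dynamic NAND circuit itself), the decoder maps a 7-bit row address to a one-hot vector of 128 wordline selects. The function name is hypothetical.

```python
def decode_row(address: int, address_bits: int = 7) -> list[int]:
    """Return a one-hot list with 2**address_bits entries."""
    lines = 1 << address_bits
    if not 0 <= address < lines:
        raise ValueError("row address out of range")
    # Exactly one output is active: the one whose index equals the address.
    return [1 if i == address else 0 for i in range(lines)]
```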
Column Decoder

Static NAND-type implementation
- Column decoding is one of the last actions to be performed in the read sequence.
- It can be executed in parallel with other functions, and can be performed as soon as address is available.
- Its propagation delay does not add to the overall memory access time.

Size Configurable
- Make use of Design kit standard cells.
- Decoding function is via-hole programmable.
Divided Wordline Decoding

- Reduced Power Consumption.
  - The non-accessed portions of the memory remain in the precharge state.
- Improved Wordline Selection Time.
  - The RC delay of each divided wordline is small because of its short length.
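The following sketch shows how an address could be split so that only one block sees an active local wordline. The organization (blocks of 512 words, each made of 4 columns of 128 rows) is taken from the prototype descriptions in this talk; the bit ordering of the fields and the function names are assumptions.

```python
ROW_BITS = 7          # 128 rows per column
COLUMN_BITS = 2       # 4 columns per block
WORDS_PER_BLOCK = 1 << (ROW_BITS + COLUMN_BITS)   # 512 words


def split_address(address: int, words: int) -> tuple[int, int, int]:
    """Split an address into (block, column, row); field order is assumed."""
    if words % WORDS_PER_BLOCK != 0:
        raise ValueError("this sketch only handles whole 512-word blocks")
    if not 0 <= address < words:
        raise ValueError("address out of range")
    row = address & ((1 << ROW_BITS) - 1)
    column = (address >> ROW_BITS) & ((1 << COLUMN_BITS) - 1)
    block = address >> (ROW_BITS + COLUMN_BITS)
    return block, column, row


def local_wordlines(address: int, words: int) -> list[list[int]]:
    """Local wordline = global wordline AND block select, for every block."""
    block, _column, row = split_address(address, words)
    blocks = words // WORDS_PER_BLOCK
    return [
        [1 if (b == block and r == row) else 0 for r in range(1 << ROW_BITS)]
        for b in range(blocks)
    ]
```

In this model a 4 Kword memory has 8 blocks, and for any address exactly one of the 8 x 128 local wordlines is active while the others stay in the precharge state.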
Divided Wordline Decoding

[Block diagram. Labels: Data In, Write Drivers, SRAM Array, Word-line Buffers, Address, Read Logic, Block Pre-Dec. → Block Select signals, Global Word-line → Local Word-line, Data Out.]
Data Path
Data Input Output Ports

Data Input Register
- Leaf cell is based on the D-F/F standard cell from CERN DSM Design Kit.
- True & Complementary output with balanced timing.

Data Output Latch
- Leaf cell is based on the Latch standard cell from CERN DSM Design Kit.

Easily sizeable by abutting the necessary number of leaf cells.
SRAM Data Path

[Schematic and waveforms. Labels: Data in, Clk, Write enable, Bit Line (BL) precharge/evaluate, Word Line precharge/evaluate, Read Logic, Write Drivers, Latch, Data out.]
Substitution of the conventional sense amplifier with an asymmetric inverter.

- Reduced Power Consumption
- Stable operation at low power supply voltages.
- Acceptable performance for target applications.
- Easy to design.
Replica Techniques

---

**Scalability**
- Wordline select time depends on the size of the memory.
- Dummy Wordline with replica memory cells to track the wordline charge-discharge time.

**Bitline Timing**
- Dummy Bitlines to mimic the delay of the bitline path over all conditions.
Asynchronous internal timing of control signals.
Static operation.
Hand-shaking and transition detection to realize internal timing loops.
Timing loops are initiated by the system clock and terminated upon completion of the operation.
All control signals are forced back to their initial state to prepare for subsequent tasks.
During standby periods, bitline and wordline precharge-evaluate cycles are not initiated, thus keeping the Power Consumption to a minimum.
Operation and Timing

[Timing diagram: READ and WRITE operations relative to Clk.]
Cell Library

Size Configurable

Data Input Register
Address Mux Register
Column Decoder
Block Pre-Decoder
Output Data Latch

Row Decoder

WordLine Buffers

SRAM column, 128 x 9bits
(50.4 µm x 1086.2 µm)

Fixed Layout

Timing logic
Floorplanning

[Floorplan figure. Labels: Column Decoder, Row Decoder (128 rows), Block, Read logic and Write drv. (repeated), DIn, DIn Reg, PreDecoder, Address Reg, Timing, CLK, W, R, WA, RA, DOut Latch, DOut, vdd, gnd.]
Digital Simulation

- SRAM Verilog module (parameterized)
- Digital Simulator: VERILOG
Logic Synthesis

[Flow diagram: Template SRAM timing + Design Kit .lib file → Combined .lib file → compilation → Combined .db file → Logic Synthesis Tool (SYNOPSYS).]
Place & Route

[Flow diagram: Template SRAM timing + Design Kit TLF file → Combined TLF file → compilation → Combined CTLF file; Layout view → Abstract view → LEF file; Place & Route Tool (Silicon Ensemble).]
Experimental Results

To prove the scalability of the SRAM macro-cell and to evaluate the performance of the proposed design, we have fabricated two test chips:
- 1 Kword x 9 bits
- 4 Kwords x 9 bits

Both chips were tested and found functional.
Submitted SRAM Chips

1st Prototype
Design: CERN_SRAM_1K
Configuration: 1K x 9 bit
Size: ~560µm x 1,300µm
Area: ~0.73mm²
Density: ~12.6Kbit/mm²

The memory consists of 2 blocks of 512 x 9 bits.
Each block is composed of 4 columns of 128 x 9 bits.
Submitted SRAM Chips

2nd Prototype
Design: CERN_SRAM_4K
Configuration: 4K x 9 bit
Size: ~1,850µm x 1,300µm
Area: ~2.4mm²
Density: ~15.4Kbit/mm²

The memory consists of 8 blocks of 512 x 9 bits.
Each block is composed of 4 columns of 128 x 9 bits.
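The area and density figures quoted for the two prototypes can be cross-checked from their dimensions. The sketch below assumes that "Kbit" in the density figure means 1000 bits and that the quoted sizes are the full macro outline; the function name is hypothetical.

```python
def area_and_density(words: int, bits: int, width_um: float, height_um: float):
    """Return (area in mm^2, density in kbit/mm^2) for a rectangular macro."""
    area_mm2 = width_um * height_um / 1e6      # 1 mm^2 = 1e6 um^2
    kbits = words * bits / 1000.0              # assumption: 1 Kbit = 1000 bits
    return area_mm2, kbits / area_mm2


# Dimensions as quoted on the prototype slides.
area_1k, density_1k = area_and_density(1024, 9, 560, 1300)
area_4k, density_4k = area_and_density(4096, 9, 1850, 1300)
```

The results land close to the rounded values on the slides; the density improves with size because the fixed-size periphery is shared by more blocks.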
Functional tests

- Max operating frequency:
  - Simultaneous Read/Write operations: 70MHz @ 2.5V
- Read access time: 7.5ns @ 2.5V
- Power dissipation:
  - 15µW / MHz @ 2.5V for simultaneous Read/Write operations on the same clock cycle (0.60mW @ 40MHz).

- Tests for process variations:
  - Differences in the access time < 1ns for: -3σ, -1.5σ, typ, +1.5σ, +3σ
Performance Tests

- Test Chip: 4Kword X 9bits
- Operation Frequency: 50MHz
- Power Supply: 2.5Volts
- Read Access Time: 7.5nsec
Performance Tests

- **Test Chip:** 4Kword X 9bits
- **Power Supply:** 2.0 - 2.7Volts
- **Operation Frequency:** 50MHz
- **Test Patterns:**
  - All 1’s and all 0’s
  - Checkerboard
  - Marching 1’s
  - Marching 0’s
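The slides do not define the test patterns, so the following is a sketch under common textbook definitions: a logical checkerboard that alternates bits within a word and between consecutive addresses, and a simple marching test that overwrites a uniform background cell by cell. Function names are hypothetical, and a physical checkerboard would additionally depend on the cell topology, which is ignored here.

```python
def checkerboard(words: int, bits: int = 9, invert: bool = False) -> list[int]:
    """Alternating bit pattern, inverted on every other address."""
    mask = (1 << bits) - 1
    even = sum(1 << b for b in range(0, bits, 2))   # 0b101010101 for 9 bits
    odd = even ^ mask                               # 0b010101010 for 9 bits
    if invert:
        even, odd = odd, even
    return [even if a % 2 == 0 else odd for a in range(words)]


def marching_test(memory: list[int], bits: int = 9, march_value: int = 1) -> bool:
    """Marching 1's (march_value=1) or marching 0's (march_value=0).

    Returns True if every read returns the expected data.
    """
    mask = (1 << bits) - 1
    new = mask if march_value else 0
    background = new ^ mask
    # Step 1: write the background to every address.
    for a in range(len(memory)):
        memory[a] = background
    # Step 2: march upward, checking the background and writing the new value.
    for a in range(len(memory)):
        if memory[a] != background:
            return False
        memory[a] = new
    # Step 3: final read-back of the new value.
    return all(word == new for word in memory)
```

Here `memory` is a plain Python list standing in for the device under test; on real hardware each access would go through the tester interface.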

**Shmoo Plot**

[Shmoo plot: Access Time (nsec) vs. Power Supply Voltage (V), with the Pass region marked.]
Power dissipation

Power dissipation of macro-cell.

Test chip: 4Kwords x 9bits

<table>
<thead>
<tr>
<th>Operation</th>
<th>Power (µW/MHz)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Standby</td>
<td>0.10</td>
</tr>
<tr>
<td>Idle</td>
<td>1.90</td>
</tr>
<tr>
<td>Read</td>
<td>7.40</td>
</tr>
<tr>
<td>Write</td>
<td>10.60</td>
</tr>
<tr>
<td>Read/Write</td>
<td>14.05</td>
</tr>
</tbody>
</table>

Test Conditions

<table>
<thead>
<tr>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Standby</td>
<td>No operation, addr. &amp; data static.</td>
</tr>
<tr>
<td>Idle</td>
<td>No operation, addr. &amp; data changing in every clk cycle</td>
</tr>
<tr>
<td>Read</td>
<td>checkerboard data pattern</td>
</tr>
<tr>
<td>Write</td>
<td>checkerboard data pattern</td>
</tr>
<tr>
<td>Read/Write</td>
<td>checkerboard data pattern</td>
</tr>
</tbody>
</table>
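Since the table gives power normalized to the clock frequency, the dissipation at a given frequency follows by multiplication. A minimal sketch using the table values; the dictionary and function names are illustrative only.

```python
# Values copied from the power dissipation table (uW per MHz, 4Kwords x 9bits chip).
POWER_UW_PER_MHZ = {
    "Standby": 0.10,
    "Idle": 1.90,
    "Read": 7.40,
    "Write": 10.60,
    "Read/Write": 14.05,
}


def power_mw(operation: str, frequency_mhz: float) -> float:
    """Dissipated power in mW for one operation mode at the given clock frequency."""
    return POWER_UW_PER_MHZ[operation] * frequency_mhz / 1000.0


# Example: all modes at the typical LHC operating frequency of 40 MHz.
at_40mhz = {op: power_mw(op, 40.0) for op in POWER_UW_PER_MHZ}
```

At 40 MHz the Read/Write entry gives about 0.56 mW, in line with the rounded figure of 15 µW/MHz (0.60 mW) quoted in the functional tests.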
Irradiation Tests

Ionizing Total Dose

- Conditions
  - Source: X-rays.
  - Step Irradiation: 1 Mrad, 5 Mrad, 10 Mrad.
  - Constant dose rate: 21.2 krad/min.
  - Annealing: 24h @ ~25 °C.
  - Under bias, in Standby mode during irradiation & annealing.

- Results
  - No increase in power dissipation.
  - No measurable degradation in performance.
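For orientation, the beam time implied by the quoted dose rate can be estimated as below. The sketch assumes the three steps are cumulative totals reached at the constant rate, which the slide does not state explicitly; the names are illustrative.

```python
DOSE_RATE_KRAD_PER_MIN = 21.2        # from the irradiation conditions
DOSE_STEPS_MRAD = (1, 5, 10)         # assumed to be cumulative totals


def exposure_minutes(total_dose_mrad: float,
                     rate_krad_per_min: float = DOSE_RATE_KRAD_PER_MIN) -> float:
    """Beam-on time in minutes needed to accumulate the given total dose."""
    return total_dose_mrad * 1000.0 / rate_krad_per_min


exposure = {dose: exposure_minutes(dose) for dose in DOSE_STEPS_MRAD}
```

Under these assumptions the first step takes a little over 47 minutes and the full 10 Mrad just under 8 hours of beam time.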

Single Event Upset:

- Under preparation

Test chip: 4Kwords x 9bit
CERN SRAM popularity!

- **ATLAS MCC chip**
  - Memory configuration: 128 x 27bit
  - Detector: ATLAS PIXEL
  - Lab: INFN Genova

- **ALICE AMBRA chip**
  - Memory configuration: 16K x 9 bits
  - Detector: ALICE Silicon Drift Det.
  - Lab: INFN Torino

- **ALICE CARLOS chip**
  - Memory configuration: 256 x 9 bits
  - Detector: ALICE Silicon Drift Det.
  - Lab: INFN Bologna

- **LHCb SYNC chip**
  - Memory configuration: 256 x 9 bits
  - Detector: LHCb muon system
  - Lab: INFN Cagliari

- **ATLAS SCAC chip**
  - Memory configuration: 128 x 18bit
  - Detector: ATLAS tracker
  - Lab: NEVIS Labs

- **ATLAS DTMROC chip**
  - Memory configuration: 128 x 153 bits
  - Detector: ATLAS TRT
  - Lab: CERN

- **CMS Kchip**
  - Memory configuration: 2K x 18 bits
  - Detector: CMS Preshower
  - Lab: CERN

Chips submitted and tested
Design Support

- Delivery of SRAM design library
- Half a day “design course” @ CERN
- The designer configures the macrocell
- Review of the macrocell design
Conclusions

Design Status
- Design meets target specifications.
- Macrocell has been successfully used in a number of ASIC designs.

Future Plans
- No further development is foreseen.

Design Support
- Contact Person: Kostas.Kloukinas@CERN.ch

Information on the Web
- http://home.cern.ch/kkloukin
Floorplanning

[Floorplan figures (three variants sharing the same labels). Labels: Row Decoder (128 rows), Column, Block, Write drv, DIn Reg, WA, DIn, RA, DOut Latch, Address Reg, Timing, CLK, W, R, DOut, WordLine Buffers & Block Decoders, vdd, gnd.]