The First Integration Test of the ATLAS End-cap Muon Level 1 Trigger System


Abstract-- A slice test system has been constructed for the ATLAS end-cap muon level-1 trigger. ATLAS is one of the four Large Hadron Collider (LHC) experiments. Although the system is built from prototype ASICs and electronics modules, the design of the trigger, readout, and control logic is the final one. The system covers about 1/300 of the total number of channels. The purpose of the slice test is to demonstrate the system design and performance in detail prior to production commitment. In this paper we discuss the validity of the logic through comparison with simulation results, latency measurements, and long-run tests.

I. INTRODUCTION

After submitting the ATLAS trigger design report for the level-1 (LVL1) muon end-cap system [1], we concentrated on developing the custom ICs to be used in the system. We have recently nearly completed prototype fabrication of the three main ASICs; four additional small ASICs are planned for the final system. Before moving to the final production phase of the main ICs, we built a slice test (SLIT) system from the developed ASICs in order to investigate the design validity and performance of the final system.

The LVL1 end-cap muon trigger system receives amplified, shaped, and discriminated signals from the Thin Gap Chambers (TGC), which detect muons in the end-cap regions on both sides of ATLAS. The system runs at the 40 MHz rate and supplies muon trigger candidates to the ATLAS muon central trigger processor (MUCTPI). The TGC measures the \( r \) (with anode wires) and \( \phi \) (with cathode strips) coordinates of a track. The present SLIT system has the same structure as is anticipated for the complete system design. Although the number of channels is kept small, the SLIT system performs all the logic necessary for the trigger decision, readout, and control operations.

To support the SLIT, we independently developed a C++ trigger simulation code. The code imitates the hardware system as closely as possible, down to the details of its logic, structure, and connections. Thus, if the simulation and the SLIT system give discrepant results, we can immediately locate the source of the inconsistency in the system.

The first integration test of the full-chain SLIT started at KEK in September 2001 with wire signal processing only. In 2002, strip signal processing was added in order to check the \( r-\phi \) coincidence. The software framework for hardware initialization and run control was also changed between 2001 and 2002.

In this report, we first give an overview of the TGC electronics system in section II. Section III describes the components (modules and ASICs) used in the SLIT system. The test results are presented in section IV, and section V summarizes the findings of the SLIT.

II. OVERVIEW OF TGC TRIGGER SYSTEM

A. Trigger schemes

The signals for the LVL1 end-cap muon trigger come from the Thin Gap Chambers (TGC), which cover \( 1.05 \leq \eta \leq 2.70 \) and are situated at around \( z = \pm 14\text{m} \) from the interaction point (IP). The chambers provide seven measurement layers on each side, grouped into three discs labeled, in order of increasing distance from the IP, M1 (triplet), M2 (doublet), and M3 (doublet). Air-core toroidal magnets in front of the TGC produce the magnetic field that bends the muon tracks. A straight line drawn from the IP to the point detected in the outermost layer gives the positions that an undeflected track would have in the inner layers; by measuring the deviation of the detected points from these positions, we can estimate the curvature of the particle track.
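The straight-line comparison can be sketched numerically as follows. This is only an illustration of the geometry, not the hardware logic: the plane positions `Z_M1` and `Z_M3` are assumed round numbers near the quoted 14 m, and the function names are hypothetical.

```python
# Hypothetical plane positions along the beam axis, in metres (assumed values).
Z_M1 = 13.0
Z_M3 = 15.0


def expected_r(r_m3: float, z_m3: float, z: float) -> float:
    """Radius at plane z of the straight line from the IP (r=0, z=0) to the M3 hit."""
    return r_m3 * z / z_m3


def deviation(r_measured: float, z: float, r_m3: float, z_m3: float) -> float:
    """Measured radius minus the straight-line (undeflected) expectation."""
    return r_measured - expected_r(r_m3, z_m3, z)


# An undeflected track lies on the line, so its deviation vanishes.
straight = deviation(expected_r(6.0, Z_M3, Z_M1), Z_M1, 6.0, Z_M3)
# A bent track is displaced at M1; the displacement grows as p_T falls.
bent = deviation(5.25, Z_M1, 6.0, Z_M3)
```

A small deviation therefore corresponds to a stiff, high-momentum track, and a deviation of exactly zero to the infinite-momentum limit.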

...
Two different lever arms, from M3 to M1 and from M3 to M2, cover different $p_T$ ranges. Low-$p_T$ muon tracks ($6 \leq p_T < 20$ GeV) are identified independently using signals from M1 alone (three layers) and from M2 and M3 in combination. The signals identified with M2 and M3 are then combined with those from M1 to identify high-$p_T$ tracks ($p_T \geq 20$ GeV). The results of the independent $r$ and $\phi$ processing are eventually unified, and muon tracks identified in an $r$-$\phi$ coincidence matrix are used for the final LVL1 decision.

B. Electronics System

Fig. 1 summarizes the structure of the TGC electronics system [2]. The electronics components are divided into on-detector and off-detector parts. The on-detector part is further separated into two: the Patch-panel and Slave (PS) board, which is installed just behind the detector, and the Hi-$p_T$ and Star Switch (HS) crate, which is installed at the outer rim of M1.

Digitized signals from the Amplifier-Shaper-Discriminator (ASD) boards [3], which are attached directly to the TGC, are fed to the Slave Board ASICs (SLB IC) after synchronization and bunch-crossing identification in the Patch-panel ASICs (PP IC). The SLB IC performs a local coincidence to identify muon tracks coming from the interaction point with $p_T \geq 6$ GeV/c, and outputs $r$, $\phi$ and $\Delta r$, $\Delta \phi$ for every muon candidate (low-$p_T$ coincidence with M1, or with M2 and M3). The PP and SLB ICs are mounted together on a PS board. The SLB output signals are fed into a Hi-$p_T$ board, which is installed in an HS crate approximately 15 m away from the corresponding PS board. The Hi-$p_T$ board contains Hi-$p_T$ ASICs (HpT IC). An HpT IC combines information from two (for doublets) to three (for triplets) SLB ICs to make a global coincidence that finds muon tracks with $p_T \geq 20$ GeV/c (Hi-$p_T$ coincidence). The HpT IC also compresses its output data for serial transmission over a distance of about 90 m.

Signals for $r$ (TGC wire hits) and for $\phi$ (strip hits) are processed in separate, independent streams up to the Hi-$p_T$ coincidence. The sector logic (SL), installed in the off-detector part, combines the two streams and makes an $r$-$\phi$ coincidence to identify muon signals in two-dimensional space. After a successful $r$-$\phi$ coincidence, at most the two highest-$p_T$ muon candidates per trigger sector (72 sectors/side) are selected, and the information is sent to the MUCTPI. The functionality and design concepts of the three main ASICs (PP, SLB and HpT) are discussed in detail in [4].
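The final selection step of the sector logic, keeping at most two candidates with the highest $p_T$, can be sketched in software as below. The `Candidate` fields and the integer `pt_level` encoding are assumptions made for illustration; the actual SL data format is not described here.

```python
import heapq
from typing import NamedTuple


class Candidate(NamedTuple):
    pt_level: int  # hypothetical encoded p_T threshold, larger means higher p_T
    r: int         # hypothetical r coordinate index
    phi: int       # hypothetical phi coordinate index


def select_candidates(candidates, max_out=2):
    """Return up to max_out candidates with the highest p_T level, highest first."""
    return heapq.nlargest(max_out, candidates, key=lambda c: c.pt_level)


# Three candidates found in one trigger sector; only two are forwarded.
sector = [Candidate(3, 10, 2), Candidate(6, 4, 7), Candidate(5, 8, 1)]
selected = select_candidates(sector)
```

If a sector holds fewer than two candidates, the function simply returns what is available.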

Since the hit information for both coordinates is used not only for the trigger decision but also as the second-coordinate information for the ATLAS muon reconstruction in offline analyses, a readout system must be implemented. Readout data are also processed in the SLB ICs, each of which implements pipeline buffers that hold data during the LVL1 processing time and a FIFO for selected events (de-randomizer). At every LVL1 accept (L1A) signal, data are serialized in the SLB ICs and sent to a data distributor/concentrator called the Star Switch (SSW). One SSW has 18 SLB IC inputs and one output. Under VME control, the SSW handles the channels in use in turn: it receives data from an SLB, stores them in a FIFO, analyzes the format, and outputs them to the Readout Driver. The basic concepts and functionality of the SSW are described in [5].

The Readout Driver (ROD) receives data from a total of 13 SSWs. The received data are stored in FIFOs, one for each input channel. Data stored in the FIFOs are merged if they carry an identical L1A identification number, and the ROD then sends the merged fragment to the ATLAS central DAQ facility. Gathering the data from the 13 input FIFO buffers into one fragment requires an onboard microprocessor. The ROD used in the present SLIT carries a 32-bit RISC chip, the Hitachi SH4. This chip arbitrates the internal bus that connects it with the input and output FIFO buffers.
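The merging of fragments that share an L1A identification number can be illustrated with the following sketch. It is not the ROD firmware or kernel program: the `(l1a_id, words)` fragment format is hypothetical, and raising an error on an ID mismatch is an assumed policy.

```python
from collections import deque


def build_event(fifos):
    """Merge the head fragments of all FIFOs if they share one L1A ID.

    Each FIFO is a deque of (l1a_id, words) tuples, a hypothetical format.
    Returns (l1a_id, merged_words), or None if some FIFO is still empty.
    """
    if any(not fifo for fifo in fifos):
        return None  # wait until every input has delivered its fragment
    ids = {fifo[0][0] for fifo in fifos}
    if len(ids) != 1:
        raise ValueError(f"L1A ID mismatch across inputs: {sorted(ids)}")
    merged = []
    for fifo in fifos:
        _, words = fifo.popleft()
        merged.extend(words)
    return ids.pop(), merged


# Two input FIFOs, each holding fragments for L1A IDs 7 and 8.
inputs = [deque([(7, [0xA1]), (8, [0xA2])]), deque([(7, [0xB1, 0xB2]), (8, [])])]
first = build_event(inputs)
second = build_event(inputs)
third = build_event(inputs)  # FIFOs are now empty
```

With 13 inputs instead of two, the same loop touches every FIFO once per event, which is why the bus access time per FIFO matters for the sustainable rate.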

The on-detector parts will be installed in a radiation-critical region, so the modules (PS board and HS crate) will suffer from single event upsets (SEU), although the total dose is not expected to be serious: it is estimated at 7 to 10 krad at most for 10 years of LHC operation at its highest luminosity. We anticipate hadronic radiation levels of $2.11 \times 10^{16}$ h/cm$^2$/10yr and $1.42 \times 10^{16}$ h/cm$^2$/10yr at a PS board and an HS crate, respectively. Since every module carries a few FPGA, CPLD or ASIC chips, we have to monitor continuously for SEUs during the experiment and, if an SEU is detected in a reconfigurable chip, restore the correct bitmap data as soon as possible. The HS crate controller (HSC) and the Crate Controller Interface (CCI) are introduced for this purpose. An HSC-CCI pair performs the VME operations for HS crate control. The CCI sits in a crate in the counting room (USA15). Its counterpart, the HSC, is a remote crate controller in the HS crate, so that the CCI can control the HS crate as if it were its own crate. A CCI always monitors its partner HSC. If the HSC reports an SEU in the electronics modules it controls, the CCI can send data with the JTAG protocol over the VME bus in order to restore the FPGA or ASIC configuration promptly [6].

LVDS serial links are used between the HpT module and the PS board, and between the SSW and the PS board (15 m). Serial data links based on the Agilent HDMP-1032/1034 chipset [7] (G-link) with optical fibers (90 m) are used between HpT and SL, between HSC and CCI, and between SSW and ROD.

III. SETUP FOR SLICE TEST

For the SLIT system, we use two PS boards for wire and strip signal processing. The numbers of channels are 256 and 192, which correspond to 1/400 and 1/250 of the total wire and strip channels, respectively.

The overall connection diagram of the SLIT system is shown in Fig. 2. The full-specification design has been applied to almost all the electronics modules and to the three main ASICs. The functions of the four other miscellaneous ASICs are performed by FPGAs in the present setup. Like the actual TGC system, the SLIT system is separated into three sections. The modules in the third and last columns of Fig. 2 are installed in two independent VME crates, while the components in the second column, namely the service board (Service Patch Panel, SPP, used for fan-out of timing signals) and the PS boards, are placed separately on a test bed and powered independently. Pulse pattern generators (PPG) and an interrupt register module (INT), which are VME modules RPV-070 and RPV-130 of REPIC Co., Ltd. [8], are used to emulate the binary ASD output signals. The INT synchronizes the output timing across 14 PPG modules. The PPG and INT modules are installed in a third VME crate, separate from the previous two, so the present SLIT setup uses three VME crates in total. The trigger output data, together with the input patterns, are stored in a VME FIFO module (GNV180 of Gnomes Design Co., Ltd. [9]) and then compared with the simulation data. For the readout test, the output data of the ROD are likewise sent to the VME FIFO module.

The ROD and PPG crates each have their own PC running Linux, while the modules in the HS crate are controlled remotely, through the HSC-CCI chain, by the PC connected to the ROD crate. A further PC, called the root PC, acts as the run control master, so the system has three PCs. We have built a software system on top of the ATLAS online software framework [10]. Because the distributed nature of the system is built into this software, the PCs for the PPG and ROD crates do not need to know which modules are installed in their own crates. The configuration database used in this application contains the relationship between every module and its crate, and so identifies the PC that has access to each module. When access to a particular module is required, the root controller finds the corresponding PC by consulting the database and asks that PC to access the module. This process is carried out by the information server embedded in the run control system, which is part of the ATLAS online software. The software framework for the SLIT system is discussed in detail in [11].
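The database lookup performed by the root controller can be pictured with a minimal sketch. It does not use the ATLAS online software API; all module, crate, and PC names and both function names are hypothetical, and only the mapping of the HS crate to the ROD-crate PC follows the description above.

```python
# Hypothetical module-to-crate and crate-to-PC tables (illustrative names only).
MODULE_TO_CRATE = {"PPG-01": "ppg-crate", "ROD-01": "rod-crate", "HPT-01": "hs-crate"}
# The HS crate has no PC of its own; it is reached from the ROD-crate PC.
CRATE_TO_PC = {"ppg-crate": "pc-ppg", "rod-crate": "pc-rod", "hs-crate": "pc-rod"}


def pc_for_module(module: str) -> str:
    """Return the PC that can reach the given module, via its crate."""
    crate = MODULE_TO_CRATE[module]
    return CRATE_TO_PC[crate]


def route_request(module: str, action: str) -> str:
    """Describe where the root controller would forward a module access."""
    return f"{pc_for_module(module)}: {action} {module}"
```

For example, `route_request("HPT-01", "init")` resolves to the ROD-crate PC, which would then reach the module through the HSC-CCI chain.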

IV. TEST RESULTS

A. Trigger Part

We have verified the trigger part of the SLIT system by injecting hit patterns and comparing the outputs with those of the simulation. In the simulation, the muon $p_T$ was set to infinity. The modules relevant to this trigger functionality check are the PS board, HpT and SL modules. All the principal ASICs (PP, SLB and HpT) are involved.

The data links between modules are a 7 m LVDS serial link between the PS board and the HpT module, and a G-link over 10 m of optical fiber between HpT and SL, whereas the actual ATLAS experiment will use a 15 m cable and a 90 m optical fiber for these links, respectively. Transmission over cables and optical fibers of various lengths was tested independently prior to the SLIT. Since we found that the signal delay in both cable and fiber was simply proportional to the length, we decided to use shorter cables in the SLIT for economic reasons, and to calculate the actual delay in the ATLAS experiment from the measured proportionality constants for the latency estimate discussed below.
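The length scaling can be written as a one-line calculation. The measured proportionality constants are not quoted in the text; the sketch below assumes 5 ns/m, the value implied by the estimated entries of Table I (75 ns for 15 m of cable and 450 ns for 90 m of fiber), and the function name is hypothetical.

```python
def scaled_delay_ns(length_m: float, ns_per_m: float = 5.0) -> float:
    """Propagation delay for a link of the given length, assuming pure proportionality.

    The default of 5 ns/m is an assumption inferred from Table I, not a quoted measurement.
    """
    return length_m * ns_per_m


# Link lengths used in the SLIT versus those foreseen in ATLAS.
slit_links = {"LVDS 7 m": scaled_delay_ns(7), "fiber 10 m": scaled_delay_ns(10)}
atlas_links = {"LVDS 15 m": scaled_delay_ns(15), "fiber 90 m": scaled_delay_ns(90)}
```

Under this assumption the two ATLAS links contribute 75 ns and 450 ns, the values entered in Table I, while the shorter SLIT links would contribute 35 ns and 50 ns.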

We supplied hit patterns generated by the VME PPG modules at a 40 MHz external trigger rate, and found no discrepancy in the output comparison for more than 15000 patterns in each of the one-, two-, and ≥3-track cases. We repeated the comparison continuously for more than one hour under the normal 40 MHz trigger condition and found no problem. The latencies measured in the main components are listed in Table I. The total latency can be calculated from the measured values listed in this table together with the estimated cable delays. The latency from the MUCTPI to the individual sub-detector front-ends is estimated in [1] as 800 ns. Since the latency up to the MUCTPI (namely without the central trigger processing and propagation time) is estimated as 1.21 µs, the total latency is estimated as 2.01 µs. These values are below the budgets of 1.25 and 2.05 µs pre-assigned in the TDR.

<table>
<thead>
<tr>
<th>Components</th>
<th>Measured Delay (ns)</th>
<th>Benchmark (TDR) (ns)</th>
</tr>
</thead>
<tbody>
<tr>
<td>TGC,ASD*</td>
<td>160</td>
<td>175</td>
</tr>
<tr>
<td>PP IC</td>
<td>43</td>
<td>50</td>
</tr>
<tr>
<td>SLB IC</td>
<td>49</td>
<td>75</td>
</tr>
<tr>
<td>LVDS Tx,Rx</td>
<td>83</td>
<td>75</td>
</tr>
<tr>
<td>Cable (15m)*</td>
<td>75</td>
<td>75</td>
</tr>
<tr>
<td>HpT IC</td>
<td>55</td>
<td>75</td>
</tr>
<tr>
<td>G-link Tx,Rx</td>
<td>105</td>
<td>75</td>
</tr>
<tr>
<td>Fiber (90m)*</td>
<td>450</td>
<td>450</td>
</tr>
<tr>
<td>SL</td>
<td>160</td>
<td>175</td>
</tr>
<tr>
<td>Cable (5m)*</td>
<td>25</td>
<td>25</td>
</tr>
<tr>
<td>Total</td>
<td>1205</td>
<td>1250</td>
</tr>
</tbody>
</table>

Values marked with * are estimates.
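As a cross-check of the totals, the table entries can be summed directly. The numbers below are copied from Table I and the 800 ns figure is the post-MUCTPI latency quoted from [1]; only the variable names are introduced here.

```python
# (measured or estimated delay, TDR benchmark) in ns, copied from Table I.
TABLE_I = {
    "TGC, ASD": (160, 175),
    "PP IC": (43, 50),
    "SLB IC": (49, 75),
    "LVDS Tx, Rx": (83, 75),
    "Cable (15 m)": (75, 75),
    "HpT IC": (55, 75),
    "G-link Tx, Rx": (105, 75),
    "Fiber (90 m)": (450, 450),
    "SL": (160, 175),
    "Cable (5 m)": (25, 25),
}

up_to_muctpi_ns = sum(measured for measured, _ in TABLE_I.values())
tdr_budget_ns = sum(benchmark for _, benchmark in TABLE_I.values())
# Add the 800 ns from the MUCTPI to the sub-detector front-ends.
total_ns = up_to_muctpi_ns + 800
```

The sums give 1205 ns up to the MUCTPI against a benchmark of 1250 ns, and 2005 ns in total, consistent with the rounded values of 1.21 µs and 2.01 µs. Note that the LVDS and G-link transceivers individually exceed their benchmarks, while the ASICs and the SL come in under theirs.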

B. Readout Part

The electronics components relevant to the readout test are the PS board (PP and SLB ASICs), SSW and ROD modules. The data links are again a 7 m LVDS serial link from SLB to SSW and a G-link over 10 m of optical fiber from SSW to ROD. The same input patterns as for the trigger test were used for the readout test.

The goals of the readout test are to check that the data read out by the ROD match the input data loaded into the PPG, to measure the bandwidth of the individual modules, and to verify the functionality of the modules.

The SSW was delivered to us in summer 2002. In 2001 we carried out an integrated readout test from the PS board to the ROD using an alternative module in place of the SSW. We have not yet repeated this integrated test with the new SSW, although stand-alone testing and debugging of the SSW have been completed.

Since the SSW needs at least eight LHC clock periods (at 40 MHz) for the whole sequential process of one channel, and one SSW has 18 input channels, 144 clocks/event are needed if all channels contain no-hit data. About 160 clocks are needed if the occupancy is about 4%, although the actual occupancy is estimated to be at most 1%. The LVL1 trigger rate is estimated as 100 kHz, so 400 clocks (40 MHz divided by 100 kHz) are available for the readout of one L1A event. The SSW will not be a bottleneck unless the occupancy accidentally exceeds 20% (the SLB spends 218 clocks on the data readout of one L1A).
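The clock budget can be checked with a few lines of arithmetic. At a 40 MHz clock and a 100 kHz L1A rate, the mean interval between accepts is 40 MHz / 100 kHz = 400 clock periods; the constant and function names below are introduced only for this sketch.

```python
CLOCK_HZ = 40_000_000      # LHC bunch-crossing clock
L1A_RATE_HZ = 100_000      # estimated LVL1 trigger rate
CLOCKS_PER_CHANNEL = 8     # minimum SSW processing time per channel
CHANNELS_PER_SSW = 18      # SLB inputs per SSW

# Mean number of clock periods between two L1A signals.
clocks_available = CLOCK_HZ // L1A_RATE_HZ
# Minimum SSW processing time per event, when no channel has hits.
clocks_min = CLOCKS_PER_CHANNEL * CHANNELS_PER_SSW


def fits_budget(clocks_needed: int) -> bool:
    """True if the processing time fits within the mean L1A interval."""
    return clocks_needed <= clocks_available
```

The no-hit case (144 clocks), the 4% occupancy case (about 160 clocks), and the SLB readout (218 clocks) all fit within the 400-clock interval.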

For the ROD, we have developed the kernel program for the onboard RISC chip. The kernel program reads the data stored in the 13 input FIFOs from the SSWs and in two further input FIFOs that carry L1A timing information, and merges and sorts them according to the bunch crossing ID. The dataset stored in an input FIFO has a header and a trailer. The kernel program removes these words and extracts the relevant information before rearrangement. Finally it pushes the data to the readout FIFO and sends them on to the VME FIFO. The ROD internal bus connects the input and readout FIFOs, the SH4 processor, and its ancillary SDRAM, which is used as a cache during data reformatting. The bandwidth of this bus is critical for operation at a 100 kHz L1A rate. We measured this bandwidth in a stand-alone benchmark test. The bus access time was measured as 200 ns for four bytes (the data length for no hit) per FIFO. The total access time for the 13 plus two input FIFOs and the readout FIFO, including the SH4 software access time, is then estimated as 14 µs (no hit at all) to 30 µs (hits in all channels), since an input FIFO stores a 40-byte data fragment if the corresponding input channel has hit information. Processing cannot keep up with a 100 kHz L1A rate if the total access time per event exceeds 10 µs.
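The relation between the per-event access time and the sustainable L1A rate is a simple reciprocal. The sketch below assumes that events are processed strictly one at a time, with no overlap between consecutive events; the function name is hypothetical.

```python
def max_l1a_rate_khz(time_per_event_us: float) -> float:
    """Highest sustainable event rate in kHz for a given per-event time in µs.

    Assumes strictly sequential processing, one event at a time.
    """
    return 1000.0 / time_per_event_us


rate_no_hit = max_l1a_rate_khz(14.0)    # all channels empty
rate_all_hit = max_l1a_rate_khz(30.0)   # hits in all channels
rate_required = max_l1a_rate_khz(10.0)  # time limit for 100 kHz operation
```

Under this assumption, 14 µs per event corresponds to about 71 kHz and 30 µs to about 33 kHz, both below the 100 kHz that a 10 µs limit would allow.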

V. SUMMARY

For two years we have carried out slice tests of the TGC electronics system. To enable the tests, we also developed various software components: the trigger simulation, the GUI-based run control and initialization software, and various database modules, some of which could not be discussed in detail in this report.

We have confirmed the validity of the trigger logic independently for the r (wire) and φ (strip) parts, and also for the r-φ coincidence implemented in the sector logic. We have measured the latency for the level-1 trigger signal generation and found it (1.21 µs) to be shorter than the value estimated in [1] (1.25 µs).

While the individual modules have been tested and their performance estimated, the fully integrated test of the readout part has not been finished.

Among the individual modules, the SSW worked well, but we found a deficiency in the design of the ROD. When all 13 input FIFOs are used and all channels contain data, about 30 µs is needed to process the data of one L1A in the ROD. The ROD can therefore process data at only about 30 kHz instead of the nominal 100 kHz, unless we increase the bandwidth of the ROD internal bus or add more ROD modules to share the task.

VI. ACKNOWLEDGMENT

This work was done within the framework of the Japanese contribution to the ATLAS experiment. We are grateful to all the members of the Japanese ATLAS TGC construction group and the KEK electronics group. We would like to express our gratitude to Prof. Taka Kondo of KEK for his support and encouragement.

VII. REFERENCES

Fig. 1 Block Diagram of TGC electronics system
Fig. 2 Slice test setup for TGC electronics