• The Air Traffic Control Corpus (ATC)

    (Held by LDC) The ATC Corpus consists of seventy hours of recorded conversation between controllers and aircraft at three major airports in the United States. It is divided into three subcorpora, one for each airport.

    Each subcorpus consists of 20-25 hours of data, recorded continuously with no silence removed. The speech files are fully transcribed, with time marks indicating the beginning and end of each transmission.
