We collected transcripts from CNN and the American Presidency Project at UCSB, categorized them by hand, then ranked lemmatized word-phrases (or n-grams) by how often they were used. A word-phrase can contain up to five words. Our ranking algorithm accounts for exclusive word-phrases: it won't count "United States" a second time when those words appear inside a longer n-gram such as "President of the United States."
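
The exclusive counting described above could be implemented along the following lines in Python. This is a minimal sketch under stated assumptions, not the actual ranking code: the names `tokenize`, `exclusive_ngram_counts`, and `min_count` are hypothetical, the tokenizer only lowercases and splits words as a stand-in for real lemmatization, and the rule that a phrase must occur at least `min_count` times before it can absorb shorter phrases is an assumption made for illustration.

```python
import re
from collections import Counter

MAX_N = 5  # word-phrases contain at most five words


def tokenize(text):
    # Stand-in for real lemmatization (assumption): lowercase and split into words.
    return re.findall(r"[a-z']+", text.lower())


def exclusive_ngram_counts(tokens, max_n=MAX_N, min_count=2):
    # Raw frequency of every n-gram of length 1..max_n.
    raw = Counter(
        tuple(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    )

    counts = Counter()
    claimed_end = {}  # start index -> furthest end of a counted span starting there

    # Longest phrases first, so they claim their spans before shorter ones are checked.
    for n in range(max_n, 0, -1):
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            if raw[gram] < min_count:  # hypothetical threshold for a "real" phrase
                continue
            end = i + n
            # Skip this occurrence if an already counted span fully contains it.
            inside = any(
                claimed_end.get(start, 0) >= end
                for start in range(max(0, i - max_n + 1), i + 1)
            )
            if inside:
                continue
            counts[gram] += 1
            claimed_end[i] = max(claimed_end.get(i, 0), end)
    return counts


sample = (
    "President of the United States spoke. "
    "The President of the United States left. "
    "The United States is large. United States allies."
)
counts = exclusive_ngram_counts(tokenize(sample))
for gram, freq in counts.items():
    print(" ".join(gram), freq)
```

In this sample, "United States" appears four times in the raw text, but three of those appearances sit inside a longer counted phrase ("President of the United States" or "the United States"), so only the standalone one is credited to the two-word phrase.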