Red-Green-Code

Deliberate practice techniques for software developers


CPFAQ: Defining Competitive Programming Terms

By Duncan Smith Oct 10


I’m working on a project this year to build a competitive programming FAQ. This is one in a series of articles describing the research, writing, and tool creation process. To read the whole series, see my CPFAQ category page.

It would be useful to have a page in the FAQ for a glossary of competitive programming terms. The Q&A part of the FAQ and the associated wiki discuss terms in detail, but a glossary provides an easy way to look up short definitions of terms that appear in questions and answers. This week, I started to collect a list of terms.

Collecting Terms

One approach to building a competitive programming glossary is to look at the words that people use most often when discussing the topic, filter out simple words that everyone already knows, and define the rest. As a source of words that people are using, I used my master list of question titles. Terms appear in the question title, the question body, and the answer body, but using only the titles reduces the quantity of text to analyze, and a term that appears in a title is likely to be important to the question.

So I started with a text file containing my current question list. I wrote a simple console app to read it line by line and extract the words as follows:

  • Split the line at each space, and also at each period, question mark, comma, each of the Unicode characters covered last week, and quite a few others. This is a quick and dirty way to extract individual words while ignoring non-essential punctuation.
  • Convert each word to lowercase and trim leading and trailing whitespace.
  • Add each word to a (string, int) dictionary, where string is the word and int is a count of how many times it appears in the question list.
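The three steps above can be sketched in a few lines. This is a minimal illustration in Python, not the console app itself: the delimiter set, the function name count_words, and the sample titles are all assumptions made for the example, since the exact list of split characters isn't given here.

```python
import re
from collections import Counter

# Assumed delimiter set: whitespace plus some common punctuation.
# The real app splits on more characters than this.
DELIMITERS = re.compile(r'[\s.?,!:;()\[\]"“”/]+')

def count_words(lines):
    """Return a Counter mapping each lowercase word to how often it appears."""
    counts = Counter()
    for line in lines:
        for word in DELIMITERS.split(line):
            word = word.strip().lower()
            if word:  # skip empty strings left by leading/trailing delimiters
                counts[word] += 1
    return counts

# Made-up sample titles, for illustration only.
titles = [
    "What is competitive programming?",
    "How do I get started with competitive programming, exactly?",
]
counts = count_words(titles)
print(counts.most_common(3))
```

A Counter plays the role of the (string, int) dictionary, and most_common gives a quick look at the most frequent words.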

I then made a pass through the dictionary and merged singular and plural forms of words, using a simplistic process: for each word in the dictionary, append an s and check whether that new word also appears in the dictionary. If it does, treat the two as one word and combine their counts. This algorithm is far from perfect, but it helps consolidate words that refer to the same thing (e.g., programmer and programmers).
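Here is a rough Python sketch of that merge pass. The function name merge_plurals and the sample counts are hypothetical, and the sketch assumes the merged count is stored under the singular form.

```python
def merge_plurals(counts):
    """Fold the count of each 'word + s' entry into its singular entry."""
    merged = dict(counts)
    for word in counts:
        plural = word + "s"
        # Merge only when both forms are still present.
        if word in merged and plural in merged:
            merged[word] += merged.pop(plural)
    return merged

# Made-up counts, for illustration only.
sample = {"programmer": 3, "programmers": 5, "contest": 4, "contests": 2, "class": 1}
merged = merge_plurals(sample)
print(merged)
```

As expected from such a simple rule, it misses irregular plurals and plurals ending in es, and it would wrongly merge unrelated pairs such as new and news.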

The proper way to extract words from text is to use something like the Natural Language Toolkit, but the process I used is easy to implement and good enough to give me a list of candidate words for the glossary.
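Turning the word counts into a candidate list means dropping the simple words and ranking what remains. The sketch below shows one way that could look in Python. The COMMON_WORDS set, the min_count threshold, and the sample counts are all assumptions for the example, not part of the process described above.

```python
# Hypothetical stop-word list; a real one would be much longer.
COMMON_WORDS = {"a", "an", "the", "is", "in", "to", "of", "for", "what", "how", "do", "i"}

def glossary_candidates(counts, min_count=2):
    """Return (word, count) pairs for non-trivial words, most frequent first."""
    kept = [
        (word, n)
        for word, n in counts.items()
        if word not in COMMON_WORDS and n >= min_count
    ]
    # Sort by descending count, then alphabetically so ties have a stable order.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))

# Made-up counts, for illustration only.
word_counts = {"the": 40, "topcoder": 12, "is": 35, "spoj": 7, "codechef": 12, "tle": 1}
candidates = glossary_candidates(word_counts)
print(candidates)
```

The threshold is a judgment call: a rare word may still deserve a glossary entry, so a low-count cutoff is best treated as a way to prioritize rather than exclude.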

Goal

The purpose of the glossary is to provide short definitions of words as people use them in competitive programming. Some popular terms from the list, like CodeChef, TopCoder, and SPOJ, are only meaningful in that context. But others, like problem, contest, and ACM, have specific meanings when applied to competitive programming, which might differ from their general meaning.

Implementation

CPFAQ uses the MediaWiki software, so the wiki markup that renders Wikipedia's glossary of commonly used terms can be adapted for the CP glossary. Some useful features of that page:

  • Multiple Table of Contents sections are included throughout the page to facilitate navigation.
  • Titles of glossary entries can be links.
  • There’s no arbitrary limit on the size of glossary entries.
  • The whole glossary is on one page, which allows Ctrl-F searching.

(Image credit: Dave Worley)

Categories: CPFAQ




Copyright © 2025 Duncan Smith