CS 106 Winter 2017

Lab 08: Text Processing


Question 1: Word Play

In this exercise, you will practice processing text by solving more word puzzles like the ones demonstrated in class. Each puzzle involves iterating over an array containing every word in a long word list and collecting the words that pass a particular test. You will be given a sketch with a user interface for displaying the solutions to these puzzles; your job is to write the tests themselves.
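
As a rough illustration of that iterate-and-test pattern, here is a minimal sketch in plain Java. The names FilterSketch, passesTest() and filterWords() are hypothetical and do not appear in the starter code, and the test used (the word ends in "ing") is not one of the puzzles. The solved getWords_0() in the starter code remains the example to follow for the sketch itself.

```java
import java.util.ArrayList;
import java.util.List;

class FilterSketch {
  // Hypothetical criterion, for illustration only: the word ends in "ing".
  static boolean passesTest(String word) {
    return word.endsWith("ing");
  }

  // Walk the whole word list and keep only the words that pass the test.
  static String[] filterWords(String[] words) {
    List<String> kept = new ArrayList<>();
    for (int i = 0; i < words.length; i++) {
      if (passesTest(words[i])) {
        kept.add(words[i]);
      }
    }
    // Convert back to an array, since the puzzle functions return String[].
    return kept.toArray(new String[0]);
  }
}
```

Only passesTest() changes from puzzle to puzzle; the surrounding loop stays the same. In a Processing sketch the result array could instead be grown with the built-in append() function, without the class wrapper or the ArrayList.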

  1. Download the starter code and unzip it. Open the WordPlay sketch. You will see that the sketch is divided into two tabs. The WordPlay tab sets up the sketch, loads in the word list, and builds the user interface. You do not need to change anything in that tab, apart from adding your name and student ID to the top. Everything else you do will happen in the Puzzles tab.

  2. There are six puzzles to solve. They are solved by calling the functions getWords_0() ... getWords_5(). Each function must return an array of Strings containing all the words that solve the puzzle. (You do not need to call these six functions yourself; they are called for you by the code in the WordPlay tab.) You will find that getWords_0() is solved for you as an example. You must add code to the other five functions to solve those puzzles (look for comments marked TODO).

  3. In getWords_1(), write code to find all words that start with "und" and end with "und". Example: "underground".

  4. In getWords_2(), write code to find all words that contain three double letters in a row. Example: "bookkeeper", which contains "oo", "kk" and "ee", with nothing else in between. (Obviously, these words must all have at least six letters!) For this puzzle, you'll have to write an inner for loop that iterates over the letters in each word, extracting individual characters using the String class's charAt() method. Note that you can use plain == on two char values to determine whether they're equal.

  5. In getWords_3(), write code to find all words of six or more letters in which the letters are in strict alphabetical order within the word, with no repeats. Example: "almost", because "a" comes before "l", "l" comes before "m", and so on. As above, you'll need an inner loop, here comparing each letter to the next one in the word. Note that you can use < and > on two char values to determine whether one comes before or after the other in the alphabet.

  6. In getWords_4(), write code to find all words of 14 or more letters in which no single letter occurs more than once anywhere in the word. So, for example, "undiscoverable" doesn't count because it contains two "e"s, but "undiscoverably" works.

    There are a couple of different ways to solve this puzzle. The easiest is to first write a helper function that counts the number of times a given letter occurs in a word. That might start like this:

    int getCount( String word, char letter )
    {
      // Count how many times letter occurs in word
    }

    Now, in getWords_4(), first check that the current word has at least 14 letters. If so, call getCount() once for each letter of the alphabet to check how many times that letter occurs in the word. If getCount() returns two or more for any letter, the word is invalid and should not be appended to the output array.

  7. In getWords_5(), write code to find all words in which the vowels "a", "e", "i", "o", "u" and "y" appear in the word in that order, with no other vowels. Example: "facetiously". The easiest way to solve this puzzle is with a regular expression. The good news is that the regular expression is provided for you. All you need to do is use it correctly. Use the built-in match() function with the current word and the regular expression. If match() returns a non-null value, then the pattern was found and the word is one of the solutions to the puzzle.
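
The inner loops called for in getWords_2() and getWords_3() both rely on charAt() together with ==, < and > on char values. The following is a minimal sketch of that technique in plain Java, deliberately applied to two simpler questions than the puzzles themselves. The names CharLoopSketch, hasDoubleLetter() and latestLetter() are hypothetical, and the code assumes lowercase words.

```java
class CharLoopSketch {
  // True if any two neighbouring letters in word are the same.
  static boolean hasDoubleLetter(String word) {
    // Stop one short of the end so that charAt(i + 1) stays in range.
    for (int i = 0; i < word.length() - 1; i++) {
      if (word.charAt(i) == word.charAt(i + 1)) {
        return true;
      }
    }
    return false;
  }

  // Returns the letter of word that comes latest in the alphabet.
  // Assumes word is non-empty and all lowercase.
  static char latestLetter(String word) {
    char latest = word.charAt(0);
    for (int i = 1; i < word.length(); i++) {
      // For lowercase letters, > means "comes later in the alphabet".
      if (word.charAt(i) > latest) {
        latest = word.charAt(i);
      }
    }
    return latest;
  }
}
```

Note the loop bound in hasDoubleLetter(): any loop that looks ahead to the next letter has to stop early enough that the look-ahead index never runs past the end of the word.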

In case it's not obvious, your code must actually find the words that solve the puzzle. That is, you're not allowed to determine the solution words by some other means and return them explicitly in an array. Looked at another way, if we changed the file words.txt, your code would proceed to find new sets of solutions relative to the words in that new file.

Save your work in a sketch titled WordPlay.

Submission

When you are ready to submit, please follow these steps.

  1. If necessary, review the Code Style Guide and use Processing's built-in Auto Format tool. You do not need to use the precise coding style outlined in the guide, but whatever style you use, your code must be clear, concise, consistent, and commented.

  2. If necessary, review the How To Submit document for a reminder on how to submit to LEARN.

  3. Make sure to include a comment at the top of all source files containing your name and student ID number.

  4. Create a zip file called L08.zip containing the entire L08 folder and all its subfolders.

  5. Upload L08.zip to LEARN. Remember that you can (and should!) submit as many times as you like. That way, if there's a catastrophe, you and the course staff will still have access to a recent version of your code.

  6. If LEARN isn't working, and only if LEARN isn't working, please email your zip file to the course account (see the course home page for the address). In this case, you must send the email before the deadline. Please use this only for emergencies, not "just in case". Submissions received after the deadline may receive feedback, but their marks will not count.