Respond to the prompts below by interacting with https://bootstrapworld.org/Soekia/ using the Intelligent Monkeys? collection.
Exploring the N-Grams Panel
The orange N-grams panel is where Soekia lists possible n-grams and how frequently they occur in the training corpus. The default setting (3) displays a list of every trigram. Clicking on the other numbers at the top will display lists of n-grams of other lengths.
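To make the idea of listing n-grams and their frequencies concrete, here is a minimal Python sketch. It is not Soekia's code: the function name `count_ngrams`, the whitespace tokenization, the choice not to let n-grams cross document boundaries, and the toy documents are all assumptions made for illustration.

```python
from collections import Counter

def count_ngrams(documents, n):
    """Count every run of n consecutive tokens in each document.

    Assumption: tokens are separated by spaces and n-grams never
    span two documents.
    """
    counts = Counter()
    for doc in documents:
        tokens = doc.split()
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

# Hypothetical toy corpus (NOT the Intelligent Monkeys? collection).
toy_documents = [
    "the monkey eats the banana",
    "the monkey eats the mango",
]

for n in (1, 3, 5):
    table = count_ngrams(toy_documents, n)
    # Number of different n-grams, followed by each n-gram and its count.
    print(n, len(table), table.most_common())
```

Changing the value of `n` plays the same role as clicking a different number at the top of the panel.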
1 Hover your mouse over the 3. How many different trigrams are there in this collection? How does that compare to the number of n-grams of other lengths?
2 The most common trigram appears at the top of the list. Click on it. What do you learn?
3 Click on the 5 tab. Notice that all of the 5-word n-grams occur equally often. Can you explain why this might be?
4 Take a minute to explore the N-grams panel. What do you Notice? What do you Wonder?
Predicting the Next Word using N-Grams
For this section, make sure you are in the Intelligent Monkeys? collection. Go to the Suggested words panel, click on "Customize temperature/ number of suggestions", and set the temperature to low.
Without introducing any randomization into the algorithm, Soekia generates text by selecting words one at a time from the most-frequent valid n-gram of the highest order available.
5 For the first "word", Soekia looks in the 1 tab to find the most frequently occurring unigram. What do you expect it to choose?
6 To choose the second "word", Soekia:
- Looks at the 2 tab to find the most frequently occurring bigram that begins with the first "word".
- If there isn’t one, it will return to the 1 tab and select the next most popular unigram.
What do you expect Soekia to choose? Which list did you select it from?
7 Why do you think there weren’t any bigrams that began with the most popular "word"? Hint: Read the documents closely!
8 To choose the third "word", Soekia:
- Looks at the 3 tab to find the most frequently occurring trigram that begins with the first and second "words".
- If there isn’t one, it will look in the 2 tab for the most frequently occurring bigram beginning with the second "word".
- If there isn’t one, it will return to the 1 tab and select the next most popular unigram.
What do you expect Soekia to choose? Which list did you select it from?
9 Continuing this process, what do you expect Soekia to choose for the:
fourth "word"? fifth "word"? sixth "word"?
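The selection process described in this section can be sketched in Python as a rough illustration. This is not Soekia's implementation. The names `count_ngrams` and `generate`, the toy documents, and the cap of 3 on n-gram length are hypothetical. In particular, "the next most popular unigram" is interpreted here as the next unused position in the unigram ranking, which is an assumption about how the fallback works.

```python
from collections import Counter

def count_ngrams(documents, n):
    """Count n-grams within each document (assumed: space-separated tokens)."""
    counts = Counter()
    for doc in documents:
        tokens = doc.split()
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

def generate(documents, length, max_n=3):
    """Pick words one at a time, with no randomization.

    For each new word, try the longest n-gram whose beginning matches the
    most recent words; if none exists, try a shorter one; finally fall back
    to the unigram ranking.
    """
    tables = {n: count_ngrams(documents, n) for n in range(1, max_n + 1)}
    unigram_ranking = [gram[0] for gram, _ in tables[1].most_common()]
    next_unigram = 0  # assumed meaning of "next most popular unigram"
    words = []
    while len(words) < length:
        choice = None
        for n in range(min(max_n, len(words) + 1), 1, -1):
            context = tuple(words[-(n - 1):])
            matches = [(gram, count) for gram, count in tables[n].items()
                       if gram[:-1] == context]
            if matches:
                best_gram, _ = max(matches, key=lambda pair: pair[1])
                choice = best_gram[-1]
                break
        if choice is None:
            if next_unigram >= len(unigram_ranking):
                break  # nothing left to fall back on
            choice = unigram_ranking[next_unigram]
            next_unigram += 1
        words.append(choice)
    return words

# Hypothetical toy corpus (NOT the Intelligent Monkeys? collection).
toy_documents = [
    "monkeys type words .",
    "clever monkeys type words .",
    "monkeys type letters .",
    "monkeys eat .",
    "apes eat .",
]

print(generate(toy_documents, 6))
```

Tracing the toy run by hand, and noting which table each word came from, mirrors the prediction steps above.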
Testing our Prediction
For this section, make sure you are still in the Intelligent Monkeys? collection with the temperature set to low.
10 How does Soekia answer the question How intelligent are monkeys?… when you click
11 How does that text compare to your prediction?
These materials were developed partly through support of the National Science Foundation (awards 1042210, 1535276, 1648684, 1738598, 2031479, and 1501927).
Bootstrap by the Bootstrap Community is licensed under a Creative Commons 4.0 Unported License. This license does not grant permission to run training or professional development. Offering training or professional development with materials substantially derived from Bootstrap must be approved in writing by a Bootstrap Director. Permissions beyond the scope of this license, such as to run training, may be available by contacting contact@BootstrapWorld.org.