Overview
The problem: Research in the social sciences often includes qualitative studies of things people have said or written. One way to visualize such text is a word cloud, where each word is sized according to its frequency, like this:
This word cloud was built from every article The Verge published in November 2025 and categorized as being about AI.
But a single word cloud doesn’t let multiple researchers work on multiple texts and compare those texts easily. Your job is to build a way for researchers to share word clouds and meaningfully compare them.
There’s a catch, of course: this is UH Mānoa, so some of the articles are in English and some are in ʻŌlelo Hawaiʻi.
The solution: Create a website that allows researchers to upload word cloud data and share it with other researchers.
Word cloud data should include:
- A list of words and frequencies
- A bibliography identifying where the source text came from
- A sample image of the word cloud
- The language used by the source text
- The researcher at UH Mānoa who loaded the data
- The original source text (if available)
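One way to picture a single upload is as a record with one field per item in the list above. The sketch below is only an illustration: the class name `WordCloudData`, the field names, the `top_words` helper, and the use of BCP 47 style language tags (`en`, `haw`) are all assumptions, not requirements.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WordCloudData:
    """One uploaded word cloud data set (hypothetical schema)."""

    frequencies: dict[str, int]        # word -> number of occurrences
    bibliography: str                  # where the source text came from
    sample_image: str                  # path or URL of a sample word cloud image
    language: str                      # assumed tag, e.g. "en" or "haw"
    researcher: str                    # UH Mānoa researcher who loaded the data
    source_text: Optional[str] = None  # original text, if available

    def top_words(self, n: int = 10) -> list[tuple[str, int]]:
        """Return the n most frequent words; ties are broken alphabetically."""
        ranked = sorted(self.frequencies.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:n]
```

Making `source_text` optional mirrors the "(if available)" in the list; every other field is required in this sketch.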
When sharing word cloud data, researchers should be able to meaningfully compare one word cloud data set with another.
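What counts as a meaningful comparison is up to you. As one possible starting point, the sketch below scores two word/frequency lists with cosine similarity and lists the words they share. The function names are hypothetical and the choice of measure is an assumption, not a requirement. Note that it will score an English text against an ʻŌlelo Hawaiʻi text near zero simply because the two share few words.

```python
import math


def cosine_similarity(a: dict[str, int], b: dict[str, int]) -> float:
    """Score two word-frequency tables from 0.0 (no shared words) to 1.0.

    Cosine similarity depends only on the proportions of the words, so a
    long text and a short text with the same mix of words score 1.0.
    """
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(n * n for n in a.values()))
    norm_b = math.sqrt(sum(n * n for n in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty table is not similar to anything
    return dot / (norm_a * norm_b)


def shared_words(a: dict[str, int], b: dict[str, int]) -> list[str]:
    """Return the words that appear in both tables, in alphabetical order."""
    return sorted(a.keys() & b.keys())
```

A single score hides a lot, so a comparison page would probably pair a number like this with the shared and unshared words themselves.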
Mockup page ideas
Some possible mockup pages include:
- Home-Page
- Login/Register
- Profile
- Word cloud data CRUD pages (Create, Read, Update, Delete)
- Search options to see other researchers’ word cloud data sets.
- Comparison of two word cloud data sets.
Use case ideas
Whether or not the following bullet points cover every page, the completed use case should show an end-to-end scenario of using the system.
- A researcher uploads a new word cloud data set and makes it available to other researchers.
- A researcher searches for word clouds posted by other researchers that may be relevant to their own work.
- A researcher leaves comments for another researcher on their research.
- An administrator reviews word cloud data for appropriateness, including but not limited to copyright issues, illegal content, or offensiveness.
Beyond the basics
After implementing the basic functionality, here are ideas for more advanced features:
- Create word clouds from scratch, given a complete text. Note that this must include the ability to exclude certain words and the ability to count similar words as the same (such as plurals or gerunds).
- Show multiple word clouds on a page at once, with matching words aligned.
- Incorporate time data for word cloud data, i.e., when the text for the word cloud was collected. Include ways to show time information.
- Incorporate place data for word cloud data, i.e., where the text was collected. Include ways to show this place information, as with a map.
- Integrate Word Cloud Party with existing qualitative research tools, like NVivo.
- Integrate Word Cloud Party with existing public domain data, like Project Gutenberg.
- Integrate Word Cloud Party with an existing large language model (LLM) tool, like Gemini. Determine what uses would be appropriate, and write an essay on that in addition to the integration itself. Consider the use of data annotators reported here in this investigation. Does this affect the qualitative research being done using Word Cloud Party?
Be careful about loading data from other websites: down that path there are large spiders, who might find you crunchy and tasty with ketchup or chili pepper water.
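For the first idea above (creating a word cloud from a complete text), a minimal sketch of the counting step might look like the following. The function name `count_words` and its parameters are hypothetical. Similar words are merged with a caller-supplied lookup table, which is an assumption made to keep the example short; a real stemmer or lemmatizer could replace it. The tokenizer keeps kahakō vowels and the ʻokina (U+02BB) inside words, but it would split a word where a plain apostrophe stands in for the ʻokina.

```python
import re
from collections import Counter

# Runs of Unicode letters only: \w without digits or underscores.
WORD_RE = re.compile(r"[^\W\d_]+")


def count_words(text, stop_words=frozenset(), variants=None):
    """Count word frequencies in text.

    stop_words: words to leave out of the result entirely.
    variants:   maps a word form to the form it should be counted as,
                e.g. {"models": "model", "training": "train"}.
    """
    variants = variants or {}
    counts = Counter()
    for token in WORD_RE.findall(text.lower()):
        word = variants.get(token, token)  # merge plurals, gerunds, etc.
        if word not in stop_words:
            counts[word] += 1
    return counts
```

For example, `count_words("Models and a model", stop_words={"and", "a"}, variants={"models": "model"})` counts `model` twice and nothing else. Stop words and variant tables would need to be maintained separately for English and ʻŌlelo Hawaiʻi.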
Brook Conner dbconner