I remember! I also believe I said (and if I didn't, I'm saying it now) that it is almost impossible for anyone, even a developer, who is not familiar with a code base to guess the difficulty of implementing something. Sometimes something is indeed simple, sometimes it is incredibly difficult - in this case, the way the sentences are stored in the database makes this rather difficult. Taking a look at jisho.org, they have pretty "flat" data for their sentences.
1. Japanese sentence
2. Furigana
That appears to be it. I do not "think" they even have a "this word is listed in these sentences" - it looks like it is string matching, which is why you can search for 高層ビ and get results, even though that is not a term.
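To make the "flat" idea concrete, here is a minimal sketch of what string matching over that kind of data could look like. This is only a guess at the general shape, not jisho.org's actual code; `FlatSentence`, `search`, and the sample sentences are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FlatSentence:
    # Hypothetical "flat" record: just the two fields listed above.
    japanese: str
    furigana: str


def search(sentences, query):
    """Return every sentence whose text contains the query as a substring."""
    return [s for s in sentences if query in s.japanese]


# Made-up sample data for illustration.
SENTENCES = [
    FlatSentence("高層ビルが多い。", "こうそうビルがおおい。"),
    FlatSentence("猫が好きです。", "ねこがすきです。"),
]

# "高層ビ" is not a word, but it is a substring of the first sentence,
# so plain string matching still finds it.
matches = search(SENTENCES, "高層ビ")
print([s.japanese for s in matches])
```

Nothing here knows what a "word" is, which is exactly why a partial string can still return results.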
There is nothing wrong with this approach, of course. It's simple, fast, and easy to update. However, there is a TON of data that renshuu builds and maintains for any given sentence, including
1. A list of every word in that sentence, and its conjugation if available (so you can get those nice boxes when you mouse over each word, add them to schedules, block from studying, etc.)
2. Kanji markers embedded in every word to make it easier for renshuu to grab a whole bunch of sentences and very quickly compare them to your kanji knowledge and settings and provide an appropriate display (all kanji, mixed kanji/hiragana, all hiragana, furigana)
3. Difficulty ratings on each sentence based on the frequency of the words it contains, so that sentences that appear easier to understand can be sorted to the top
4. Conjugation markers on every word
5. Various cache tables to make it easier to get the data to you quickly.
This list of five requires a lot of different database fields/tables.
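For comparison with the flat layout, here is a rough sketch of how per-word data like this could be modeled and used to pick a display based on the kanji a user knows. It is an illustration only, not renshuu's actual schema: `WordEntry`, `RichSentence`, the field names, and the "hardest word decides the difficulty" rule are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class WordEntry:
    # All fields are hypothetical stand-ins for the data described above.
    surface: str                 # the word as written in the sentence
    reading: str                 # its kana reading
    kanji: Tuple[str, ...]       # kanji contained in the word ("kanji markers")
    conjugation: Optional[str]   # conjugation label, if any
    frequency_rank: int          # 1 = most common word


@dataclass
class RichSentence:
    words: List[WordEntry]

    def difficulty(self) -> int:
        # Assumed rule: a sentence is as hard as its rarest word.
        return max(w.frequency_rank for w in self.words)

    def render(self, known_kanji: Set[str]) -> str:
        # Show a word in kanji only if the user knows every kanji in it;
        # otherwise fall back to its kana reading.
        return "".join(
            w.surface if set(w.kanji) <= known_kanji else w.reading
            for w in self.words
        )


# Made-up sample sentence: 猫が好きです
sentence = RichSentence([
    WordEntry("猫", "ねこ", ("猫",), None, 1500),
    WordEntry("が", "が", (), None, 1),
    WordEntry("好き", "すき", ("好",), None, 300),
    WordEntry("です", "です", (), None, 2),
])

print(sentence.render({"好"}))        # mixed kanji/hiragana
print(sentence.render({"猫", "好"}))  # all kanji
print(sentence.render(set()))         # all hiragana
```

Even this toy version needs a word list, readings, kanji markers, and frequency data for every sentence, and all of it has to stay in sync whenever a sentence changes.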
I hope this better illustrates some of the craziness that lives behind each sentence on the site! It gives the site a lot more power and flexibility for each user to customize it as they like, but makes it much more difficult to add things that may appear to be simple.