Context collapse is a defining characteristic of life online. Sometimes the results are funny, sometimes they’re bad.
I decided to embrace this context-free world by fine-tuning a language model (GPT-2 medium), devoid of contextual clues, to detect jokes (or rather, certain kinds of jokes told by certain kinds of people). I came up with a list (list A) of highly ironic Twitter users and parody accounts and a list (list B) of highly earnest, straight-shooting accounts. Then I fine-tuned the model using the ham-fisted heuristic that every tweet from list A is a joke and every tweet from list B is not. Too stupid to work, right?
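The heuristic is simple enough to sketch in a few lines. Here is a minimal sketch in Python; the account names and tweets are made up for illustration and are not the real lists:

```python
# Sketch of the labeling heuristic: every tweet from an account on
# list A is labeled a joke (1), every tweet from list B is not (0).
# Account names below are hypothetical stand-ins for the real lists.
LIST_A = {"ironic_account", "parody_account"}    # highly ironic / parody
LIST_B = {"earnest_account", "news_account"}     # highly earnest

def label_tweets(tweets):
    """tweets: iterable of (account, text) pairs -> list of (text, label)."""
    dataset = []
    for account, text in tweets:
        if account in LIST_A:
            dataset.append((text, 1))  # assumed joke
        elif account in LIST_B:
            dataset.append((text, 0))  # assumed not a joke
    return dataset

examples = [
    ("parody_account", "breaking: local man discovers context"),
    ("news_account", "Markets closed higher on Tuesday."),
]
print(label_tweets(examples))
```

The obvious weakness is right there in the code: the label comes from the account, not the tweet, so every earnest tweet from an ironic account is mislabeled.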
So far, yes. The results are probably about what you would expect. I did successfully build a classifier that distinguishes between the two groups (the model's accuracy is around 94%), but humor seems to be only one of several signals behind the split. The model can identify news-level seriousness, but because of all the ironic tweets, and because list A tended to make small talk more than list B, it labels certain banal statements like “Hello, what is your name?” as jokes. I also haven’t done extensive enough transfer testing, but since humor doesn’t scale, I wouldn’t expect the model to transfer well to any kind of humor it didn’t see in list A.
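The fine-tuning setup is roughly GPT-2 with a classification head trained on those (text, label) pairs. The sketch below uses Hugging Face's `GPT2ForSequenceClassification`; this is not my actual training code, and a tiny randomly initialized config stands in for `gpt2-medium` so the example runs without a large download:

```python
import torch
from transformers import GPT2Config, GPT2ForSequenceClassification

# Tiny randomly initialized GPT-2 standing in for "gpt2-medium".
# In practice you would load pretrained weights with
# GPT2ForSequenceClassification.from_pretrained("gpt2-medium", num_labels=2).
config = GPT2Config(
    vocab_size=256, n_positions=64, n_embd=32, n_layer=2, n_head=2,
    num_labels=2,    # joke vs. not-a-joke
    pad_token_id=0,  # GPT-2 has no pad token by default; the head needs one
)
model = GPT2ForSequenceClassification(config)

# One fine-tuning step on a fake batch: 4 "tweets" of 16 token ids each.
input_ids = torch.randint(1, 256, (4, 16))
labels = torch.tensor([1, 1, 0, 0])  # list A -> 1, list B -> 0

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()
optimizer.step()

# After training, the logits give one joke/not-joke score pair per tweet.
logits = model(input_ids=input_ids).logits
print(logits.shape)  # shape (4, 2)
```

The classification head scores the last non-padding token of each sequence, which is why `pad_token_id` has to be set explicitly for GPT-2.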
I’m still fine-tuning the model and tweaking the heuristic to be less ham-fisted, and I expect improved results soon enough. But I can’t do it alone, and I’m hoping people will test it out*.
*UPDATE (4/20/2021): The downside to deploying a large language model as cheaply as possible is that the site will crash with just modest usage or go down because it’s a preemptible instance. If the link to the website doesn’t work and you’d really like to see it, try again later. I’ll keep putting the website back up.
UPDATE (4/5/2021): If you want to know more about how this website was made and how you can make one for yourself, see this post.