Image: “The Bridge in Emoji”. Made by Danielle Navarro using the emoji-mosaic tool (see here). Original image is The Bridge in Curve painted by Grace Cossington Smith in 1930.

Day 2: This is serious, mum - the emo package

Yesterday’s effort with blogdown has worn me down. I cannot handle another one of those. Plus, my five year old daughter wants to play a game with me. Time to kill two birds with one stone, by exploring…

THE EMO PACKAGE!!!

Okay! I am excited. This looks like 🎉. It looks like 🎂 and like 🎈. I feel like 💃. I take a quick look at some of the documentation, install the package from the GitHub repository

devtools::install_github(repo = 'hadley/emo')

… and run this idea past my daughter. She informs me that my idea is 💩. I explained to her that it would be more polite to express her feelings using the code emo::ji("poop"), but she wasn’t having any of it.

Well, at least Hadley Wickham agrees with me 😁

The emo::jis tibble

Right. So it’s a couple of hours later. The kids have been wrangled, I popped out to get my nails done and my typing skills are restored. Better yet, my partner has disappeared with the kids for a bit, and I can take a look at the package! On a quick inspection, it seems like the main data structure the package works off must be the emo::jis tibble. Here’s some of the entries:

car::some(emo::jis)
## # A tibble: 10 x 21
##    runes qualified emoji name  group subgroup version points nrunes
##    <chr> <chr>     <chr> <chr> <chr> <chr>    <chr>   <list>  <int>
##  1 1F46… fully-qu… 👨🏾‍🏫    man … Smil… person-… 8.0     <int …      4
##  2 1F46… fully-qu… 👩🏿‍🚒    woma… Smil… person-… 8.0     <int …      4
##  3 1F57… fully-qu… 🕵🏾‍♀️    woma… Smil… person-… 8.0     <int …      5
##  4 1F9D… fully-qu… 🧛🏽    vamp… Smil… person-… 10.0    <int …      2
##  5 1F64… non-full… 🙍‍♂    man … Smil… person-… <NA>    <int …      3
##  6 1F6B… fully-qu… 🚶‍♀️    woma… Smil… person-… 6.0     <int …      4
##  7 1F6A… non-full… 🚣🏼‍♀    woma… Smil… person-… <NA>    <int …      4
##  8 1F416 fully-qu… 🐖    pig   Anim… animal-… 6.0     <int …      1
##  9 1F1E… fully-qu… 🇬🇧    Unit… Flags country… 6.0     <int …      2
## 10 1F1F… fully-qu… 🇻🇺    Vanu… Flags country… 6.0     <int …      2
## # ... with 12 more variables: vendor_apple <lgl>, vendor_google <lgl>,
## #   vendor_twitter <lgl>, vendor_one <lgl>, vendor_facebook <lgl>,
## #   vendor_messenger <lgl>, vendor_samsung <lgl>, vendor_windows <lgl>,
## #   keywords <list>, aliases <list>, unicode_version <dbl>,
## #   ios_version <dbl>

Yep, that looks like it has a lot of stuff. Very nice! But I’m too lazy today to work out all the particulars, so instead I’m just going to move on and give myself licence to do something very serious…
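For anyone less lazy than me, here is a minimal sketch of how one might start poking at the tibble. It assumes the emo package is installed and that emo::jis still has the group, name and emoji columns shown in the printout above; the has_emo, jis and flags names are just mine.

```r
# Only run the exploration if the emo package is available
has_emo <- requireNamespace("emo", quietly = TRUE)

if (has_emo) {
  jis <- emo::jis

  # How many emoji are there in each group?
  print(sort(table(jis$group), decreasing = TRUE))

  # Peek at the first few flags (which() drops any rows with a missing group)
  flags <- jis[which(jis$group == "Flags"), c("name", "emoji")]
  print(head(flags))
} else {
  message("The emo package is not installed, so there is nothing to look at.")
}
```

I haven't shown any output here because the contents of the tibble may well differ between versions of the package.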

Towards EMO-TISM

… I’m going to revisit my youth. TISM (This Is Serious Mum) were a Melbourne-based outfit who I was not anywhere near cool enough to have heard about until they became popular with their third album, Machiavelli and the Four Seasons. My goal will be to take the lyrics to one of my favourite songs of theirs – Greg! The Stop Sign – and replace the text with matching emoji, thus creating EMO-TISM. Clearly the world has been awaiting this. Here are the lyrics:

lyrics <- readLines("https://djnavarro.net/files/gregthestopsign.txt")
cat(lyrics,sep = "\n")
## [Intro]
## Ba-ba-ba, ba-ba, ba-ba-ba-ba
## Ba-ba-ba, ba-ba, ba-ba-ba-ba
## Ba-ba-ba, ba-ba, ba-ba-ba-ba
## Ba-ba-ba, ba-ba, ba-ba-ba-ba
## 
## [Verse 1: Ron]
## The guy who slagged the football team, those yobs were not for him
## He turns into a real estate agent who believes in discipline
## The guy who's first to use cocaine, the wild boy breaking free
## He'll end up in a court of law as a prosecuting QC
## Remember the school captain? Success was a matter of time
## I can hear him now as she screams, "Greg, you missed the stop sign!"
## 
## [Verse 2: Ron]
## Forget Snoop Doggy Dogg, forget old Ice T
## The true word out on the streets is produced by the TAC
## What's the use of striving? As life's road in front unravels
## We get to do the driving, don't choose the direction we travel
## Do your homework or wag for weeks, graffiti the Dandenong line
## It don't matter much when you hear that scream, "Greg, you missed the stop sign!"
## 
## [Chorus]
## Greg! The stop sign!!
## Greg! The stop sign!!
## Greg! The stop sign!!
## Greg! The stop sign!!
## 
## [Verse 3: Ron]
## Sometime in the next 10,000 years a comet's gonna wipe out all trace of man
## I'm banking on it coming before my end of year exams
## The rich kid becomes a junkie, the poor kid an advertiser
## What a tragic waste of potential – being a junkie's not so good either
## Your folks worked hard for what you've got, you are the fruit of their vine
## But who cares what you sow when what you reap is, "Greg, you missed the stop sign!"
## 
## [Verse 4: Humphrey]
## Bought a car just the other day, man, could that baby run
## But you know what they always say, there's always a better one
## Got a tumour in my brain, it's creeping to my lungs
## And I've searched around in vain, can't find me a better one
## 
## [Verse 5: Ron]
## Hardwired into everyone's head is the person they're gonna be
## Growing up's not a matter of choice, it's a matter of wait and see
## So kids, yeah, you can do it, you can be your best
## Girls can do anything, you can pass the test
## I'm okay, you're okay, we're okay, we're fine
## I thought I heard a semi-trailer, "Greg, you missed the stop sign!"
## 
## [Chorus]
## Greg! The stop sign!!
## Greg! The stop sign!!
## Greg! The stop sign!!
## Greg! The stop sign!!
## Greg! The stop sign!!
## Greg! The stop sign!!
## Greg! The stop sign!!
## Greg! The stop siiiiiiggggggn!!

Let’s see if we can replace as much of this with emoji!

Manually doing a find and replace

One way to do this would be to manually decide which piece of text to replace. For instance, I might decide that "ba" should be replaced with emo::ji("music"). That’s easy enough to do without worrying about any serious pattern matching.

rawText <- "Ba-ba-ba, ba-ba, ba-ba-ba-ba"
# fixed = TRUE treats the pattern as a literal string, not a regular expression
emojifiedText <- gsub(
  pattern = "ba",
  replacement = emo::ji("music"),
  x = rawText,
  fixed = TRUE
)

If I print this I can see the underlying unicode representation:

print(emojifiedText)
## [1] "Ba-\U0001f399-\U0001f399, \U0001f399-\U0001f399, \U0001f399-\U0001f399-\U0001f399-\U0001f399"

But what’s the point of that? This is emoji. One does not simply print emoji: you must cat it. Hey, last post I did say I’d do kittens and emoji, right? Well, here you go:

cat(emojifiedText)
## Ba-🎙-🎙, 🎙-🎙, 🎙-🎙-🎙-🎙

So this has worked fine, but it only matches the "ba" and not the "Ba". That seems undesirable. I’d rather have it match against a regular expression like [Bb]a. That sounds sensible, but I get a sinking feeling in the pit of my stomach every time I have to use regular expressions. It’s been a little while, so maybe hold on for a moment while I google the regex documentation…

… well that didn’t go well. I did briefly manage to document our conversation:

  • me: um yeah, hi, so i wanted to search for a text pattern?
  • REGEX: RUDIMENTARY CREATURES OF BLOOD AND FLESH. YOU TOUCH MY MIND, FUMBLING IN IGNORANCE, INCAPABLE OF UNDERSTANDING. THERE IS A REALM OF EXISTENCE SO FAR BEYOND YOUR OWN YOU CANNOT EVEN IMAGINE IT. … BEFORE US, YOU ARE NOTHING
  • me: right, right, noted, but about the pattern I was interested in?
  • REGEX: THE PATTERN HAS REPEATED ITSELF MORE TIMES THAN YOU CAN FATHOM
  • me: ok. i can see you’re busy. nvm

I think I’ll just muddle onwards…

rawText <- "Ba-ba-ba, ba-ba, ba-ba-ba-ba"
# without fixed = TRUE the pattern is treated as a regular expression
emojifiedText <- gsub(
  pattern = "[Bb]a",
  replacement = emo::ji("music"),
  x = rawText
)
print(emojifiedText)
## [1] "\U0001f3bb-\U0001f3bb-\U0001f3bb, \U0001f3bb-\U0001f3bb, \U0001f3bb-\U0001f3bb-\U0001f3bb-\U0001f3bb"
cat(emojifiedText)
## 🎻-🎻-🎻, 🎻-🎻, 🎻-🎻-🎻-🎻
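As an aside, a character class isn't the only way to catch both "ba" and "Ba". Here is a minimal sketch that uses the ignore.case argument of gsub() instead. I've hard-coded a single musical note (my choice, not whatever emo::ji("music") happens to return) so that the result is the same on every run, and the raw_text, note and emojified names are just for illustration.

```r
raw_text <- "Ba-ba-ba, ba-ba, ba-ba-ba-ba"

# A fixed musical note, so the output does not change from run to run
note <- "\U0001f3b5"

# ignore.case = TRUE makes "ba" match "Ba" as well
emojified <- gsub(
  pattern = "ba",
  replacement = note,
  x = raw_text,
  ignore.case = TRUE
)

cat(emojified, "\n")
```

Either version does the job here. The character class is more explicit about which letter is allowed to vary, whereas ignore.case would also match "BA" and "bA".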

Yay! 🍷

Why are the emoji different?

Okay, one puzzle. Why are the music emojis different from each other? Every time I run this script I get a slightly different emoji. To illustrate this behaviour, I first tried this:

for(i in 1:20){
  cat(emo::ji("music"))
}
## 🎶🎷🎙🎵🎵🎻🎚🥁🎙🎸🎧🎧🎚🎶🎷🎶🎙🎵🎹🎚

There’s no particular pattern so I don’t think it’s doing anything systematic. My guess is that the reason this happens is that a string can match against multiple emojis – I presume that under the hood the function is doing some matching against the content in the emo::jis tibble (presumably using the keywords variable - which I guess I’d have known if I’d read the documentation before starting!). The particulars probably don’t matter for now. The main thing I guess is that a string like "music" could match several entries, and I’m assuming the emo::ji() function is selecting one at random. The emo::ji_find() function seems to bear out this speculation:

emo::ji_find("music")
## # A tibble: 14 x 2
##    name              emoji
##    <chr>             <chr>
##  1 musical_score     🎼   
##  2 musical_note      🎵   
##  3 notes             🎶   
##  4 notes             🎶   
##  5 studio_microphone 🎙   
##  6 level_slider      🎚   
##  7 control_knobs     🎛   
##  8 headphones        🎧   
##  9 saxophone         🎷   
## 10 guitar            🎸   
## 11 musical_keyboard  🎹   
## 12 trumpet           🎺   
## 13 violin            🎻   
## 14 drum              🥁

Neat!
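To make my guess concrete, here is a toy version of "several candidates, pick one at random" in base R. To be clear, this is not the emo package's code: music_candidates is a hand-made stand-in for a few of the rows above, and pick_one() is a hypothetical helper. It also shows that if the choice goes through R's random number generator, calling set.seed() first makes it repeatable.

```r
# Hand-made stand-in for a keyword lookup; NOT the emo package's data
music_candidates <- c(
  musical_note = "\U0001f3b5",
  notes        = "\U0001f3b6",
  violin       = "\U0001f3bb",
  drum         = "\U0001f941"
)

# Pick one candidate uniformly at random
pick_one <- function(candidates) {
  # sampling an index avoids the sample(x) surprise when length(x) == 1
  candidates[[sample.int(length(candidates), 1)]]
}

# Same seed, same sequence of picks
set.seed(1)
first_run <- replicate(5, pick_one(music_candidates))
set.seed(1)
second_run <- replicate(5, pick_one(music_candidates))

identical(first_run, second_run)
```

If I wanted the same emoji every time from the real package, one option suggested by the output above would be to take a specific row of the emo::ji_find("music") tibble rather than calling emo::ji("music").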

Substitution at the word level

Right, so let’s have a go at treating each word in “Greg! The stop sign!” as a keyword to pass to emo::ji(). The first thing I noticed is that emo::ji() throws an error when you feed it a string that doesn’t match any emoji. I wonder if that’s because the package is in early development, or if that’s intentional? (Or more likely I’m doing something wrong 😁!) Anyway, because most of the words in the lyrics don’t match an emoji I’ll write a quick function that substitutes an emoji if the string matches, but passes it through untouched if there is no match:

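emojify <- function(w) {
  # try() captures the error that emo::ji() throws when nothing matches
  out <- try(emo::ji(w), silent = TRUE)
  # inherits() is the safe way to test for a "try-error" result
  if (inherits(out, "try-error")) {
    out <- w
  }
  return(out)
}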
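The same "fall back to the original word" idea can also be written with tryCatch(). Here is a minimal sketch of that pattern. So that it runs without the package, it uses toy_ji(), a hypothetical stand-in I made up that throws an error on unknown keywords the way emo::ji() did for me; emojify_safe() and its lookup argument are likewise my own names.

```r
# Hypothetical stand-in for emo::ji(): errors when the keyword is unknown
toy_ji <- function(keyword) {
  emoji_table <- c(car = "\U0001f697", school = "\U0001f3eb")
  if (!keyword %in% names(emoji_table)) {
    stop("no emoji for keyword: ", keyword)
  }
  emoji_table[[keyword]]
}

# Return the emoji if the lookup succeeds, otherwise the word itself
emojify_safe <- function(w, lookup = toy_ji) {
  tryCatch(lookup(w), error = function(e) w)
}

emojify_safe("car")    # found, so an emoji comes back
emojify_safe("agent")  # not found, so the word passes through untouched
```

Passing lookup = emo::ji should give the same behaviour as my function above, assuming emo::ji() signals a regular R error when nothing matches.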

Now a generalisation of this, to emojify individual words in a string. For simplicity I’m going to ignore punctuation and treat " " as the only separating character. The function will also only consider a single string, not a vector or list of strings:

emojify_words <- function(str) {
  w_vec <- strsplit(str," ")[[1]]
  w_vec <- sapply(w_vec, emojify)
  w_vec <- paste(w_vec, collapse = " ")
  return(w_vec)
}

So now we apply this function to every line in the song:

emolyrics <- sapply(lyrics, emojify_words)
cat(emolyrics, sep = "\n")
## [Intro]
## Ba-ba-ba, ba-ba, ba-ba-ba-ba
## Ba-ba-ba, ba-ba, ba-ba-ba-ba
## Ba-ba-ba, ba-ba, ba-ba-ba-ba
## Ba-ba-ba, ba-ba, ba-ba-ba-ba
## 
## [Verse 1: Ron]
## The guy who slagged the 🏈 team, those yobs were 🔕 for him
## He turns into 🅰️ real estate agent who believes in discipline
## The guy who's 🥇 to use cocaine, the wild 👦 breaking 🆓
## He'll 🔚 🆙 in 🅰️ court of 👮 as 🅰️ prosecuting QC
## Remember the 🏫 captain? Success was 🅰️ matter of ⏳
## I 🥫 👂 him now as she screams, "Greg, you missed the 🙅‍♂ sign!"
## 
## [Verse 2: Ron]
## Forget Snoop Doggy Dogg, forget 👵 Ice T
## The true word out 🔛 the streets is produced by the TAC
## What's the use of striving? As life's 🛣 in front unravels
## We get to do the driving, don't choose the ⬅️ we 🗺
## Do your homework or wag for weeks, graffiti the Dandenong line
## It don't matter much when you 🙉 that scream, "Greg, you missed the 🚏 sign!"
## 
## [Chorus]
## Greg! The 🚏 sign!!
## Greg! The ✋ sign!!
## Greg! The 🛑 sign!!
## Greg! The ✋ sign!!
## 
## [Verse 3: Ron]
## Sometime in the next 10,000 years 🅰️ comet's gonna wipe out all trace of 👨
## I'm banking 🔛 🇮🇹 coming before my 🔚 of year exams
## The 🤑 kid becomes 🅰️ junkie, the poor kid an advertiser
## What 🅰️ tragic waste of potential – being 🅰️ junkie's 🚷 so 🦸 either
## Your folks worked hard for what you've got, you are the 🍊 of their vine
## But who cares what you 🐖 when what you reap is, "Greg, you missed the ⏹ sign!"
## 
## [Verse 4: Humphrey]
## Bought 🅰️ 🚗 just the other day, man, could that 👶 run
## But you know what they always say, there's always 🅰️ better 1️⃣
## Got 🅰️ tumour in my brain, it's creeping to my lungs
## And I've searched around in vain, can't find me 🅰️ better 1️⃣
## 
## [Verse 5: Ron]
## Hardwired into everyone's 🗣 is the person they're gonna be
## Growing up's 🙅 🅰️ matter of choice, it's 🅰️ matter of wait and 👀
## So kids, yeah, you 🥫 do it, you 🥫 be your best
## Girls 🥫 do anything, you 🥫 pass the test
## I'm okay, you're okay, we're okay, we're fine
## I 💭 I heard 🅰️ semi-trailer, "Greg, you missed the 🙅 sign!"
## 
## [Chorus]
## Greg! The 🚏 sign!!
## Greg! The ⏹ sign!!
## Greg! The 🙅 sign!!
## Greg! The 🙅‍♂ sign!!
## Greg! The 🚏 sign!!
## Greg! The ✋ sign!!
## Greg! The 🚏 sign!!
## Greg! The ⏹ siiiiiiggggggn!!
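One limitation of splitting on " " is that any word with punctuation glued to it, like "sign!!" or "captain?", gets looked up with the punctuation included. Here is a rough sketch of one way around that, using regmatches() to swap out only the runs of letters and leave everything else in place. It uses toy_lookup(), a made-up stand-in that knows just two words, rather than the real emo::ji(), and emojify_tokens() is a hypothetical name.

```r
# Made-up stand-in for the emoji lookup: returns the word itself if no match
toy_lookup <- function(w) {
  emoji_table <- c(car = "\U0001f697", stop = "\U0001f6d1")
  key <- tolower(w)  # so "Stop" and "stop" are treated alike
  if (key %in% names(emoji_table)) emoji_table[[key]] else w
}

emojify_tokens <- function(str, lookup = toy_lookup) {
  # find runs of letters (and apostrophes, so "who's" stays in one piece)
  m <- gregexpr("[[:alpha:]']+", str)
  words <- regmatches(str, m)[[1]]
  if (length(words) == 0) {
    return(str)  # nothing to replace, e.g. an empty line
  }
  # replace each word in place; punctuation and spacing are left alone
  regmatches(str, m) <- list(vapply(words, lookup, character(1), USE.NAMES = FALSE))
  str
}

cat(emojify_tokens("Bought a car, man. Stop!"), "\n")
```

With the toy table, "car," and "Stop!" are both replaced while the comma and exclamation mark survive. Swapping in the emojify() function from earlier as the lookup argument should apply the same idea to the real lyrics.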
Danielle Navarro
Associate Professor of Cognitive Science
