3 Organizing an R package

3.1 Converting mp3s to wavs

Now that the mp3 files were downloaded onto my computer, I had to convert them to .wav files so that they would work with the audio R package. I used ffmpeg to do this. It’s easiest to download ffmpeg from its website and run a command from the terminal, but we can wrap that command in a system() call like so if we again insist on doing everything from within R. The text inside the system() call loops over all the .mp3 files in the mp3s/ folder and converts each one to a mono, 16-bit, 44.1 kHz .wav, keeping the rest of the file name the same. I then moved them to their own folder and deleted the mp3s - we won’t need those anymore.

Again, command line syntax is different in Windows. Them’s the breaks. ¯\_(ツ)_/¯

# convert mp3s to wav files 
system('for file in mp3s/*.mp3; do
   ffmpeg -i "$file" -acodec pcm_s16le -ac 1 -ar 44100 "${file%.mp3}".wav
done') 

# make a new folder for the wav files 
dir.create('wav') 

# move wavs to the wav folder 
system("mv mp3s/*.wav wav") 

# delete the mp3s 
unlink("mp3s", recursive = TRUE, force = TRUE)
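Since the shell loop above won't run as written on Windows, here is a rough sketch of the same conversion driven from R with list.files() and system2(). It assumes ffmpeg is installed and on the PATH. ffmpeg_args() and convert_all() are names made up for this sketch, not functions from any package, and it writes the .wav files straight into wav/ instead of moving them afterwards. Treat it as a starting point, not a tested recipe.

```r
# Build the ffmpeg arguments for one file, using the same flags as the shell loop.
# (ffmpeg_args is a hypothetical helper for this sketch.)
ffmpeg_args = function(mp3_path, wav_dir = "wav") {
  wav_path = file.path(wav_dir, sub("\\.mp3$", ".wav", basename(mp3_path)))
  c("-i", shQuote(mp3_path),
    "-acodec", "pcm_s16le", "-ac", "1", "-ar", "44100",
    shQuote(wav_path))
}

# Convert every .mp3 in mp3_dir, one ffmpeg call per file.
# (convert_all is also hypothetical; it assumes ffmpeg is on the PATH.)
convert_all = function(mp3_dir = "mp3s", wav_dir = "wav") {
  dir.create(wav_dir, showWarnings = FALSE)
  mp3s = list.files(mp3_dir, pattern = "\\.mp3$", full.names = TRUE)
  for (f in mp3s) {
    system2("ffmpeg", args = ffmpeg_args(f, wav_dir))
  }
  invisible(mp3s)
}

# usage, once ffmpeg is installed:
# convert_all("mp3s", "wav")
```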

3.2 Cleaning up and filtering .wav file names

The audio files from The Rap Board don’t have much of a consistent structure for unique IDs. Some are numbered, while some include segments of the lyric. The numbered files don’t always fall in order. This is more than fine for them, but since we’ll often be calling particular sounds from inside the function by name, I don’t particularly want to have to remember whether Gucci yelling “BRRR” is called “gucci_brr” or “gucci_brrr” or, inexplicably, “gucci_14”, as it was when we downloaded it.

I was doing a lot of str_splitting, so I wrote a convenience function that splits each string and keeps only the first component.

library(dplyr)
library(purrr)
library(stringr)

# return the piece of each string that comes before the first match of `pattern`
extract_first = function(string, pattern) {
  x = stringr::str_split(string, pattern)
  purrr::map_chr(x, 1)
}

# create a lookup table matching the artist to the unique .wav file 
wavs = list.files("wav")
# escape the dot and anchor to the end so only the extension is removed
wav_names = str_replace(wavs, "\\.wav$", "")
artist = extract_first(wav_names, "_")  
lookup_table = tibble(wav_names, artist)
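To see what the lookup table ends up holding, here is a small base R sketch of the same split, run on four file names from the board. It uses sub() instead of extract_first() purely to stay dependency-free; for these names the result should be the same.

```r
# four file names from the board, minus the .wav extension
wav_names = c("gucci_14", "bigsean_holdup2", "khaled_blessup2", "jayz5")

# drop everything from the first underscore onward
artist = sub("_.*$", "", wav_names)

lookup_table = data.frame(wav_names, artist, stringsAsFactors = FALSE)
print(lookup_table)
```

Note that jayz5 has no underscore, so the whole name comes back as the artist. That is one of the stragglers that needs a manual fix later.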

We’re in good shape - we now have a data frame with the names of 319 wav files. That is way too many to include in a package. At this point, I went through all of them and chose my favorites. This part is manual, arbitrary, and important.

selected = lookup_table %>% 
  filter(wav_names %in% c("2chainz_tru", "2chainz_whistle", "bigboi_1", "biggie_2", "bigsean_boi2", "bigsean_doit", "bigsean_holdup2", "bigsean_ohgod", "bigsean_stop", "bigsean_whoathere", "birdman_1", "birdman_4", "birdman_respeck", "busta_6", "chance_aghh2", "desiigner_rahhh", "diddy_5", "drake_5", "drake_worst", "drummaboy_1", "fetty_yeahbaby", "flava_1", "future_brrr", "gucci_1", "gucci_14", "gucci_4", "jayz_itsyoboy", "jayz5", "kendrick_tootoo", "khaled_blessup2", "khaled_majorkey3", "khaled_theydontwant", "khaled_wethebest", "liljon_2", "liljon_3", "nicki_laugh2", "pitbull_6", "ross_1", "ross_2", "schoolboy_yawk", "snoop_4", "soulja_5", "takeoff_money", "tpain1", "traviscott_straightup", "treysongz_uhunh", "trick_2", "waka_1", "weezy_4", "yg_skrrt"))
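One thing to watch with a hand-typed list like this: filter() with %in% silently skips any name that doesn't match, so a typo just means a missing sound. A quick setdiff() catches that. The vectors below are small stand-ins for lookup_table$wav_names and the hand-picked list, and gucci_brrr is a deliberately wrong name used only for illustration.

```r
# stand-in for lookup_table$wav_names
available = c("2chainz_tru", "gucci_14", "jayz5")

# stand-in for the hand-picked list; "gucci_brrr" is a deliberate typo
chosen = c("2chainz_tru", "gucci_14", "gucci_brrr")

# names we asked for that do not exist in the lookup table
not_found = setdiff(chosen, available)
if (length(not_found) > 0) {
  message("No such file(s): ", paste(not_found, collapse = ", "))
}
```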

3.3 Tidying file names

Luckily, the files tend to fall under a general artist_uniqueid naming convention. The next chunk cleans up the unique IDs. If a rapper has any soundboard sounds, you’ll be able to call the first one with skrrrahh("name"). To cycle through the rest, use skrrrahh("name1"), skrrrahh("name2"), etc. until you get an error.

# make the filenames more consistent 
filtered_names = selected %>% 
  group_by(artist) %>% 
  mutate(n = row_number()-1) %>% 
  mutate(newnames = paste0(artist, n)) 

# remove the trailing "0" so that you can call an artist's first file just by the artist name
# (str_replace is vectorized, and "0$" only matches a zero at the very end of the name)
filtered_names$newnames = str_replace(filtered_names$newnames, "0$", "")

# two are still a mess - let's fix these manually. 
filtered_names$newnames = str_replace(filtered_names$newnames, "jayz5", "jayz1")
filtered_names$newnames = str_replace(filtered_names$newnames, "tpain1", "tpain")
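If the numbering scheme is hard to picture, here is a dependency-free sketch of it on six rows. It is an illustration in base R, not the code used to build the package: ave() stands in for group_by() plus row_number(), and the name is built conditionally so there is no 0 to strip afterwards.

```r
# artist column for six of the selected files
artist = c("2chainz", "2chainz", "bigboi", "bigsean", "bigsean", "bigsean")

# position of each row within its artist, counting from 0
n = ave(seq_along(artist), artist, FUN = seq_along) - 1

# first sound gets the bare artist name, later ones get a numeric suffix
newnames = ifelse(n == 0, artist, paste0(artist, n))
print(newnames)

# the new names become file names, so they must be unique
stopifnot(anyDuplicated(newnames) == 0)
```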

Finally, I couldn’t make this package without including Big Shaq. He’s not on The Rap Board yet, so I made his clip in GarageBand and manually dragged it into inst/adlibs. That means this walkthrough is not entirely reproducible, but as Ralph Waldo Emerson says, “a foolish consistency is the hobgoblin of little minds”, so Please Do Not @ Me.


bigshaqdf = data.frame(wav_names = "bigshaq", artist = "bigshaq", n = 0, newnames = "bigshaq")
filtered_names = bind_rows(filtered_names, bigshaqdf) %>% 
  arrange(newnames)

Let’s take a look at the table we used to transfer the old names to the new ones:

knitr::kable(head(filtered_names))

wav_names        artist   n  newnames
2chainz_tru      2chainz  0  2chainz
2chainz_whistle  2chainz  1  2chainz1
bigboi_1         bigboi   0  bigboi
biggie_2         biggie   0  biggie
bigsean_boi2     bigsean  0  bigsean
bigsean_doit     bigsean  1  bigsean1

3.4 Renaming file paths from within R

Now I have to use the information in the data frame to rename the actual files. The easiest way for me to do that is to rename them while moving them into a new directory. I can then delete the entire old directory.

Conveniently, R packages keep any extra files that should ship with the package in the inst/ folder, so I have to get these babies there at some point. Let’s do it now.

# create a new directory, inst/adlibs 
dir.create("inst")
dir.create("inst/adlibs")

# make character vectors that map the old file paths to the new file paths 
filtered_names <- filtered_names %>% 
  mutate(old_filepaths = paste0("wav/", wav_names, ".wav"),
         new_filepaths = paste0("inst/adlibs/", newnames, ".wav"))

# rename the old paths to the new paths 
map2(filtered_names$old_filepaths, filtered_names$new_filepaths, file.rename)

# delete the old directory 
unlink("wav", recursive = TRUE, force = TRUE)
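Because unlink() is not reversible, it can be worth confirming that every file actually moved before deleting the old folder. The sketch below shows one way to do that in a throwaway directory under tempdir(), so it never touches real files. The two old/new name pairs are hypothetical stand-ins for rows of filtered_names, and the empty files are placeholders for real audio.

```r
# work in a scratch directory so the sketch is safe to run anywhere
root = file.path(tempdir(), "rename_sketch")
unlink(root, recursive = TRUE)
dir.create(file.path(root, "wav"), recursive = TRUE)
dir.create(file.path(root, "inst", "adlibs"), recursive = TRUE)

# hypothetical old -> new mapping, mirroring filtered_names
old_names = c("gucci_14", "jayz5")
new_names = c("gucci1", "jayz1")
old_paths = file.path(root, "wav", paste0(old_names, ".wav"))
new_paths = file.path(root, "inst", "adlibs", paste0(new_names, ".wav"))
file.create(old_paths)  # empty placeholder files

# only try to move files that exist (a manually added clip never lived in wav/)
present = file.exists(old_paths)
moved = file.rename(old_paths[present], new_paths[present])

# delete the old directory only if every move succeeded
if (all(moved)) {
  unlink(file.path(root, "wav"), recursive = TRUE)
}
```

file.rename() is vectorized and returns one TRUE or FALSE per file, which is what makes the all(moved) check possible.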