What's in a Name? Anti-Patterns to a Hard Problem
Source: sitepoint.com
If you wish to make an apple pie from scratch, you must first invent the universe. —Carl Sagan
We name and name and name. And name. Naming is notoriously difficult, but it’s not like we’re starting from scratch every time. We have habits, conventions, and a personal style.
Often we don’t give a name much thought, and we still do a reasonable job of it. Of course, sometimes our first idea is terrible.
There is no formula for choosing a name. In some situations our habits are no good. Our strategies—unspoken or not—fall short. Naming is fraught with ambiguity. A good name answers important questions. What does it contain? Why does it exist? Why is it important? What does it mean? How would I use it? What role does it play? But it can hardly answer all the important questions at once. A bad name is confusing or unhelpful. It misinforms and misleads.
There are some common strategies that harm more than they help. Recognizing an anti-pattern makes it easier to choose a better strategy. A better strategy tends to lead to a better name.
An anti-pattern is a common response to a recurring problem that is usually ineffective and risks being highly counterproductive. —Wikipedia
As with all things naming, an anti-pattern isn’t always the wrong choice. The usual admonitions apply: “usually”, “probably”, “maybe”, “use your judgement”, etc., etc.
Underlying Types and Data Structures
If you see a name that encodes an underlying type, such as word_string or new_hash, there’s almost always a better name waiting in the wings.
Type information is just not that compelling. It doesn’t answer any of the important questions. In most situations it’s irrelevant. The type is an implementation detail, and implementation details can change without fundamentally changing the solution.
def anagrams(string, string_array)
  string_array.select do |str|
    str != string && same_alphagram?(string, str)
  end
end
This code is simple. The names are correct but unhelpful.
One question you can ask yourself when faced with a bland collection of datatypes is:
What does it contain?
In the case of anagrams, it contains words.
def anagrams(word1, words)
  words.select do |word2|
    word1 != word2 && same_alphagram?(word1, word2)
  end
end
Now we have a different problem. We’re wording the words so that we can word. There’s no meaningful distinction between words, word1, and word2. We need to say something about how the words relate to each other in the context of detecting anagrams.
The original word or phrase is known as the subject of the anagram. —Wikipedia
So word1 is subject. The words that we’re looping through may or may not be anagrams. They’re potential_anagrams, but it’s a bit annoying to repeat anagram in the name. Another word for a potential match is a candidate.
def anagrams(subject, candidates)
  candidates.select do |candidate|
    subject != candidate && same_alphagram?(subject, candidate)
  end
end
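To try the renamed method, here is a minimal runnable sketch. The article never shows same_alphagram?, so the helper below (comparing sorted, downcased letters) is an assumption. So is the use of select, which makes the method return only the matching candidates.

```ruby
# Hypothetical helper: the article does not define same_alphagram?.
# Two words share an alphagram when their sorted letters match.
def same_alphagram?(subject, candidate)
  subject.downcase.chars.sort == candidate.downcase.chars.sort
end

# Assumes select, so the method returns only the candidates that match.
def anagrams(subject, candidates)
  candidates.select do |candidate|
    subject != candidate && same_alphagram?(subject, candidate)
  end
end

anagrams("listen", ["enlist", "google", "inlets", "listen"])
# => ["enlist", "inlets"]
```

The subject itself is filtered out, which is why the names subject and candidate earn their keep: the comparison reads as a rule of the domain rather than a string check.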
When computing Scrabble scores we run into the same thing.
def compute_score(chars)
  chars.inject(0) { |num, char|
    num + char_to_num[char]
  }
end
Again, ask yourself what the variables contain, this time in the context of a game of Scrabble. The num is the thing we’re computing, the score. The char is a letter or tile. The hash of characters to numbers contains the point value for each tile.
def compute_score(tiles)
  tiles.inject(0) { |score, tile|
    score + points[tile]
  }
end
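For a runnable version, the sketch below supplies the piece the article leaves out: where points comes from. The POINTS constant and the points method are assumptions made for illustration, and the table holds only a handful of tile values.

```ruby
# Hypothetical lookup table: a small subset of standard Scrabble tile values.
POINTS = { "A" => 1, "E" => 1, "K" => 5, "Q" => 10, "U" => 1, "Z" => 10 }.freeze

# points is a method here so the body matches the article's points[tile].
def points
  POINTS
end

def compute_score(tiles)
  tiles.inject(0) { |score, tile| score + points[tile] }
end

compute_score(%w[Q U A K E])
# => 18
```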
Using the data type in the name is not always an anti-pattern.
When the scope is small it can be redundant to give a variable a more expressive name. The context already answers the important questions about it. There’s no reason to bloat the code with extra descriptions. Just use the type, such as s for a string, or i for an int.
Sometimes the name of the data structure helps clarify important details. A queue is a concept that is familiar to programmers. The name jobs might communicate your intent. But maybe not. If the FIFO (first in, first out) aspect is crucial, then job_queue might be better. It expresses what the thing contains, as well as how to use it.
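As a small illustration of a name that promises FIFO behavior, the sketch below uses Ruby’s built-in Queue class. The job names are made up, and nothing in the article says its jobs were stored this way.

```ruby
# Queue is Ruby's built-in thread-safe FIFO (Thread::Queue).
job_queue = Queue.new

# The job names are invented for illustration.
job_queue << "resize_image"
job_queue << "send_email"

first_job = job_queue.pop   # => "resize_image", the oldest job comes out first
second_job = job_queue.pop  # => "send_email"
```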
Structural
Another common strategy is to name things for their role in the program. It’s the input or the output. It’s the recurring phrase or the middle sentence. It’s a memo or sum or result.
Here’s some code that counts differences, a simplification of an algorithm known as the Hamming Distance.
def self.compute(first, second)
  first.length.times.count { |i|
    first[i] != second[i]
  }
end
The algorithm is expressive enough, but the names first and second seem pretty arbitrary. They’re the first and second parameters, but does the order even matter? It’s unclear. And first and second what?
First and second DNA strand. Duh.
It turns out, order doesn’t matter. We only care how many mutations there are between two similar strands.
The name strand answers the question of what it is. A simple suffix to differentiate between the two is enough. We don’t need to tell more of a story than that. We could use A and B, which don’t emphasize order quite as much as 1 and 2.
def self.compute(strandA, strandB)
  strandA.length.times.count { |i|
    strandA[i] != strandB[i]
  }
end
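A quick way to check that order really does not matter is to call the method both ways. The article shows only self.compute, so the Hamming module wrapped around it is an assumption, and the sketch assumes both strands have the same length.

```ruby
# Hypothetical wrapper: the article does not show the enclosing class or module.
module Hamming
  # Counts the positions at which the two strands differ.
  def self.compute(strandA, strandB)
    strandA.length.times.count { |i|
      strandA[i] != strandB[i]
    }
  end
end

Hamming.compute("GGACG", "GGTCG")  # => 1
Hamming.compute("GGTCG", "GGACG")  # => 1, swapping the arguments changes nothing
```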
Here’s the Scrabble scoring method from earlier, with structural names.
def compute_score(input)
  input.inject(0) { |sum, x|
    sum + lookup[x]
  }
end
The interesting thing about input isn’t that it happens to get passed to the method as an argument. The interesting bit is what it contains, which in this case is Scrabble tiles. Likewise, the sum isn’t any old sum, it’s someone’s score. It’s an undeniable fact that we’re looking something up, but lookup explains nothing essential. The question is what are you looking up? Points. There’s drama here if you look for it.
Idea Fragment
This is an alluring trap in Ruby, and once you see it you can’t unsee it. It’s everywhere.
The reason it’s so seductive is that it leads to many small methods.
“Wait, what?”
Yeah, sorry. It’s not that small methods are bad. It’s that everything is a trade-off. It turns out there are more important things than SLOC (source lines of code). Who knew?
Here’s a method from some code for scheduling meetups:
def prev_or_next_day(date, date_type)
  date_type == :last ? date.prev_day : date.next_day
end
The name of the method repeats the conditional that it contains.
There’s no good name for this, because the method doesn’t isolate an entire idea. It takes a small sliver of an idea and sticks it in a method. When each method represents a fragment of a concept, the solution becomes incomprehensible. You can fit all the individual pieces in your head, but they don’t form a coherent picture.
The solution here is to inline it back to where it came from along with all the other shards of ideas that are in arbitrarily defined methods throughout the code. Then—once everything is in the same place—you’re more likely to find and name the whole idea.
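The article does not show the code that prev_or_next_day was extracted from, so the following is only a guess at what the whole idea might look like once the fragment is inlined. It assumes the scheduler was searching for a given weekday at the start or end of a month. The method name meetup_day and all of its parameters are hypothetical.

```ruby
require "date"

# Hypothetical: assumes the surrounding code was looking for a given weekday
# (0 = Sunday ... 6 = Saturday) at the start or the end of a month.
def meetup_day(year, month, weekday, schedule)
  if schedule == :last
    date = Date.new(year, month, -1)                  # last day of the month
    date = date.prev_day until date.wday == weekday   # walk backward
  else
    date = Date.new(year, month, 1)                   # first day of the month
    date = date.next_day until date.wday == weekday   # walk forward
  end
  date
end

meetup_day(2013, 5, 1, :first)  # first Monday of May 2013
meetup_day(2013, 5, 1, :last)   # last Monday of May 2013
```

With the conditional back in context, the direction of travel is part of one nameable idea (finding the meetup day) instead of a helper whose only possible name restates its body.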
Implementation Fragment
Sometimes the method isolates the complete thought, but the method name misses the mark.
Some code to generate the lyrics to The 99 Bottles of Beer song had this method in it.
def bottle_or_bottles(quantity)
  if quantity == 1
    "bottle"
  else
    "bottles"
  end
end
(The above is taken from an upcoming book on using this song to study OOP. Full disclosure, I am one of the collaborators of the book.)
This, too, repeats the conditional. Bottle and bottles are two different instances of a single concept. Other instances of that same concept might be “growler” or “keg” or “six-pack”.
def container(quantity)
  if quantity == 1
    "bottle"
  else
    "bottles"
  end
end
A good name doesn’t join the implementation in the weeds. It lifts its eyes a bit and sees a bigger picture.
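One way to judge the name is to read it at a call site. In the sketch below, on_the_wall is a hypothetical caller that does not appear in the source. It exists only to show how container reads in use.

```ruby
def container(quantity)
  if quantity == 1
    "bottle"
  else
    "bottles"
  end
end

# Hypothetical caller, not from the source.
def on_the_wall(quantity)
  "#{quantity} #{container(quantity)} of beer on the wall"
end

on_the_wall(2)  # => "2 bottles of beer on the wall"
on_the_wall(1)  # => "1 bottle of beer on the wall"
```

The caller asks for a container and never needs to know that the answer happens to be one of two spellings of bottle.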
Here’s a method found in code to generate the lyrics to the song that goes “I know an old lady who swallowed a fly”.
def swallowed
  "She swallowed the #{predator} to catch the #{prey}."
end
Predator and prey are great names. They explain what the variables contain, as well as how they’re related to each other. But swallowed doesn’t help the reader much.
The author took a small piece of the implementation, and echoed it for the name.
A method should name an idea, not a random little piece of an idea. The song is about a little old lady who inexplicably swallows a fly. She then compounds the problem by swallowing larger and larger creatures. This method has isolated the part of the song that tries to explain why someone would do such a thing. It explains the reasoning behind her choices. Her motivation.
def motivation(predator, prey)
  "She swallowed the #{predator} to catch the #{prey}."
end
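To see the name at work, the sketch below builds a few lines of the song. The creatures list and the each_cons pairing are assumptions about how a caller might use the method, not something taken from the original code.

```ruby
def motivation(predator, prey)
  "She swallowed the #{predator} to catch the #{prey}."
end

# Each creature is swallowed to catch the one before it in the list.
creatures = %w[fly spider bird cat]

lines = creatures.each_cons(2).map { |prey, predator| motivation(predator, prey) }
# => ["She swallowed the spider to catch the fly.",
#     "She swallowed the bird to catch the spider.",
#     "She swallowed the cat to catch the bird."]
```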
Conclusion
Each example was problematic in a different way, but the strategy to fix them was similar. The first step was to describe the problem in English. The programming terms might end up being important later. For now, just find the words from that domain.
Scrabble has points and scores and tiles.
Anagrams are about words. But not just words. Words that relate to each other in a specific way. A subject and candidates.
The Hamming distance between two DNA strands is not any old sum, it’s a count of mutations.
Any song can have a first line and a last line. Many songs will have recurring sentences. There’s a difference between a song about drinking beer and one about swallowing critters. Those differences matter.
Make meaningful distinctions. Remove gratuitous or unnecessary details.
In short, tell a good story.