C++: Hungarian Notation

Hungarian Notation is a naming convention in which the type and/or scope of a variable is used as a naming prefix for that variable. For example:

int value; // non-Hungarian

int nValue; // the n prefix denotes an integer

double width; // non-Hungarian

double dWidth; // the d prefix denotes a double

Hungarian Notation was invented by Charles Simonyi, who developed it at Xerox PARC in the 1970s and later brought it to Microsoft. The original idea of Hungarian Notation was to encode information about the variable’s purpose, which is known as Apps Hungarian. Over time, this focus changed to encoding information about the variable’s type and/or scope, which is known as Systems Hungarian.

There is a lot of controversy over whether Hungarian Notation is useful in modern programming languages and with modern IDEs. We believe the advantages still outweigh the disadvantages, though you will find plenty of programmers who disagree.

One advantage of Hungarian Notation is that the variable type can be determined from its name. Many argue that this is an obsolete advantage, because most modern IDEs will tell you a variable’s type if you mouse-hover over the name. However, consider the following snippet:

float applesPerPerson = totalApples / totalPersons;

Casually browsing the code, this statement would probably not attract notice. But there is a good chance it’s wrong. If totalApples and totalPersons are both integers, the compiler will evaluate totalApples / totalPersons using integer division, causing any fractions to be lost before the value is assigned to applesPerPerson. Thus, if totalApples = 5, and totalPersons = 3, applesPerPerson will be assigned 1 instead of the expected 1.66!

However, if we use Hungarian Notation variable names:

float fApplesPerPerson = nTotalApples / nTotalPersons;

The n prefixes make it clear from just browsing the code that this is an integer division that’s going to cause us problems! Furthermore, as you code, the n prefix will remind you to watch out for integer division and overflow issues every time you use an integer variable in an expression or statement.

Another advantage of Hungarian Notation is that it gives us a way to name variables using shorthand. For example, bError is understood to mean isError, and nApples is a shorthand way of writing numberOfApples.

One perceived disadvantage of Hungarian Notation is that it leads to extra work when a variable’s type changes. For example, it is common to declare an integer variable and then later change it to a double variable because you need to deal with fractional values. Without using Hungarian Notation, you could change int value to double value and go on your merry way. However, in Hungarian Notation, you’d not only have to change the declaration int nValue to double dValue, you’d have to change every use of nValue in your entire program to dValue! If you do not, your naming scheme will be misleading and inconsistent.

While replacing a potentially huge number of variable names is certainly a nuisance, we believe it is also a good thing. Because different types have different behaviors, having to explicitly replace your variable names encourages you to examine your code to ensure you’re not doing anything dangerous with the new type.

For example, without Hungarian Notation, you might have written

if (value == 0)

    // do something

When value is changed from an int to a double, your safe integer comparison is now an unsafe floating point comparison that may produce unexpected results! In the best case, this error shows up when testing your program, and you have to spend time debugging it. In the worst case, the bug ships and you end up with millions of customers who have software that doesn’t work right!

However, if you’d used Hungarian Notation and written:

if (nValue == 0)

    // do something

And were forced to change it to:

if (dValue == 0.0)

    // do something

Hopefully at this point you’d say, “Hey, wait a second, I shouldn’t be doing naked comparisons with floating point values!”. Then you could modify it to something more appropriate and move on. In the long run, this can actually save you lots of time.

A real disadvantage of traditional Hungarian Notation is that the number of prefixes for compound types can become confusing. Wikipedia provides an appropriate example: “a_crszkvc30LastNameCol : a constant reference function argument, holding contents of a database column of type varchar(30) called LastName that was part of the table’s primary key”. a_crszkvc is non-trivial to decipher, and makes your code less clear.

As an aside, Hungarian Notation got its name partly from Simonyi’s Hungarian nationality and partly from prefixes such as a_crszkvc, which look like they’re written in Hungarian!

Caste Hungarian

Different programmers and/or companies tend to use different varieties of Systems Hungarian of varying complexity. Although most of them have some commonality (like using a d prefix for double, and an n (or i) prefix for integers), there is a lot of variation as to which types get what prefixes, and how those prefixes should combine.

We believe that using a different prefix for each data type is overkill, especially in the case of structs and classes, which can be user defined to a high degree. Furthermore, long Hungarian-looking prefixes obscure code clarity more than they help it. Consequently, we advocate a simplified version of Systems Hungarian called “Caste Hungarian”. In Caste Hungarian, Hungarian Notation is used mostly to denote which “caste” of data type a variable falls into (integers, floating points, classes, etc.).

Variable prefixes are composed of 3 parts: a scope modifier, a type modifier, and a type prefix (in that order). The scope modifier and type modifier are omitted when they do not apply. Consequently, the overall prefix length is kept reasonable, with the average prefix length being around 2 letters. This system conveys most of the advantages of Hungarian Notation without many of its disadvantages, and it keeps the entire system simple and easy to use.

The type prefix indicates the data type of the variable.

Type prefix	Meaning	Example
b	boolean	bool bHasEffect;
c (or none*)	class	Creature cMonster;
ch	char (used as a char)	char chLetterGrade;
d	double, long double	double dPi;
e	enum	Color eColor;
f	float	float fPercent;
n	short, int, long, char used as an integer	int nValue;
s	struct	Rectangle sRect;
str	C++ string	std::string strName;
sz	Null-terminated string	char szName[20];

The following type modifiers are placed before the prefix if they apply:

Type modifier	Meaning	Example
a	array on stack	int anValue[10];
p	pointer	int* pnValue;
pa	dynamic array	int* panValue = new int[10];
r	reference	int& rnValue = nValue;
u	unsigned	unsigned int unValue;

The following scope modifiers are placed before the type modifier if they apply:

Scope modifier	Meaning	Example
g_	global variable	int g_nGlobalValue;
m_	member of class	int m_nMemberValue;
s_	static member of class	int s_nValue;

A few notes:

This list is not exhaustive. It is meant to cover the most common cases. If you feel a variable of a different type deserves its own prefix, give it one!
Use meaningful variable names and suffixes to clarify your variables. This is especially important with struct and class variables. For example, a Rectangle struct variable holding the position and size of a window is better declared as Rectangle sWindowRect; than Rectangle sWindow;
Char has a different prefix depending on whether it’s being used as an ASCII character or integer. This helps clarify its intended use and prevent mistakes.
Float has a different prefix than double because floating point literals are doubles by default. Float literals need an f suffix.
Typedefs don’t fall very well into this system.
The ‘c’ prefix for a class can be omitted if the variable is a pointer or a reference to a class.
Because integer types are not differentiated, you can easily change to a larger or smaller integer as needed without changing the variable name. However, changing to a smaller integer is generally not recommended due to potential overflow issues.

Here are a few sample declarations:

int nIndex; // simple integer type prefix

int* pnIndex;  // a pointer to an integer

int m_nIndex; // an integer variable that is a member of a class

int* m_pnIndex; // a pointer to an integer variable that is a member of a class
