Katana VentraIP

Magic number (programming)

In computer programming, a magic number is any of the following:

It is easier to read and understand. A programmer reading the first example might wonder, What does the number 52 mean here? Why 52? The programmer might infer the meaning after reading the code carefully, but it is not obvious. Magic numbers become particularly confusing when the same number is used for different purposes in one section of code.

[6]

It is easier to alter the value of the number, as it is not duplicated. Changing the value of a magic number is error-prone, because the same value is often used several times in different places within a program. Also, when two semantically distinct variables or numbers have the same value they may be accidentally both edited together.[6] To modify the first example to shuffle a Tarot deck, which has 78 cards, a programmer might naively replace every instance of 52 in the program with 78. This would cause two problems. First, it would miss the value 53 on the second line of the example, which would cause the algorithm to fail in a subtle way. Second, it would likely replace the characters "52" everywhere, regardless of whether they refer to the deck size or to something else entirely, such as the number of weeks in a Gregorian calendar year, or more insidiously, are part of a number like "1523", all of which would introduce bugs. By contrast, changing the value of the deckSize variable in the second example would be a simple, one-line change.

[6]

It encourages and facilitates documentation. The single place where the named variable is declared makes a good place to document what the value means and why it has the value it does. Having the same value in a plethora of places either leads to duplicate comments (and attendant problems when updating some but missing some) or leaves no one place where it's both natural for the author to explain the value and likely the reader shall look for an explanation.

[6]

The declarations of "magic number" variables are placed together, usually at the top of a function or file, facilitating their review and change.

[6]

It helps detect . Using a variable (instead of a literal) takes advantage of a compiler's checking. Accidentally typing "62" instead of "52" would go undetected, whereas typing "dekSize" instead of "deckSize" would result in the compiler's warning that dekSize is undeclared.

typos

It can reduce typing in some . If an IDE supports code completion, it will fill in most of the variable's name from the first few letters.

IDEs

It facilitates parameterization. For example, to generalize the above example into a procedure that shuffles a deck of any number of cards, it would be sufficient to turn deckSize into a parameter of that procedure, whereas the first example would require several changes.

Format indicators [edit]

Origin [edit]

Format indicators were first used in early Version 7 Unix source code.


Unix was ported to one of the first DEC PDP-11/20s, which did not have memory protection. So early versions of Unix used the relocatable memory reference model.[7] Pre-Sixth Edition Unix versions read an executable file into memory and jumped to the first low memory address of the program, relative address zero. With the development of paged versions of Unix, a header was created to describe the executable image components. Also, a branch instruction was inserted as the first word of the header to skip the header and start the program. In this way a program could be run in the older relocatable memory reference (regular) mode or in paged mode. As more executable formats were developed, new constants were added by incrementing the branch offset.[8]


In the Sixth Edition source code of the Unix program loader, the exec() function read the executable (binary) image from the file system. The first 8 bytes of the file was a header containing the sizes of the program (text) and initialized (global) data areas. Also, the first 16-bit word of the header was compared to two constants to determine if the executable image contained relocatable memory references (normal), the newly implemented paged read-only executable image, or the separated instruction and data paged image.[9] There was no mention of the dual role of the header constant, but the high order byte of the constant was, in fact, the operation code for the PDP-11 branch instruction (octal 000407 or hex 0107). Adding seven to the program counter showed that if this constant was executed, it would branch the Unix exec() service over the executable image eight byte header and start the program.


Since the Sixth and Seventh Editions of Unix employed paging code, the dual role of the header constant was hidden. That is, the exec() service read the executable file header (meta) data into a kernel space buffer, but read the executable image into user space, thereby not using the constant's branching feature. Magic number creation was implemented in the Unix linker and loader and magic number branching was probably still used in the suite of stand-alone diagnostic programs that came with the Sixth and Seventh Editions. Thus, the header constant did provide an illusion and met the criteria for magic.


In Version Seven Unix, the header constant was not tested directly, but assigned to a variable labeled ux_mag[10] and subsequently referred to as the magic number. Probably because of its uniqueness, the term magic number came to mean executable format type, then expanded to mean file system type, and expanded again to mean any type of file.

GUIDs [edit]

It is possible to create or alter globally unique identifiers (GUIDs) so that they are memorable, but this is highly discouraged as it compromises their strength as near-unique identifiers.[17][18] The specifications for generating GUIDs and UUIDs are quite complex, which is what leads to them being virtually unique, if properly implemented. .


Microsoft Windows product ID numbers for Microsoft Office products sometimes end with 0000-0000-0000000FF1CE ("OFFICE"), such as {90160000-008C-0000-0000-0000000FF1CE}, the product ID for the "Office 16 Click-to-Run Extensibility Component".


Java uses several GUIDs starting with CAFEEFAC.[19]


In the GUID Partition Table of the GPT partitioning scheme, BIOS Boot partitions use the special GUID {21686148-6449-6E6F-744E-656564454649}[20] which does not follow the GUID definition; instead, it is formed by using the ASCII codes for the string "Hah!IdontNeedEFI" partially in little endian order.[21]

They should not be useful; that is, most algorithms that operate on them should be expected to do something unusual. Numbers like zero don't fit this criterion.

They should be easily recognized by the programmer as invalid values in the debugger.

On machines that don't have , they should be odd numbers, so that dereferencing them as addresses causes an exception.

byte alignment

They should cause an exception, or perhaps even a debugger break, if executed as code.

Magic debug values are specific values written to memory during allocation or deallocation, so that it will later be possible to tell whether or not they have become corrupted, and to make it obvious when values taken from uninitialized memory are being used. Memory is usually viewed in hexadecimal, so memorable repeating or hexspeak values are common. Numerically odd values may be preferred so that processors without byte addressing will fault when attempting to use them as pointers (which must fall at even addresses). Values should be chosen that are away from likely addresses (the program code, static data, heap data, or the stack). Similarly, they may be chosen so that they are not valid codes in the instruction set for the given architecture.


Since it is very unlikely, although possible, that a 32-bit integer would take this specific value, the appearance of such a number in a debugger or memory dump most likely indicates an error such as a buffer overflow or an uninitialized variable.


Famous and common examples include:


Most of these are each 32 bits long – the word size of most 32-bit architecture computers.


The prevalence of these values in Microsoft technology is no coincidence; they are discussed in detail in Steve Maguire's book Writing Solid Code from Microsoft Press. He gives a variety of criteria for these values, such as:


Since they were often used to mark areas of memory that were essentially empty, some of these terms came to be used in phrases meaning "gone, aborted, flushed from memory"; e.g. "Your program is DEADBEEF".

Magic string

File format § Magic number

List of file signatures

FourCC

Hard coding

Magic (programming)

(Not a Number)

NaN

Enumerated type

for another list of magic values

Hexspeak

about magic constants in cryptographic algorithms

Nothing up my sleeve number

for problems that can be caused by magics

Time formatting and storage bugs

(aka flag value, trip value, rogue value, signal value, dummy data)

Sentinel value

special value to detect buffer overflows

Canary value

XYZZY (magic word)

an algorithm that uses the constant 0x5F3759DF

Fast inverse square root