C (programming language)

C (pronounced /ˈsiː/ – like the letter c)^[6] is a general-purpose computer programming language. It was created in the 1970s by Dennis Ritchie, and remains very widely used and influential. By design, C's features cleanly reflect the capabilities of the targeted CPUs. It has found lasting use in operating systems, device drivers, and protocol stacks, but its use in application software has been decreasing.^[7] C is commonly used on computer architectures that range from the largest supercomputers to the smallest microcontrollers and embedded systems.

Not to be confused with C++ or C#.

Paradigm: Multi-paradigm: imperative (procedural), structured

Designed by: Dennis Ritchie

Developer: ANSI X3J11 (ANSI C); ISO/IEC JTC 1 (Joint Technical Committee 1) / SC 22 (Subcommittee 22) / WG 14 (Working Group 14) (ISO C)

First appeared: 1972^[2]

Stable release: C17 / June 2018

Preview release: C23 (N3096) / April 2, 2023^[3]

Typing discipline: Static, weak, manifest, nominal

OS: Cross-platform

Filename extensions: .c, .h

Website: www.iso.org/standard/74528.html, www.open-std.org/jtc1/sc22/wg14/

A successor to the programming language B, C was originally developed at Bell Labs by Ritchie between 1972 and 1973 to construct utilities running on Unix. It was applied to re-implementing the kernel of the Unix operating system.^[8] During the 1980s, C gradually gained popularity. It has become one of the most widely used programming languages,^[9]^[10] with C compilers available for practically all modern computer architectures and operating systems. The book The C Programming Language, co-authored by the original language designer, served for many years as the de facto standard for the language.^[11]^[1] C has been standardized since 1989 by the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO).

C is an imperative procedural language, supporting structured programming, lexical variable scope, and recursion, with a static type system. It was designed to be compiled to provide low-level access to memory and language constructs that map efficiently to machine instructions, all with minimal runtime support. Despite its low-level capabilities, the language was designed to encourage cross-platform programming. A standards-compliant C program written with portability in mind can be compiled for a wide variety of computer platforms and operating systems with few changes to its source code.

Since 2000, C has consistently ranked among the top two languages in the TIOBE index, a measure of the popularity of programming languages.^[12]

The language has a small, fixed number of keywords, including a full set of control flow primitives: if/else, for, do/while, while, and switch. User-defined names are not distinguished from keywords by any kind of sigil.

It has a large number of arithmetic, bitwise, and logic operators: +, +=, ++, &, ||, etc.

More than one assignment may be performed in a single statement.

Function and data pointers permit ad hoc run-time polymorphism.

Data typing is static, but weakly enforced; all data has a type, but implicit conversions are possible.

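Heterogeneous aggregate data types (struct) allow related data elements to be accessed and assigned as a unit.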

Low-level access to computer memory is possible by converting machine addresses to pointers.

Procedures (subroutines not returning values) are a special case of function, with an empty return type void.

Memory can be allocated to a program with calls to library routines.

A preprocessor performs macro definition, source code file inclusion, and conditional compilation.

There is a basic form of modularity: files can be compiled separately and linked together, with control over which functions and data objects are visible to other files via static and extern attributes.

Complex functionality such as I/O, string manipulation, and mathematical functions is consistently delegated to library routines.

The generated code after compilation has relatively straightforward needs on the underlying platform, which makes it suitable for creating operating systems and for use in embedded systems.

The first edition of The C Programming Language (K&R) introduced several language features:

Standard I/O library

long int data type

unsigned int data type

Compound assignment operators of the form =op (such as =-) were changed to the form op= (that is, -=) to remove the semantic ambiguity created by constructs such as i=-10, which had been interpreted as i =- 10 (decrement i by 10) instead of the possibly intended i = -10 (let i be −10).

The basic C source character set includes the following characters:

Lowercase and uppercase letters of ISO Basic Latin Alphabet: a–z A–Z

Decimal digits: 0–9

Graphic characters: ! " # % & ' ( ) * + , - . / : ; < = > ? [ \ ] ^ _ { | } ~

Whitespace characters: space, horizontal tab, vertical tab, form feed, newline

One of the most important functions of a programming language is to provide facilities for managing memory and the objects that are stored in memory. C provides three principal ways to allocate memory for objects:^[35]

Static memory allocation: space for the object is provided in the binary at compile-time; these objects have an extent (or lifetime) as long as the binary which contains them is loaded into memory.

Automatic memory allocation: temporary objects can be stored on the stack, and this space is automatically freed and reusable after the block in which they are declared is exited.

Dynamic memory allocation: blocks of memory of arbitrary size can be requested at run-time using library functions such as malloc from a region of memory called the heap; these blocks persist until subsequently freed for reuse by calling the library function realloc or free.

These three approaches are appropriate in different situations and have various trade-offs. For example, static memory allocation has little allocation overhead, automatic allocation may involve slightly more overhead, and dynamic memory allocation can potentially have a great deal of overhead for both allocation and deallocation. The persistent nature of static objects is useful for maintaining state information across function calls, automatic allocation is easy to use but stack space is typically much more limited and transient than either static memory or heap space, and dynamic memory allocation allows convenient allocation of objects whose size is known only at run-time. Most C programs make extensive use of all three.

Where possible, automatic or static allocation is usually simplest because the storage is managed by the compiler, freeing the programmer of the potentially error-prone chore of manually allocating and releasing storage. However, many data structures can change in size at runtime, and since static allocations (and automatic allocations before C99) must have a fixed size at compile-time, there are many situations in which dynamic allocation is necessary.^[35] Prior to the C99 standard, variable-sized arrays were a common example of this. (See the article on malloc for an example of dynamically allocated arrays.) Unlike automatic allocation, which can fail at run time with uncontrolled consequences, the dynamic allocation functions return an indication (in the form of a null pointer value) when the required storage cannot be allocated. (Static allocation that is too large is usually detected by the linker or loader, before the program can even begin execution.)

Unless otherwise specified, static objects contain zero or null pointer values upon program startup. Automatically and dynamically allocated objects are initialized only if an initial value is explicitly specified; otherwise they initially have indeterminate values (typically, whatever bit pattern happens to be present in the storage, which might not even represent a valid value for that type). If the program attempts to access an uninitialized value, the results are undefined. Many modern compilers try to detect and warn about this problem, but both false positives and false negatives can occur.

Heap memory must be released once it is no longer needed so that it can be reused. For example, if the only pointer to a heap memory allocation goes out of scope or has its value overwritten before it is deallocated explicitly, then that memory cannot be recovered for later reuse and is essentially lost to the program, a phenomenon known as a memory leak. Conversely, it is possible for memory to be freed but still referenced afterwards, leading to unpredictable results. Typically, the failure symptoms appear in a portion of the program unrelated to the code that causes the error, making it difficult to diagnose the failure. Such issues are ameliorated in languages with automatic garbage collection.

The C language permits platform hardware and memory to be accessed with pointers and type punning, so system-specific features (e.g. Control/Status Registers, I/O registers) can be configured and used with code written in C – it allows fullest control of the platform it is running on.

The code generated after compilation does not demand many system features, and can be invoked from some boot code in a straightforward manner – it is simple to execute.

The C language statements and expressions typically map well on to sequences of instructions for the target processor, and consequently there is a low run-time demand on system resources – it is fast to execute.

With its rich set of operators, the C language can utilise many of the features of target CPUs. Where a particular CPU has more esoteric instructions, a language variant can be constructed with perhaps intrinsic functions to exploit those instructions – it can use practically all the target CPU's features.

The language makes it easy to overlay structures onto blocks of binary data, allowing the data to be comprehended, navigated and modified – it can write data structures, even file systems.

The language supports a rich set of operators, including bit manipulation, for integer arithmetic and logic, and perhaps different sizes of floating point numbers – it can process appropriately-structured data effectively.

C is a fairly small language, with only a handful of statements, and without too many features that generate extensive target code – it is comprehensible.

C has direct control over memory allocation and deallocation, which gives reasonable efficiency and predictable timing to memory-handling operations, without any concerns for sporadic stop-the-world garbage collection events – it has predictable performance.

C permits the use and implementation of different memory allocation schemes, including a typical malloc and free; a more sophisticated mechanism with arenas; or a version for an OS kernel that may suit DMA, use within interrupt handlers, or integration with the virtual memory system.

Depending on the linker and environment, C code can also call libraries written in assembly language, and may be called from assembly language – it interoperates well with other lower-level code.

C and its calling conventions and linker structures are commonly used in conjunction with other high-level languages, with calls both to C and from C supported – it interoperates well with other high-level code.

C has a very mature and broad ecosystem, including libraries, frameworks, open source compilers, debuggers and utilities, and is the de facto standard. It is likely the drivers already exist in C, or that there is a similar CPU architecture as a back-end of a C compiler, so there is reduced incentive to choose another language.

While C has been popular, influential and hugely successful, it has drawbacks, including:

The standard dynamic memory handling with malloc and free is error prone. Bugs include memory leaks, when memory is allocated but not freed, and access to previously freed memory.

The use of pointers and the direct manipulation of memory means corruption of memory is possible, perhaps due to programmer error or insufficient checking of bad data.

There is some type checking, but it does not apply to areas like variadic functions, and the type checking can be trivially or inadvertently circumvented. It is weakly typed.

Since the code generated by the compiler contains few checks itself, there is a burden on the programmer to consider all possible outcomes: to protect against buffer overruns, out-of-bounds array accesses, stack overflows, and memory exhaustion, and to consider race conditions, thread isolation, etc.

The use of pointers and the run-time manipulation of these means there may be two ways to access the same data (aliasing), which is not determinable at compile time. This means that some optimisations that may be available to other languages are not possible in C. For this reason, FORTRAN is considered faster.

Some of the standard library functions, e.g. scanf or strncat, can lead to buffer overruns.

There is limited standardisation in support for low-level variants in generated code, for example: different function calling conventions and ABI; different structure packing conventions; different byte ordering within larger integers (including endianness). In many language implementations, some of these options may be handled with the preprocessor directive #pragma,^[55]^[56] and some with additional keywords, e.g. use of the __cdecl calling convention. But the directive and options are not consistently supported.^[57]

String handling using the standard library is code-intensive, with explicit memory management required.

The language does not directly support object orientation, introspection, run-time expression evaluation, generics, etc.

There are few guards against inappropriate use of language features, which may lead to unmaintainable code. In particular, the C preprocessor can hide troubling effects such as double evaluation and worse.^[58] This facility for tricky code has been celebrated with competitions such as the International Obfuscated C Code Contest and the Underhanded C Contest.

C lacks standard support for exception handling and only offers return codes for error checking. The setjmp and longjmp standard library functions have been used^[59] to implement a try-catch mechanism via macros.

For some purposes, restricted styles of C have been adopted, e.g. MISRA C or CERT C, in an attempt to reduce the opportunity for bugs. Databases such as CWE attempt to catalogue the ways in which C and other languages are vulnerable, along with recommendations for mitigation.

There are tools that can mitigate some of the drawbacks. Contemporary C compilers include checks which may generate warnings to help identify many potential bugs.

Some of these drawbacks have prompted the construction of other languages.

Compatibility of C and C++

Comparison of Pascal and C

Comparison of programming languages

International Obfuscated C Code Contest

List of C-based programming languages

List of C compilers

Ritchie, Dennis M. (1993). "The Development of the C Language". The Second ACM SIGPLAN Conference on History of Programming Languages (HOPL-II). ACM. pp. 201–208. doi:10.1145/154766.155580. ISBN 0-89791-570-4. Retrieved November 4, 2014.

Kernighan, Brian W.; Ritchie, Dennis M. (1988). The C Programming Language (2nd ed.). Prentice Hall. ISBN 0-13-110362-8.

Plauger, P.J. (1992). The Standard C Library (1 ed.). Prentice Hall. ISBN 978-0131315099.

Banahan, M.; Brady, D.; Doran, M. (1991). The C Book: Featuring the ANSI C Standard (2 ed.). Addison-Wesley. ISBN 978-0201544336.

Harbison, Samuel; Steele, Guy Jr. (2002). C: A Reference Manual (5 ed.). Pearson. ISBN 978-0130895929.

King, K.N. (2008). C Programming: A Modern Approach (2 ed.). W. W. Norton. ISBN 978-0393979503.

Griffiths, David; Griffiths, Dawn (2012). Head First C (1 ed.). O'Reilly. ISBN 978-1449399917.

Perry, Greg; Miller, Dean (2013). C Programming: Absolute Beginner's Guide (3 ed.). Que. ISBN 978-0789751980.

Deitel, Paul; Deitel, Harvey (2015). C: How to Program (8 ed.). Pearson. ISBN 978-0133976892.

Gustedt, Jens (2019). Modern C (2 ed.). Manning. ISBN 978-1617295812.

ISO C Working Group official website

ISO/IEC 9899

comp.lang.c Frequently Asked Questions

A History of C, by Dennis Ritchie

C Library Reference and Examples