This page is intended to be a quick reference to C assuming you are already familiar with C++.
C is (mostly) a subset of C++. There’s (almost) no new language to learn, but there are some things to unlearn and some simpler replacements for C++’s more complicated components.
No templates or namespaces. If the C++ code includes a template (e.g. `x<y>`), it’s not permitted in C. If it contains a namespace (e.g. `std::y` or `using namespace std`), it’s also not permitted in C.
No classes. The `class` and `this` keywords are C++ additions not present in C. (Other class-specific keywords, like `friend`, `public`, `virtual`, and so on, are also not in C.) A `struct` can store data like a class, but can’t have member methods. Object orientation can still be implemented using pointers to structs and function-pointer members, but this is rarely done in practice.
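As a rough illustration of the struct-plus-function-pointer approach mentioned above, here is a minimal sketch. The names `shape`, `rect_area`, `triangle_area`, and `shape_area` are all hypothetical, invented for this example.

```c
// A hypothetical "class": data plus a function pointer that acts like a method.
typedef struct shape {
    double width, height;
    double (*area)(const struct shape *self); // stands in for a virtual method
} shape;

// Two interchangeable "method" implementations.
double rect_area(const shape *self) {
    return self->width * self->height;
}
double triangle_area(const shape *self) {
    return 0.5 * self->width * self->height;
}

// The object is passed explicitly where C++ would pass `this` implicitly.
double shape_area(const shape *s) {
    return s->area(s);
}
```

With `shape r = {3.0, 4.0, rect_area};`, calling `shape_area(&r)` runs `rect_area`; storing `triangle_area` in the same field changes the behavior without changing the caller.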
No function overloading. Each function name can have one and only one definition and signature. If you have a `foo(int)` you can’t also have a `foo(double)` or `foo(int, int)`. To help with this, C functions often have suffixes on their names that indicate the types involved, like `strtol` and `strtoll`, which differ in returning a `long` or a `long long`.
As a seeming exception to this rule, C makes more use of variadic functions than C++. These have some number of fixed arguments followed by any number of additional arguments of any type. The fixed argument values are used to dynamically determine the number and type of the remaining arguments. `printf` is by far the best-known example of this type of function.
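To make the idea of variadic functions concrete, here is a small sketch built on the standard `<stdarg.h>` macros. The function `sum_ints` is a hypothetical helper, not a library function; its fixed argument `count` plays the role that the format string plays for `printf`.

```c
#include <stdarg.h>

// Hypothetical helper: adds up `count` int arguments.
// The fixed argument tells the function how many extra arguments to read.
int sum_ints(int count, ...) {
    va_list args;
    va_start(args, count);          // start reading after the last fixed argument
    int total = 0;
    for (int i = 0; i < count; i += 1) {
        total += va_arg(args, int); // the caller must really have passed an int here
    }
    va_end(args);
    return total;
}
```

A call looks like `sum_ints(3, 10, 20, 30)`. The compiler cannot check that the count matches the arguments actually supplied, which is one reason variadic functions are easy to misuse.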
No `new` or `delete` operators. Instead, memory is allocated with the `malloc`, `calloc`, and `realloc` functions and deallocated with the `free` and `realloc` functions. These all handle untyped memory as `void *`s and need to be told the size of the data in bytes, meaning C code typically has many more `sizeof(type)` operators than would the corresponding C++ code.
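The following sketch shows one way these functions are typically combined. The helper names `make_range` and `resize_ints` are hypothetical; only `malloc`, `realloc`, and `free` come from `<stdlib.h>`.

```c
#include <stdlib.h>

// Hypothetical helper: returns a heap array holding 0, 1, ..., n-1,
// or NULL if allocation fails. The caller must free() the result.
int *make_range(size_t n) {
    int *a = malloc(n * sizeof(int)); // roughly C++'s `new int[n]`
    if (a == NULL) return NULL;
    for (size_t i = 0; i < n; i += 1) a[i] = (int)i;
    return a;
}

// Hypothetical helper: grows or shrinks an array made by make_range.
// On failure realloc returns NULL and leaves the original array untouched.
int *resize_ints(int *a, size_t new_n) {
    return realloc(a, new_n * sizeof(int));
}
```

After `int *b = resize_ints(a, 8);` succeeds, the old pointer `a` must no longer be used; only `b` is passed to `free`.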
No pass-by-reference. Equivalent semantics can be created by passing pointers and dereferencing them during use at the cost of slightly more verbose code.
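For instance, a swap function that C++ might declare as `void swap(int &a, int &b)` could be sketched in C as follows. The name `swap_ints` is hypothetical.

```c
// Hypothetical example: swap two ints through pointers.
void swap_ints(int *a, int *b) {
    int tmp = *a; // dereference to read the caller's variable
    *a = *b;      // dereference to write the caller's variable
    *b = tmp;
}
```

The caller has to take addresses explicitly: with `int x = 1, y = 2;` the call is `swap_ints(&x, &y);`.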
No operator overloading. `<<` only means left-shift, never output. `+` only means addition, never string concatenation. And so on.
Different common library code. Common C++ library functions and types like `string`, `cout`, `cin`, `vector`, and `map` do not exist in C. Strings are stored as `char *`s; I/O is handled with `FILE *` and functions like `fopen`, `printf`, and `fread`; and if you want a data structure you’ll generally have to program it yourself.
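As one small example of the difference, here is a sketch of what C++ might write as `std::string s = "Hello, " + name + "!";`. The function `greet` is hypothetical; `snprintf` is the standard `<stdio.h>` function that formats into a caller-supplied buffer.

```c
#include <stdio.h>

// Hypothetical helper: writes "Hello, <name>!" into out (capacity out_size bytes)
// and returns the length of the full greeting, or 0 on a formatting error.
size_t greet(char *out, size_t out_size, const char *name) {
    // snprintf writes at most out_size bytes, including the final 0 byte
    int n = snprintf(out, out_size, "Hello, %s!", name);
    return n < 0 ? 0 : (size_t)n;
}
```

The caller owns the memory: `char buf[32]; greet(buf, sizeof buf, "C");` leaves the greeting in `buf`. If the returned length is `out_size` or more, the text was truncated to fit.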
The following code snippets are intended to demonstrate some patterns common to C coding.

    #include <stdio.h>  // has printf, fprintf, stderr
    #include <stdlib.h> // has strtol

    int main(int argc, char *argv[]) {
        if (argc != 2) { // want 1 and only 1 argument
            // fprintf = formatted print to FILE*
            // FILE *stderr is defined in stdio.h
            // %s means "replace with a string argument after this"
            fprintf(stderr, "USAGE: %s num\n", argv[0]);
            return 1; // != 0 means failure to operate
        }
        // strtol = string to long
        // NULL means ignore what comes after the integer, if anything
        // 10 means base-10 (decimal) parsing
        long end = strtol(argv[1], NULL, 10);
        if (end < 1) {
            fprintf(stderr, "USAGE: %s num\n", argv[0]);
            fprintf(stderr, "  num must be a positive integer, not \"%s\"\n", argv[1]);
            return 1;
        }
        for (long i=1; i<=end; i+=1) {
            // printf = formatted print to stdout
            // %ld = replace with a long argument formatted as a base-10 (decimal) integer
            printf("%ld\n", i);
        }
        return 0; // == 0 means success
    }

Several peculiarities with `struct` definitions like the one in the linked-list code below deserve additional note:
- `struct { ... }` is the name of a type, just like `int` or `double` are. But we rarely use this form.
- `struct name { ... }` names the type `struct name` (not just `name`) and is the only way to have a type that contains a pointer to itself. It is common to add a `_t` or `_s` after such a name to make it stand out more.
- `typedef type name;` gives the existing type `type` the additional name `name`.
- `typedef struct name1 { ... } name2;` makes two names for the same type: `struct name1` and `name2`. `name2` is more convenient, but doesn’t exist until after the `;` so inside the structure definition we use `struct name1` instead.

The linked list is composed of several nodes, each pointing to the next. We reference the list using a pointer to the first node in the list. Because the functions can modify the list, we pass in a pointer to that pointer; that way the functions can modify `*list` to change the pointer, and thus which node is first in the list.

    #include <stdio.h>  // for printf
    #include <stdlib.h> // for malloc, free
    #include <string.h> // for strlen

    typedef struct list_node_t {
        int val;
        struct list_node_t *next;
    } list_node;

    /**
     * Given a list and a number, pushes the number onto the head of the list.
     *
     * @param list A pointer to the list; the list is a pointer to a node.
     *             Will be modified to point to the new head of the list.
     * @param n    A number to push onto the list.
     */
    void list_push(list_node **list, int n) {
        // allocate memory for a new node
        list_node *node = malloc(sizeof(list_node));
        // initialize its contents
        node->val = n;
        node->next = *list;
        // and make it the new head of the list
        *list = node;
    }

    /**
     * Given a list, pops off the head of the list.
     *
     * @param list A pointer to the list; the list is a pointer to a node.
     *             Will be modified to point to the new head of the list.
     * @return The value previously in the head of the list -- i.e.
     *         what (*list)->val was before this function was called.
     */
    int list_pop(list_node **list) {
        // copy the head of the list into a variable
        list_node *head = *list;
        // change the list to start after the head
        *list = (*list)->next;
        // copy the value
        int result = head->val;
        // and deallocate the detached head node's memory
        free(head);
        // then return the value
        return result;
    }

    // Demo: print the length of each command-line argument in reverse order
    int main(int argc, char *argv[]) {
        list_node *list = NULL; // make an empty list
        for (int i=0; i<argc; i+=1) {
            // push the length of each argument
            list_push(&list, strlen(argv[i]));
        }
        while (list) { // repeat while list != NULL
            // pop a value and print it
            // list_pop also updates what `list` points to
            printf("arg length: %d\n", list_pop(&list));
        }
        return 0;
    }

C has the following control constructs:
- `goto` and labels
- `if`/`else`
- `switch`/`case`/`default`/`break`
- `do`/`while`
- `while`
- `for`, written `for(initializer; condition; update)`
- `break` and `continue`
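Most of these behave exactly as in C++. The sketch below exercises three that tend to be less familiar: `do`/`while`, `switch` with fall-through `case` labels, and `goto` used to leave nested loops. All three function names are hypothetical.

```c
// do/while runs the body at least once, so 0 is reported as having 1 digit.
int count_digits(long n) {
    int digits = 0;
    do {
        digits += 1;
        n /= 10;
    } while (n != 0);
    return digits;
}

// Several case labels can share one body.
const char *kind_of(char c) {
    switch (c) {
        case 'a': case 'e': case 'i': case 'o': case 'u':
            return "vowel";
        case ' ':
            return "space";
        default:
            return "other";
    }
}

// break only leaves the innermost loop; goto can leave both loops at once.
int has_factor_pair(int target, int limit) {
    int found = 0;
    for (int i = 2; i < limit; i += 1) {
        for (int j = 2; j < limit; j += 1) {
            if (i * j == target) { found = 1; goto done; }
        }
    }
done:
    return found;
}
```

Here `has_factor_pair(15, 10)` finds 3 times 5 and jumps straight to `done`, while `has_factor_pair(13, 10)` runs both loops to completion.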
C has the following datatypes:
- signed integers: `char` (8 bits), `short` (16 bits), `int` (32 bits), `long` (usually 64 bits), `long long` (64 bits). These are the sizes on typical 64-bit systems; the C standard only guarantees minimum sizes.
  - Prefix any of these with `unsigned` to have an unsigned integer type.
  - With `#include <stdint.h>` there are more reasonable names: `intN_t` for signed integers with N bits and `uintN_t` for unsigned integers with N bits.
  - With `#include <stddef.h>` (and many other headers too) there are two size-of-a-pointer integer types, `size_t` (unsigned) and `ptrdiff_t` (signed). On most 2020s-era computers these are 64 bits, but on older or embedded systems they may be 32 bits instead.
- floating-point numbers: `float` (32 bits), `double` (64 bits), and `long double` (64 or more bits).
- pointers, usually with the type pointed to indicated (e.g. `int *`, `double *`, etc.), but the special pointer type `void *` does not indicate what type it is pointing to and cannot be dereferenced without first converting it to a different type.
- function pointers, which are technically a type of pointer but have a very different (and, in my opinion, very confusing) syntax: `returntype (*variablename)(arg1type, arg2type)`.

  The following code uses function pointers to apply several operations to an array of numbers.

      #include <stdio.h> // for printf

      int add(int a, int b) { return a+b; }
      int sub(int a, int b) { return a-b; }
      int mul(int a, int b) { return a*b; }
      int (*ops[])(int,int) = {add, sub, mul};
      int nums[4] = {124, 128, 225, 340};
      int main(int argc, char *argv[]) {
          for(int i=1; i<4; i+=1) {
              nums[i] = ops[i-1](nums[i-1], nums[i]);
          }
          // nums is {124, 124+128, (124+128)-225, ((124+128)-225)*340}
          printf("%d %d %d %d\n", nums[0], nums[1], nums[2], nums[3]);
      }

- arrays, which are several values of the same type contiguous in memory.
- structs, which are several values of various types contiguous in memory.
- unions, which are several values of various types all overlapping in the same memory.
The `typedef` keyword gives a new name to an existing data type and is used extensively in C code to create everything from renamed integer types like `size_t` and `uint8_t` to renamed structs with other renamed components inside.
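A short sketch may help tie `typedef`, structs, and unions together. The names `byte`, `color`, `word_bytes`, `color_pack`, and `union_fields_overlap` are all hypothetical; only `uint8_t` and `uint32_t` come from `<stdint.h>`.

```c
#include <stdint.h>

typedef uint8_t byte;   // a renamed integer type

typedef struct {        // struct: fields sit one after another in memory
    byte r, g, b;
} color;

typedef union {         // union: all fields start at the same address
    uint32_t word;
    byte bytes[4];
} word_bytes;

// Packs a color into a single integer laid out as 0x00RRGGBB.
uint32_t color_pack(color c) {
    return ((uint32_t)c.r << 16) | ((uint32_t)c.g << 8) | (uint32_t)c.b;
}

// Returns 1 if writing one union field changes what another field reads.
int union_fields_overlap(void) {
    word_bytes w;
    w.word = 0;
    w.bytes[0] = 0xFF;  // also changes one byte of w.word
    return w.word != 0;
}
```

Which byte of `word` is changed by writing `bytes[0]` depends on the machine's byte order, so the sketch only checks that some byte changed.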
Notably absent in the above list:
- Boolean values can be any scalar type; zero means false, anything else means true. Boolean operators like `!` create `int` results with 0 for false and 1 for true.
- Classes, namespaces, iterator loops, and template types do not exist in C.
- Reference types (which C++ uses most often for function parameters) do not exist in C.
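The Boolean rules above can be sketched as follows. The function names are hypothetical. As an aside not covered in the text above, C99 and later also offer a `bool` type through `<stdbool.h>`; it appears here only for comparison.

```c
#include <stdbool.h>

// Any scalar value works as a condition: zero is false, anything else is true.
int is_truthy(double x) {
    return x ? 1 : 0;
}

// `!` always yields an int that is 0 or 1, so `!!x` turns any true value into 1.
int normalize(int x) {
    return !!x;
}

// With <stdbool.h>, bool, true, and false are available as in C++.
bool is_even(int n) {
    return n % 2 == 0;
}
```

For example, `normalize(42)` and `normalize(-7)` both give 1, while `normalize(0)` gives 0.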
Integers can be expressed in decimal, `340`; hexadecimal, `0x154`; octal, `0524`; or binary, `0b101010100` (binary literals are a common compiler extension that became standard only in C23).
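Integers can also be expressed as character constants in single quotes. A multi-byte character is typically treated as its UTF-8 bytes packed together; for example ☺ is U+263A and is encoded in UTF-8 as E2 98 BA so, on compilers such as GCC, `'☺'` means the same thing as `0xE298BA` (the value of such multi-character constants is implementation-defined).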
Floats can be expressed in decimal, `340.0`; exponential form with a base-10 exponent, `3.4e2`; or hexadecimal exponential form with a base-2 exponent, `0x1.54p+8`.
Arrays can be expressed by placing values inside braces, `int x[4] = {124, 128, 225, 340}`. This notation does not supply enough information to unambiguously identify the data type, so it is only permitted as part of the initialization of an appropriately-typed variable.
Structures can be expressed by placing values inside braces, `struct {int first,next,then,now;} x = {124, 128, 225, 340}`. This notation does not supply enough information to unambiguously identify the data type, so it is only permitted as part of the initialization of an appropriately-typed variable. The values can also be labeled with field names and given in any order, `struct {int first,next,then,now;} x = {.now=340, .first=124}`; fields that are not mentioned are set to zero.
Double-quoted string literals are pointers to arrays stored in read-only global memory (i.e. memory the operating system will not let us modify once the program begins), where the contents of those arrays are the UTF-8 encoding of the string contents (after resolving any \-escapes like \n or \u0145), with one extra byte set to 0 at the end of the array.
In the following code, `x`, `y`, and `z` are each pointers to arrays containing the same byte sequence, but `x` and `y` point to arrays in modifiable global memory while `z` points to an array in read-only global memory.

    char x_data[8] = {'S', 'y', 's', 't', 'e', 'm', 's', 0};
    char y_data[8] = {83, 121, 115, 116, 101, 109, 115, 0};
    char *x = x_data; // an array used as a pointer points to its first element
    char *y = y_data;
    char *z = "Systems";

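The literal forms described above can be checked with a few small functions. This is only a sketch: the names `quad`, `literal_forms_agree`, `quad_sum`, and `systems_size` are hypothetical, and `quad` is simply a name given to the anonymous struct type used in the text.

```c
#include <stddef.h>

// Hypothetical name for the four-int struct used in the text.
typedef struct { int first, next, then, now; } quad;

// Returns 1 if the different spellings of 340 all denote the same value.
int literal_forms_agree(void) {
    return 340 == 0x154        // hexadecimal
        && 340 == 0524         // octal (note the leading 0)
        && 340.0 == 3.4e2      // base-10 exponent
        && 340.0 == 0x1.54p+8; // base-2 exponent: 0x1.54 times 2 to the 8th
}

// Fields not named in a designated initializer are set to zero.
int quad_sum(void) {
    quad x = {.now = 340, .first = 124};
    return x.first + x.next + x.then + x.now;
}

// A string literal's array includes the extra 0 byte at the end.
size_t systems_size(void) {
    return sizeof("Systems");
}
```

The binary form is left out of the sketch because compilers older than C23 accept it only as an extension.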
C has the following operators:
- Arithmetic: `+`, `-`, `*`, `/`, `%`
- Bitwise: `>>`, `<<`, `|`, `&`, `^`, `~`
- Logical: `&&`, `||`, `!`
- Comparison: `<`, `<=`, `==`, `!=`, `>=`, `>`
- Address-of: `&`
  - If you use an array `a` where a pointer is expected, the compiler automatically converts it to a pointer to its first element, as if you had written `&a[0]`.
- Dereference: `*`, `[]`, `->`
  - `a[i]` and `*(a+i)` are synonyms, as are `(*a).` and `a->`.
- Member of: `.`, `->`
  - `(*a).` and `a->` are synonyms.
- Assignment: `=`
- Update: `++`, `--`, and `op=` for all binary arithmetic and bitwise operators `op`
  - Prefix `++x` can be marginally faster than postfix `x++` in some situations; likewise for `--x` and `x--`.
  - `x+=1` and `++x` are synonyms, as are `x-=1` and `--x`.
- Cast: `(type)`
  - C inserts implicit casts in many places.
- Selection: `?:`
  - `a?x:y` evaluates to `x` if `a` is true, otherwise it evaluates to `y`.
- Function call: `()`
- Type inspection: `sizeof`, `alignof` (the latter since C11, via `#include <stdalign.h>`)
- Sequence: `,`
  - `x,y` has the side effects of `x` (if any), then evaluates to `y`.
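A few of the less obvious operator rules above can be sketched in code. The type `point` and all of the function names are hypothetical, chosen only for illustration.

```c
#include <stddef.h>

typedef struct { int x, y; } point; // hypothetical type for the examples

// a[i] and *(a+i) are synonyms.
int element_at(const int *a, size_t i) {
    return *(a + i);
}

// (*p).x and p->x are synonyms.
int point_sum(const point *p) {
    return (*p).x + p->y;
}

// Selection: evaluates to a if a > b is true, otherwise to b.
int max_int(int a, int b) {
    return a > b ? a : b;
}

// Sequence: x += 5 runs first for its side effect, then x * 2 is the value.
int comma_demo(void) {
    int x = 0;
    int y = (x += 5, x * 2);
    return y;
}

// sizeof an array is its total size in bytes, so dividing by the size
// of one element gives the number of elements.
size_t array_length_demo(void) {
    int nums[4] = {124, 128, 225, 340};
    return sizeof(nums) / sizeof(nums[0]);
}
```

Note that the `sizeof` trick in `array_length_demo` only works on a real array; once an array has been converted to a pointer, `sizeof` reports the size of the pointer instead.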