Organic Nomenclature

This document contains a highly compressed, simplified version of the naming rules put out by the International Union of Pure and Applied Chemistry (IUPAC). That's right, I said simplified. The actual rules must cover perfectly accurately all 20 million or so compounds discovered to date, and all 10 million or so that we will discover over the next decade or so. They are mind-bogglingly complex (to steal a phrase from Douglas Adams). The rules here are intended to work well on simple compounds, and to give the introductory student a flavor for how the system works.

The naming of a compound follows several basic steps:

  • Identify the functional groups present and assign them priority. Identify the highest priority substituent.
  • Identify a parent portion of the molecule; name and number it.
  • Identify the substituents and locate them on the numbered parent.
  • Assemble the name in proper order, and with proper punctuation, etc.

Identify Functional Groups and Assign Priority

The first step is to identify the functional groups present in the molecule. The following table has a large number of functional groups ranked in priority order. You should know that the highest priority group in your compound (towards the top of the table) is treated differently from the rest of the functional groups in the molecule. Note that some functional groups appear to have two or more carbon-containing groups; be sure you pay attention to which side of the functional group is the "main" side and which is the secondary side. For example, the "main" part of the ester group as shown below contains the R group; the secondary side contains the R' group. The main part of the functional group will be included in the parent.

| Compound or Functional Group | Prefix | Primary Suffix | Secondary Suffix | IUPAC Name | Common Name | Structure |
|---|---|---|---|---|---|---|
| Carboxylic acid | | | -oic acid* | propanoic acid | propionic acid | |
| Salt of acid | | | -oate* | sodium ethanoate | sodium acetate | |
| Ester | Alkyl (R') | | -oate* | methyl methanoate | methyl formate | |
| Amide | | | -amide* | butanamide | butyramide | |
| Nitrile | | | -onitrile* | ethanonitrile | acetonitrile | |
| Aldehyde | Oxo | | -al* | 2-methylpropanal | isobutyraldehyde | |
| Ketone | Oxo | | -one | 3-pentanone | diethyl ketone | |
| Alcohol (-OH) | Hydroxy | | -ol | 2-butanol | sec-butyl alcohol | |
| Amine (R-NH2) | Amino | | -amine | 2-methylpropanamine | isobutylamine | |
| Ether (R-O-R) | Alkoxy | | | ethoxyethane | diethyl ether | |
| Arene (contains cyclic arrays of C=C) | | benzene | | 1,2-dimethylbenzene | ortho-dimethylbenzene | |
| Alkene (has C=C) | | -en | e | 2-methylpropene | isobutylene | |
| Alkyne (has C≡C) | | -yn | e | 4,4-dimethyl-2-pentyne | tert-butyl-methylacetylene | |
| Alkane (saturated hydrocarbon) | | -an | e | 2-methylpropane | isobutane | |
| Fragments | | | | | | |
| Alkyl | -yl | | | methyl; ethenyl | methyl; vinyl | |
| Acyl | | | -oyl | ethanoyl chloride | acetyl chloride | |
| Halo (F, Cl, Br, I) | Halo | | | bromomethane | methyl bromide | Br-CH3 |
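The ranking in the table can be treated as an ordered list: whichever group in your molecule appears first in the list is the highest priority group. Here is a minimal Python sketch of that lookup. The names `PRIORITY` and `highest_priority` are made up for illustration, and the sketch assumes the groups are spelled as in the first column of the table; the "Fragments" rows are left out because they only ever appear as prefixes.

```python
# Functional groups from the table, highest priority first.
PRIORITY = [
    "carboxylic acid", "salt of acid", "ester", "amide", "nitrile",
    "aldehyde", "ketone", "alcohol", "amine", "ether",
    "arene", "alkene", "alkyne", "alkane",
]


def highest_priority(groups):
    """Return the group that sits nearest the top of the table."""
    return min(groups, key=PRIORITY.index)


print(highest_priority(["alcohol", "ketone", "alkene"]))  # ketone
```

In this example the ketone would supply the secondary suffix, and the alcohol would be demoted to a "hydroxy" prefix.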

Parent

First, find the largest coherent component of the molecule that contains the highest priority functional group. Try to use common sense here. Because this is organic chemistry, you should restrict your search to the carbon skeleton only, and should not include any atoms other than carbon, except as included in the high priority functional group. You will need to determine whether it is acyclic, cyclic and non-aromatic, or cyclic and aromatic. Acyclic compounds have no rings, but they can have branches and lots of functional groups. Cyclic compounds have one or more rings, and may or may not be aromatic.

There are a few functional groups which may cause some confusion: esters, amides, ethers and amines. Each of these has a "heteroatom" (something other than the sacred carbon) which breaks the chain or ring. Remember that in all cases that you will see, the parent is named as an unbroken carbon chain or ring (in fact, some rings with oxygen or nitrogen in them are named as though the heteroatom were a carbon, and a "replacement" is made). Therefore, the parent of an ester, for example, is the carbon portion attached to (and including) the carbonyl. The "R" group attached to the oxygen is not part of the parent.

Parent Size

An important part of the name of a compound is a piece that indicates the number of carbons in the parent. This is present in a form of a code, with the key to the code here. In this table, you see the names for the alkane chains and also for the alkane chains as they would be if they were attached to something else as a substituent ("Alkyl").

Note that the naming shifts to a Greek-based numbering system after 4 carbons, and should be familiar to you.

The last column provides the number prefixes used when you describe many copies of simple substituents (like methyl groups). They don't strictly match the left column: rather than "#C's", you should read "# Groups" to match the number prefix.

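| #C's | Alkane | Alkyl Fragment | # Prefix |
|---|---|---|---|
| 1 | Methane | Methyl | [none] |
| 2 | Ethane | Ethyl | di |
| 3 | Propane | Propyl | tri |
| 4 | Butane | Butyl | tetra |
| 5 | Pentane | Pentyl | penta |
| 6 | Hexane | Hexyl | hexa |
| 7 | Heptane | Heptyl | hepta |
| 8 | Octane | Octyl | octa |
| 9 | Nonane | Nonyl | nona |
| 10 | Decane | Decyl | deca |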
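If it helps to see the "code" spelled out, the table boils down to two lookups: one from carbon count to a stem, and one from group count to a number prefix. The sketch below is only an illustration; the names `STEMS`, `MULTIPLIERS`, `alkane`, `alkyl` and `multiplied` are invented here.

```python
# Carbon count -> stem (left three columns of the table).
STEMS = {1: "meth", 2: "eth", 3: "prop", 4: "but", 5: "pent",
         6: "hex", 7: "hept", 8: "oct", 9: "non", 10: "dec"}

# Number of identical groups -> number prefix (last column of the table).
MULTIPLIERS = {1: "", 2: "di", 3: "tri", 4: "tetra", 5: "penta",
               6: "hexa", 7: "hepta", 8: "octa", 9: "nona", 10: "deca"}


def alkane(carbons):
    """Name of the unbranched alkane with this many carbons."""
    return STEMS[carbons] + "ane"


def alkyl(carbons):
    """Name of the same chain when it is attached as a substituent."""
    return STEMS[carbons] + "yl"


def multiplied(count, name):
    """Prefix a substituent name with the number prefix for `count` copies."""
    return MULTIPLIERS[count] + name


print(alkane(7), alkyl(2), multiplied(2, "methyl"))  # heptane ethyl dimethyl
```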

Acyclic Compounds

Locating the Parent

The parent of an acyclic compound is found by locating the longest continuous chain that includes the highest priority functional group. If there is no such group, then the longest chain is used. If two or more chains tie for longest, the chain with the largest number of substituents will be the parent. The idea is to make the naming as easy as possible by including as many carbons as possible in the parent.
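Finding the longest chain is a search over the carbon skeleton. The sketch below shows one simple way to do it for a small acyclic molecule, written as a dictionary that maps each carbon to its neighbors. It is a brute-force illustration only: the function name `longest_chain` and the carbon labels are invented, and it ignores both the functional-group requirement and the tie-breaking rule described above.

```python
def longest_chain(bonds):
    """Return one longest path of carbons in an acyclic skeleton.

    `bonds` maps each carbon label to the list of carbons bonded to it.
    """
    best = []

    def walk(path):
        nonlocal best
        if len(path) > len(best):
            best = list(path)
        for neighbor in bonds[path[-1]]:
            if neighbor not in path:
                path.append(neighbor)
                walk(path)
                path.pop()

    # Try every carbon as a starting point and keep the longest walk found.
    for start in bonds:
        walk([start])
    return best


# A six-carbon chain (a-f) with a one-carbon branch (g) on carbon c.
skeleton = {
    "a": ["b"], "b": ["a", "c"], "c": ["b", "d", "g"],
    "d": ["c", "e"], "e": ["d", "f"], "f": ["e"], "g": ["c"],
}
print(longest_chain(skeleton))  # ['a', 'b', 'c', 'd', 'e', 'f']
```

The branch carbon `g` is left out of the parent, so it will be named as a methyl substituent.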

Numbering the Parent

Once you find the parent chain, it is numbered from the end closest to the highest priority functional group, or in the absence of such a group, from the end closest to the nearest branch point. The numbering always goes from low (1) to high. The location of substituents or multiple bonds is derived from this numbering. In that way, these numbers are just like street addresses, which tell you where on the street a house might be. If you look around, you will probably find that your street has some numbers for empty lots, some numbers for houses with one resident, and some for big families. Numbering the chain goes the same way: each carbon is numbered whether or not it has a substituent.
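The choice of numbering direction can be written as a comparison between the two possible sets of addresses. The following sketch assumes you have already counted positions from the left end of the chain; `number_chain` and its parameters are hypothetical names. Comparing the sorted lists element by element picks the direction whose first branch point comes earliest.

```python
def number_chain(length, branches, group=None):
    """Choose the numbering direction for an acyclic parent chain.

    `branches` holds branch positions counted from the left end (1-based);
    `group` is the position of the highest priority group, if there is one.
    Returns (group_locant, branch_locants) in the chosen direction.
    """
    def flip(position):
        # Position of the same carbon when counted from the other end.
        return length + 1 - position

    forward = sorted(branches)
    backward = sorted(flip(p) for p in branches)

    if group is not None and group != flip(group):
        # The highest priority group decides the direction.
        reverse = flip(group) < group
    else:
        # No group (or it sits dead center): the nearest branch decides.
        reverse = backward < forward

    if reverse:
        return (flip(group) if group is not None else None), backward
    return group, forward


print(number_chain(7, [4, 6]))       # (None, [2, 4])
print(number_chain(5, [], group=4))  # (2, [])
```

The first call corresponds to a heptane with methyl groups drawn on the fourth and sixth carbons from the left, which is numbered from the right to give locants 2 and 4. The second is a five-carbon alcohol whose OH ends up on carbon 2.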

Naming the Parent

The name is assembled from several pieces: the main portion that gives the length of the parent chain, the primary suffix that gives the identity of any functional groups that are within the parent chain (alkenes or alkynes, for example) and the secondary suffix that gives the identity of the highest priority functional group. Primary and secondary suffixes generally have locator numbers (called "locants" in the jargon) to tell where in the chain they are.
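Putting the pieces together is mostly string concatenation: stem, then primary suffix, then secondary suffix, with a locant in front where one is needed. The sketch below follows the style of the examples in the priority table (locant in front, as in 3-pentanone). The names `STEMS` and `parent_name` are invented for illustration, and only a single locant is handled.

```python
STEMS = {1: "meth", 2: "eth", 3: "prop", 4: "but", 5: "pent",
         6: "hex", 7: "hept", 8: "oct", 9: "non", 10: "dec"}


def parent_name(carbons, primary="an", secondary="e", locant=None):
    """Join stem + primary suffix + secondary suffix, with an optional locant."""
    name = STEMS[carbons] + primary + secondary
    if locant is not None:
        return f"{locant}-{name}"
    return name


print(parent_name(5, secondary="one", locant=3))  # 3-pentanone
print(parent_name(4, secondary="ol", locant=2))   # 2-butanol
print(parent_name(3, secondary="oic acid"))       # propanoic acid
print(parent_name(3, primary="en"))               # propene
```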

Substituents are named as main chains are, except that the name ends in "yl." Each and every substituent is given a number, based on where it is attached to the parent. Every one of these numbers should appear in the name.

When there are multiple copies of one substituent type (a methyl, for example), all examples are aggregated into one word, using the number prefixes shown above. For example, two methyls would be named "dimethyl", and four ketone substituents would be "tetraoxo". There should be a locant for each group; hence there would be two locants in front of the dimethyl example, and four numbers in front of the tetraoxo example: 3,4,6,12-tetraoxo....

Check out the example, and see if you can figure out how it is named.

Dimethylheptane

Cyclic Non-Aromatic Compounds

Identifying the parent of all but the simplest rings is a complicated task. Let's look at the rules for a simple ring, and leave the rest to the experts. For all of the nomenclature that you do, the parent will be a single ring. This leaves out quite a bit of nomenclature, but it seems complicated enough already, eh?

Locating the Parent

The parent will be the largest ring in the compound, if the complexity of that ring is comparable to or greater than the substituent chains. What the heck does that mean? Fact is, it's pretty complicated and tedious to fully define it, so I will cheat: if there is a ring larger than a three-membered ring in the compound it will be the parent. If there are two or more rings, the largest or most complex (most substituents, for example) will be the parent.

Naming the Parent

The size of the ring and the in-chain functional groups are named just as for acyclic compounds, with only one difference: the addition of the fragment "cyclo". Hence, a six-membered-ring alkane is called cyclohexane, while an eight-membered-ring alkene would be called cyclooctene.
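In code form, the only change from the acyclic case is the extra fragment at the front. This is a small sketch with made-up names (`STEMS`, `ring_name`), covering simple single rings only.

```python
STEMS = {3: "prop", 4: "but", 5: "pent", 6: "hex", 7: "hept", 8: "oct"}


def ring_name(ring_size, primary="an"):
    """Name a simple ring: 'cyclo' + stem + primary suffix + 'e'."""
    return "cyclo" + STEMS[ring_size] + primary + "e"


print(ring_name(6))        # cyclohexane
print(ring_name(8, "en"))  # cyclooctene
```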

Ring Exception

What do you do with high-priority groups which are attached to a ring, but not part of it? If they are acids, esters, amides, nitriles or aldehydes, there is a "Ring Exception" for naming. Name the ring, then end it with one of the following:

| Functional Group | Ending |
|---|---|
| Carboxylic Acid | -carboxylic acid |
| Ester | -carboxylate |
| Amide | -carboxamide |
| Nitrile | -carbonitrile |
| Aldehyde | -carboxaldehyde |

Numbering the Parent

Ah--here's the fun part. Rings have no beginning or end, leading to their use in symbolizing eternal marriage and other weighty topics. Therefore, we can't use the same strategy that we did with acyclic compounds.

Choosing the starting point is easier. If there are substituents on the ring, simply start at the highest priority substituent. From there, you can go in one of two directions. The rule for this is cumbersome, but reliable: number the substituents so that the sum of the locator numbers is the lowest. For example, the compound shown here is called 4-bromo-2-hydroxycyclopentanone, not 3-bromo-5-hydroxycyclopentanone.
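The lowest-sum rule is easy to check mechanically: number the ring both ways from carbon 1 and keep whichever direction gives the smaller total. The sketch below applies this to the cyclopentanone example, with the ketone carbon as carbon 1. The function name `number_ring` and the input format are assumptions made for illustration.

```python
def number_ring(ring_size, substituents):
    """Pick the ring numbering direction with the lowest sum of locants.

    `substituents` is a list of (name, position) pairs, with positions
    counted in one direction from carbon 1 (the highest priority group).
    """
    def flip(position):
        # Carbon 1 stays put; every other carbon is counted the other way.
        return 1 if position == 1 else ring_size + 2 - position

    one_way = list(substituents)
    other_way = [(name, flip(position)) for name, position in substituents]
    return min(one_way, other_way,
               key=lambda subs: sum(position for _, position in subs))


print(number_ring(5, [("bromo", 3), ("hydroxy", 5)]))
# [('bromo', 4), ('hydroxy', 2)]
```

Counting one way gives 3 + 5 = 8; counting the other way gives 4 + 2 = 6, so the second numbering wins.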

Aromatic Compounds

Locating the Parent

If you can't figure this out, you may wish to check out the definition of aromaticity, and look at some examples. Aromatic compounds are always rings, and we will limit ourselves to the 6-membered ring, benzene. There are lots of other aromatic rings, but the naming becomes complex for uncommon compounds, so we will skip that discussion. As discussed below, we will also restrict our discussion to single rings.

Naming the Parent

Welcome to the wonderful world of multiple trivial names. There is still some dispute about how many of these to use in the literature. However, this reflects the importance and prevalence of some aromatic compounds. Also--and here's the good news--I will accept either name.

Common compounds, with "official IUPAC" and "unofficial IUPAC" names:

| Unofficial IUPAC | Official IUPAC |
|---|---|
| Benzene | Benzene |
| Toluene | Methylbenzene |
| Phenol | Benzenol |
| Aniline | Benzenamine |
| Benzoic Acid | Benzenecarboxylic Acid |

Numbering the Parent

Numbering the parent works just as it does in other cyclic compounds (because we are limiting our discussion to single-ring compounds). Believe it or not, numbering of multi-ring compounds gets incredibly complex. We will avoid that at this level. 

Identify Substituents and Locate Them

Look at the carbon chain of the parent (examining only the carbons). The substituents are anything that is not actually part of that group. For most compounds, that includes a lot of hydrogens, bonded directly to the parent chain carbons. Since these are the simplest possible substituents, they are never named in the compound--they are assumed to be there unless you specifically say that it's something else. This is a common point of confusion: many students assume that if you can see it, you must explicitly include it in the name. However, hydrogen is treated differently from all other substituents because to explicitly include it would be unworkable. Notice how the example above only has two substituents named, and none of the hydrogens are mentioned directly.

Naming Substituents

Substituents are named almost exactly as main chains are, with three exceptions: first, the name ends in "yl"; second, secondary suffixes are never used; third, the numbering (if necessary) starts at the point of attachment to the parent. If numbering or complexity might cause confusion, the whole substituent name can be enclosed in parentheses. The point of choosing a big parent is to avoid this complexity.

Locating Substituents on the Parent

Each and every substituent is given a number, based on where it is attached to the parent. Every one of these numbers should appear in the name.

When there are multiple copies of one substituent type (a methyl, for example), all examples are aggregated into one word, using the number prefixes shown above. For example, two methyls would be named "dimethyl", and four ketone substituents would be "tetraoxo". There should be a locant for each group; hence there would be two locants in front of the dimethyl example, and four numbers in front of the tetraoxo example: 3,4,6,12-tetraoxo....  
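Grouping repeated substituents is a matter of collecting the locants for each substituent name and then adding the right number prefix. Here is a short sketch; `substituent_prefix` and `MULTIPLIERS` are illustrative names. The sketch lists the substituents alphabetically by their base name (ignoring "di", "tri", and so on), which is an assumption about how to alphabetize.

```python
from collections import defaultdict

MULTIPLIERS = {1: "", 2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def substituent_prefix(substituents):
    """Build the substituent part of a name from (locant, name) pairs."""
    locants_by_name = defaultdict(list)
    for locant, name in substituents:
        locants_by_name[name].append(locant)

    parts = []
    for name in sorted(locants_by_name):  # alphabetical by base name
        locants = sorted(locants_by_name[name])
        numbers = ",".join(str(n) for n in locants)
        parts.append(f"{numbers}-{MULTIPLIERS[len(locants)]}{name}")
    return "-".join(parts)


print(substituent_prefix([(2, "methyl"), (4, "methyl")]))
# 2,4-dimethyl
print(substituent_prefix([(3, "oxo"), (4, "oxo"), (6, "oxo"), (12, "oxo")]))
# 3,4,6,12-tetraoxo
print(substituent_prefix([(2, "hydroxy"), (4, "bromo")]))
# 4-bromo-2-hydroxy
```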

Special Substituents

The substituents on esters and amides are special. Esters always, and amides sometimes, contain "R" groups attached to the oxygen or nitrogen. These are named separately, and the word is separated from the rest of the name by a space. For example, one names the methyl ester of isobutyric acid as methyl 2-methylpropanoate, not O,2-dimethyl propanoate, or whatever.

Stereochemistry

Stereochemistry is properly defined by either R or S for stereogenic centers, or by E or Z for double bonds. Each stereogenic center can be located by number, as can each double bond. A stereogenic carbon with the R absolute configuration at position 6 in a chain would be designated 6R. The description is written in italics, and surrounded by parentheses, to reduce confusion.
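As a small illustration, the stereochemistry designation can be built from a list of locant and descriptor pairs. The function name `stereo_prefix` is invented. Two assumptions are baked in: plain text cannot show the italics, and the hyphen joining the parenthesized block to the rest of the name follows common practice rather than anything stated above.

```python
def stereo_prefix(descriptors):
    """Format (locant, descriptor) pairs such as (6, 'R') or (2, 'E')."""
    if not descriptors:
        return ""
    inner = ",".join(f"{locant}{symbol}" for locant, symbol in sorted(descriptors))
    # Assumption: a hyphen connects the parentheses to the rest of the name.
    return f"({inner})-"


print(stereo_prefix([(6, "R")]))            # (6R)-
print(stereo_prefix([(6, "R"), (2, "E")]))  # (2E,6R)-
```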

Assemble Name

It's not a bad idea to think about assembling a name from back to front. If you speak German, this will be a natural assembly for you, and in fact the system probably reflects the historical importance of German chemists in organic chemistry.

The last thing you see in a name is the highest priority functional group identifier, also known as the secondary suffix. Just before that is the parent name. Before that is a list of substituents, in alphabetical order, each joined to its locator numbers by a hyphen. Before that is the "special substituent". And before that, the leader of the pack, is the stereochemistry designation for all stereogenic centers and double bonds, all within one big pair of parentheses.
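The ordering just described can be sketched as a single function that glues the pieces together from front to back. The name `assemble` and its parameters are hypothetical. The sketch assumes that `parent` already carries the stem and primary suffix (for example "propan"), and that each of the other pieces has been built beforehand as a string.

```python
def assemble(parent, secondary_suffix, substituents="", special="", stereo=""):
    """Join name pieces: stereo, special substituent, substituents, parent, suffix."""
    core = substituents + parent + secondary_suffix
    if special:
        # The special substituent is its own word, separated by a space.
        core = f"{special} {core}"
    return stereo + core


print(assemble("propan", "oate", substituents="2-methyl", special="methyl"))
# methyl 2-methylpropanoate
print(assemble("cyclopentan", "one", substituents="4-bromo-2-hydroxy"))
# 4-bromo-2-hydroxycyclopentanone
```

Both outputs match names used earlier in this document: the methyl ester of isobutyric acid and the substituted cyclopentanone from the ring numbering example.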