Surely I jest, as messing with FIPS codes is rarely fun at all. However, it’s one of those things that you constantly have to deal with when mapping US demographic data, and it’s one of those things that GIS books and tutorials rarely discuss. And since I was tangling with them yesterday, I thought I’d share some tricks you can use in Excel to manipulate and format FIPS codes.
FIPS codes were created by the federal government to uniquely identify all geographic units in the US and are widely used. In an ideal world, here are some examples of FIPS codes for four counties:
State_Name, County_Name, State_FIPS, County_FIPS, FIPS
California, Alpine, 06, 003, 06003
California, San Mateo, 06, 081, 06081
Delaware, New Castle, 10, 003, 10003
Pennsylvania, Philadelphia, 42, 101, 42101
Each state code is unique, and each county code is unique WITHIN each state. If you are working with data for every county, then you will need to use the concatenated five-digit FIPS in order to have unique ID numbers for each county. Otherwise, in this example, you wouldn’t be able to distinguish Alpine County, CA from New Castle County, DE because they both have the same county FIPS: 003.
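To make the uniqueness point concrete, here is a small Python sketch using the four example counties above. It is only an illustration and not part of the Excel workflow; the variable names (counties, county_only_ids, full_ids) are made up for this example.

```python
# Rows copied from the example table: (state, county, state FIPS, county FIPS).
counties = [
    ("California", "Alpine", "06", "003"),
    ("California", "San Mateo", "06", "081"),
    ("Delaware", "New Castle", "10", "003"),
    ("Pennsylvania", "Philadelphia", "42", "101"),
]

# County codes alone collide: Alpine and New Castle are both "003".
county_only_ids = {county_fips for _, _, _, county_fips in counties}

# Concatenating the state and county codes gives one unique ID per county.
full_ids = {state_fips + county_fips for _, _, state_fips, county_fips in counties}

print(len(counties), len(county_only_ids), len(full_ids))
```

Four counties yield only three distinct county codes, but four distinct five-digit codes.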
In your database or spreadsheet, the codes need to be saved as text strings, NOT as numbers. In the spreadsheet I opened the other day, they were mistakenly saved as numbers. Excel also has an annoying habit of making corrections which aren’t really correct at all. So instead of having the above, I had this:
State_Name, County_Name, State_FIPS, County_FIPS, FIPS
California, Alpine, 6, 3, 6003
California, San Mateo, 6, 81, 6081
Delaware, New Castle, 10, 3, 10003
Pennsylvania, Philadelphia, 42, 101, 42101
Since the codes were saved as numbers, all leading zeros were dropped. This is a problem, as I want to relate this data to data stored in other tables and shapefiles, where the FIPS codes are stored correctly. So, I needed to convert the data in this example back to strings (as join fields must have the same data type – you can’t join strings to numbers), and I had to get those zeros back. Somehow. Here’s how:
Open the data up in Excel, and convert the State_FIPS field to strings by selecting the column, selecting Format, and changing the format to text.
Then insert a column to the right of State_FIPS, and type in a formula:
=IF(Len(c2)=1,concatenate(0,c2),c2)
In English, this says: If the length of the value in c2 is one, then put a zero in front of the value in the new cell. Otherwise, just print the value of c2 in the cell. So in our example, the code for California would get converted from 6 to 06, while Delaware and Pennsylvania’s codes will just get reprinted, as they have the correct number of characters (two).
Then, it’s just a matter of copying and pasting this formula all the way down the rows to create the correct two-digit FIPS for each record. Lastly, select the whole column, copy it, go up to the Edit menu, and select Paste Special for values only. This will overwrite all of the formulas in these cells and replace them with the actual output of the formula. You can follow these steps with the other two fields; the only things that need to change are the formulas.
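For anyone who would rather script this step than do it in Excel, the state formula can be sketched in Python roughly as follows. This is a minimal sketch, not part of the original workflow; the function name pad_state_fips is hypothetical, and it assumes the incoming values are the damaged numeric codes from the example.

```python
def pad_state_fips(value):
    """Mirror the Excel formula: if the code is one character long, prefix a zero."""
    text = str(value)  # treat the code as text, never as a number
    if len(text) == 1:
        return "0" + text
    return text  # already two characters, so return it unchanged


for code in (6, 10, 42):
    print(pad_state_fips(code))
```

Applying the function to every row plays the same role as copying the formula down the column.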
Fixing the COUNTY_FIPS field is trickier, as we have three possibilities here: we’ll need to add two zeros if the code is one character long, one zero if the code is two characters long, and nothing if the code is three characters long. Here’s the formula:
=IF(Len(e2)=3,e2,IF(Len(e2)=2,concatenate(0,e2),concatenate("00",e2)))
The parentheses get confusing, as the arguments within each IF statement and each Len and concatenate function need their own parentheses. In English: if the value in e2 is three characters long, just print that value in the new cell. Otherwise, if the length is two characters, tack a zero to the front of that value and then print it. Otherwise, tack two zeros to the front of it and print. In that last piece, we’re making the implicit assumption that if it isn’t three characters long, and if it isn’t two, it must be one. The two zeros must go in quotation marks so Excel reads them as a string. Without the quotation marks, Excel assumes you mean a number and will “correct” you by dropping a zero, which is exactly what you DON’T WANT!
So in our example above, Alpine and New Castle get two zeros added to the front, San Mateo gets one, and Philadelphia gets none.
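The same three-way logic, sketched in Python for readers working outside Excel. Again this is only an illustration under the same assumptions as the example data; pad_county_fips is a made-up name.

```python
def pad_county_fips(value):
    """Mirror the nested IF: three characters pass through, two get one zero, one gets two."""
    text = str(value)
    if len(text) == 3:
        return text
    if len(text) == 2:
        return "0" + text
    # Same implicit assumption as the formula: anything else is one character long.
    return "00" + text


for name, code in [("Alpine", 3), ("San Mateo", 81),
                   ("New Castle", 3), ("Philadelphia", 101)]:
    print(name, pad_county_fips(code))
```

Python’s built-in str.zfill(3) does the same padding in a single call, which avoids the nested conditions altogether.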
Fixing the last FIPS field is easy – it’s just a repeat of our first formula, as there are only two options: Add a zero to the front if it’s four characters long, otherwise just print the value (which is five characters long).
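A comparable sketch for the combined field, using the damaged values from the example table. The name pad_full_fips is hypothetical, and the sketch assumes every value is either four or five characters long, as in the example.

```python
def pad_full_fips(value):
    """Add a leading zero to four-character codes; leave five-character codes alone."""
    text = str(value)
    if len(text) == 4:
        return "0" + text
    return text


damaged = [6003, 6081, 10003, 42101]
repaired = [pad_full_fips(v) for v in damaged]
print(repaired)
```

The repaired values are strings again, so they can be matched against the correctly stored FIPS codes in other tables.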
If you’re using a spreadsheet other than Excel, like OpenOffice’s Calc, you can use similar formulas with a few syntax changes. Calc uses semicolons instead of commas to separate arguments, so concatenate(0,e2) becomes concatenate(0;e2). You could also perform these operations in Microsoft Access using a Make Table query, with some adjustments. You would reference fields instead of cell values, and instead of the word concatenate, Access uses the syntax &""& in between fields that you want to join.
When I worked at the University of Washington Library, I created a tutorial for concatenating fields in Access, to supplement a tutorial my predecessor created for Excel. Both docs are available on WAGDA’s FAQ page.
Real fun, huh? ; )