
Pilot Voter Registration Cleanup Program Preliminary Report

April 1, 2000

Published by
The Voting Integrity Project
P.O. Box 6470
Arlington, VA 22206-0470
(888) 578-4343


The Voting Integrity Project ("VIP") is a national, non-partisan, 501(c)(3) voting rights organization that has focused public attention on election integrity issues. VIP litigates voter rights issues and investigates failed public elections, but its primary mission is to educate and equip Americans to protect election integrity in their own communities. VIP trains citizen poll watchers for this role. A key component of citizen poll watching is access to clean voter registration records. VIP provides its citizen poll watchers with specially created lists of voter registration records that have been matched against other available public data. That work led to the desire to research how to more thoroughly identify fraudulent or no-longer-qualified registrants on America's voter rolls.

This is a preliminary report on a pilot project of VIP to assist small communities in cleaning up their registration records. More detailed analysis and recommendations will be contained in a final report anticipated in Summer 2000.

This document serves two purposes. It gives a preliminary report on the results of Phase One of the voter registration cleanup project, and it is meant to serve as a manual for citizens and officials who would like to better understand how such registration cleanup can be conducted.

This report details various approaches to identifying voters whose registration may be invalid or whose actual votes may be invalid. The steps undertaken are replicable in other jurisdictions subject to availability of data and technical capability. All of the data utilized is available to the public and at no time was data used that violated the privacy of any individual.

This approach is not meant to replace the more in-depth services available commercially from DBT Online, for example, which offers data matching capabilities to a much finer level of detail and certainty. Since DBT Online's services may be cost prohibitive for smaller communities, the approaches and methods outlined in this report may offer a practical alternative.

Registration cleanup must be undertaken as an on-going project. Churn in voter registration records today is extremely high. As the report details, obtaining data in the short period between the close of new registration and the election itself (usually 30 days) is a challenge. Most jurisdictions see a surprising number of new registrations immediately prior to elections, so a registration cleanup conducted too far in advance of an election would not address these new, and potentially fraudulent, registrations.

However, on-going cleanup programs, conducted in cooperation with election and party officials, and with maximum involvement of citizen poll watchers, can be extremely effective in keeping voter rolls clean.

VIP would like to thank DBT Online for its support of The Voting Integrity Project's endeavors to assist smaller communities in cleaning up voter registration records and increase public confidence in election outcomes.


Implementation of the National Voter Registration Act beginning in 1995 created new challenges for voting jurisdictions across the country. The Act, commonly referred to as "Motor Voter," made voter registration so easy that many who are not qualified to vote have been able to register and vote without challenge. At the same time, automatic purges of inactive voters were done away with and a more cumbersome system for removal of "deadwood" voters was substituted. The combination of these two factors has meant that a significant number of unqualified or no-longer-qualified voters is building up on registration records across America.

The Voting Integrity Project ("VIP") began receiving numerous requests for assistance in cleaning up voter registration records in its first year of operation (1996). As the magnitude of the problem has created a ready pool of registrations that can be exploited for election fraud, citizens and election officials have focused on strategies for record cleanup and maintenance.

The VIP Governing Board approved pilot projects in 1999 for two different jurisdictions with different needs, in order to study the procedures and resources available for voter registration record cleanup and maintenance. The ultimate goal of this study is this document, which it is hoped will provide guidance to small communities and citizen groups seeking to cleanse their voter registration records of bogus and deadwood voters. Even without removal, VIP has found that the mere identification of such records, combined with implementation of a citizen poll watching program, can benefit election integrity by providing the "probable cause" for citizens to challenge potentially fraudulent voters on the basis of "identity" or "residence."

Two jurisdictions were used as models for this document. The processes, procedures, successes and failures of these projects are discussed herein. In addition, material gleaned from a statewide investigation still underway, in which immigration records have been sought, will be incorporated in the final report once the investigation has been completed.

The first jurisdiction undertaken was Fayette County, Pennsylvania, at the request of the Fayette County Commission. The County had undergone a wrenching grand jury investigation of absentee ballot fraud which had resulted in three indictments, including that of a former U.S. Congressman. The County sought to identify those voters on its registration records who were fraudulently registered or who were no longer qualified to vote for various reasons. Although the County had obtained a proposal from a commercial data matching service to perform basic matching of its voter registration records, the cost quoted was considered prohibitive.

The goal of the Fayette County project was to identify all available resources and approaches that might be used to validate the registration of all voters in the county. The approach used in this project was to acquire and then match the information in those resources to the county's registered voter file. Each voter record was scored to indicate positive and negative results for each resource processed. After all resources were processed, scores were totaled and listed by most to least negative. The list will be used by county officials to further research the qualifications of voters whose registrations are most likely invalid.
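The scoring approach described above can be illustrated with a minimal sketch. The indicator names and weights below are hypothetical, since the report does not publish its scoring values; the sketch only shows the idea of totaling per-resource scores and listing voters from most to least negative.

```python
# Hypothetical weights; the report does not publish its actual scoring values.
INDICATOR_SCORES = {
    "moved_out": 5,
    "undeliverable": 4,
    "po_box_closed": 2,
}

def rank_voters(findings, scores=INDICATOR_SCORES):
    """Total each voter's indicator scores and sort from highest to lowest."""
    totals = {
        voter_id: sum(scores[name] for name in names)
        for voter_id, names in findings.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Hypothetical findings keyed by a made-up voter ID.
findings = {
    "V001": ["po_box_closed"],
    "V002": ["moved_out", "undeliverable"],
    "V003": [],
}
ranking = rank_voters(findings)
print(ranking)
```

A real project would presumably also record positive indicators (for example, a confirmed match on another file), which could be given zero or negative weights in the same table.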

The second jurisdiction was Atlantic Beach, North Carolina, a resort community whose year-round resident population swells for the tourist season. A citizen group concerned about election integrity believed there was a potentially serious problem of out-of-town vacationing voters. The problem of voters who are registered in two (or even more) jurisdictions in the United States is a significant one. VIP ranks duplicate registrations as the second most frequent complaint it receives on its election fraud hotline and over its website. There is no national repository of voter registrations, making it possible to register in several states at once, and even vote in multiple states via mail-in absentee ballots. It is made easier still in states such as North Carolina, which do not maintain voter records centrally. Each county maintains its own records in its own format. Although lists are available on the Internet on a county-by-county basis, they are cumbersome to use in any meaningful way because they must be searched one name at a time.

In Atlantic Beach, therefore, the goal was different from Fayette County, although both required the same fundamental steps. The possibility of vacation home property owners being registered in both the resort community and the jurisdiction of their other residence seemed credible. Initial data supplied by the citizen group indicated the problem could be significant. The community sponsors of this project identified surrounding counties where it was felt the majority of owners might have other residences. Registration files from these counties were obtained. The registration file from the resort community was then matched against the neighboring counties and potential multiple registrants were identified and further researched.


National Change of Address (NCOA) processing is based upon the United States Postal Service's records of valid addresses. You will be asked to sign a use statement before receipt of the data, so be sure that you review the statement and potential use carefully. The NCOA process is a valuable resource used by organizations that do mass mailings. The intent is to validate and standardize the address of all records on a file submitted to an NCOA processor. There are a limited number of NCOA processors approved by the United States Postal Service to perform this work. Procedures have been put in place that guarantee that the most current information is provided. The processors are required to produce results for their clients within a relatively short period of time. Normally, results are available within a week after an NCOA processor receives the file to be processed.

A list of current authorized NCOA processors is provided below. The current list is available on the Internet at

PO BOX 2000 ***
CONWAY, AR 72033-2000 ****
VOICE: 501-336-3807
FAX: 501-336-3715
1900 NEW HWY ***
FARMINGDALE, NY 11735-1537 ****
VOICE: 516-293-6100 FAX: 516-293-0891
LOUISVILLE, CO 80027-2452
VOICE: 303-666-7000
FAX: 303-666-3887
HILLSIDE, IL 60162-2039 ****
VOICE: 708-236-2438
FAX: 612-541-6525
WOODCLIFF LAKE, NJ 07675-7679 ****
VOICE: 201-476-2312
5775 WAYZATA BLVD STE 560 ***
MINNEAPOLIS, MN 55416-1208
VOICE: 612-541-6523
FAX: 612-541-6525
5884 POINT WEST DR ***
HOUSTON TX 77036-2612 VOICE: 713-995-2200
FAX: 713-995-2328
ALPHARETTA GA 30005-8884
VOICE: 770-740-4369
FAX: 770-740-5863
SCHAUMBURG IL 60173-4998
VOICE: 847-517-5683
FAX: 847-517-5189
VOICE: 714-476-1212
FAX: 714-476-1001
133 NW 122ND ST
VOICE: 405-749-7414
FAX: 405-752-9341
BILLERICA MA 01821-3961
VOICE: 978-671-6067
FAX: 978-663-2553
GLEN BURNIE MD 21060-6401
VOICE: 410-412-1598
CLIFTON NJ 07012-1694
VOICE: 973-614-3402
FAX: 973-779-5176
1501 OPUS PL STE 100 ***
DOWNERS GROVE IL 60515-5727 ****
VOICE: 630-719-0577
FAX: 630-971-4866
VOICE: 516-851-5000
VOICE: 301-459-9700
SOUTHFIELD MI 48034-8455 ****
VOICE: 248-728-7618
FAX: 248-728-6848
TAMPA FL 33609-2700
VOICE: 813-554-2031
FAX: 813-878-6475
NOVATO CA 94949-5798
VOICE: 415-382-7108


REVISED: 12/08/1999

= *
= **
= ***
= ****
= *****

Some NCOA processors provide additional functionality. We chose Donnelley Marketing for our projects. They maintain another database made up of subscription address changes, credit card address changes, etc. Donnelley Marketing told us that approximately 30% of the people who move do not file a change of address form with the United States Postal Service. However, most of these "unreported" address changes can be found via Donnelley Marketing's optional Donnelley Marketing Change of Address (DMCOA) processing.

The client who has NCOA processing performed typically has a choice of two output formats. The first option is to have the original data changed to the corrected addresses. The second is to have NCOA results appended to the original record. For our purposes the latter was preferable.

It is our opinion that NCOA results have provided some of the best information on registered voters in the two jurisdictions. Information which we gleaned from the process includes:

  1. Registrants with undeliverable mail addresses
  2. Registrants who have moved out of the jurisdiction
  3. Registrants whose registration address is not in the jurisdiction
  4. Registrants who have moved with no forwarding address available
  5. Registrants whose PO Box has been closed

In our two projects 8.6% and 20.6% of the total registrants fell into one of the above categories. Of the records that fell into one of these potential problem areas, the greatest number was type 2 (moved out of jurisdiction). The effective date of the move is provided by the NCOA process and it is a relatively easy procedure to determine if a vote was made after a registrant moved out of the jurisdiction. It is also possible that a voter has moved out of the jurisdiction but continues to vote in the jurisdiction legally. This is permitted, for instance, in the case of students and military personnel.
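The check for votes cast after a move can be sketched in a few lines. This is only an illustration under stated assumptions: the function name and the sample dates are hypothetical, and a flagged vote is a lead for further research, not proof of an invalid vote (students and military personnel, for instance, may vote legally after moving).

```python
from datetime import date

def votes_after_move(move_date, vote_dates):
    """Return the vote dates that fall after the NCOA move effective date."""
    return [d for d in vote_dates if d > move_date]

# Hypothetical registrant: NCOA reports a move effective 1998-06-01.
history = [date(1997, 11, 4), date(1998, 11, 3), date(1999, 11, 2)]
flagged = votes_after_move(date(1998, 6, 1), history)
print(flagged)
```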

Probably the most interesting result was number 3. Although we have not resolved these errors completely, it appears that people have registered in one jurisdiction when they actually live in another. This seems to happen when a jurisdictional boundary divides a town, and it might result from a registrant's confusion as to which jurisdiction is correct for his residence. The real error in this situation may lie with the registration process and procedures and not with the registrant.

It should be noted that NCOA processing is geared toward a large volume of data. The NCOA processors have minimum charges that become very expensive when there is a relatively small volume of records. In our projects, we combined the registered voter files from both projects into a single file. This resulted in a file just under 90,000 records. We paid the minimum amount for the NCOA process. Had we not combined the two files, we would have paid a minimum charge for each file/project, thus doubling our cost. There are some "spin-off" companies who specialize in NCOA processing for smaller volumes. Their approach is to combine many small client files into one large file. They then submit the large file to an NCOA processor and break the resultant files back down to the original component files. This seems to be a very affordable approach for those who cannot meet the minimum record counts of the NCOA processors and who are only interested in standard processing options.


In the case of Atlantic Beach, the main approach was to secure the files of the surrounding jurisdictions and compare them with the Atlantic Beach file. We did this in addition to the NCOA process described earlier. Many states do not have centralized registered voter files. That is, each jurisdiction maintains its own records and procedures. Typically, a jurisdiction, as defined in this context, is a county or an incorporated city. This project was in such a state. The citizen group VIP worked with identified the eight counties they felt were most likely to be the "other" jurisdictions for registered voters who might also be registered elsewhere.

A procedure to match the resort community's registered voter file to those of neighboring counties seems, on the surface, to be relatively easy. We found more complexities than expected. This was caused by the decentralized nature of the voter files. Each county to which we were to match had recorded its data in a unique format. Further, we found that the media available to us included 3½" diskettes, 5¼" diskettes, a download from the Internet and magnetic tapes. We chose the Internet download from the one county where this option was available. This was the only county where we incurred no acquisition cost. Our next choice was to use 3½" diskettes if available. This proved to be available for five of the counties. The only choice available from two counties was magnetic tape. Since we had no magnetic tape capability, we had to have these two files copied to more useable media. It is interesting to note that neither CDs nor Zip disks were available from any of the eight counties. Cost per county ranged from zero, in the case of the download, to $43.00 for the 9 track magnetic tape. The $43.00 charge included a physical tape.

Once we obtained the files, the next step was to reformat the records into a layout that would allow us to perform a match. We found that the only data practical for a match were first name, middle initial, last name and date of birth. We had to use middle initial instead of middle name because many jurisdictions do not record the complete middle name. One large file was created from the eight counties. It contained the appropriate matching fields, address information and the name of the county from which each record originated. A program was written that performed the match and listed the records that did indeed match. Once the list was available, we manually looked up the particular records and checked voting history and registration date. To our surprise, one of the counties does not provide voting history on its file. Two of the counties do not provide registration date. Once we analyzed our results, we concluded that the problem was not as severe as the resort community had suspected. There were duplicates; however, these seem to have resulted from a purge procedure that has not removed voters who have moved. We found a few instances where a voter voted in both jurisdictions. We also found some instances where a voter appears to be voting in one jurisdiction or the other. These situations need to be checked further. They represent about 1% of the registered voter file.
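The matching program itself is not reproduced in this report. The following is a minimal sketch of how such a match could be written, assuming each record has already been reformatted into a dictionary with hypothetical field names ("first", "middle", "last", "dob", "county"). Every pair it returns is only a possible match that still needs manual review.

```python
from collections import defaultdict

def match_key(rec):
    """Key on first name, middle initial, last name and date of birth."""
    return (
        rec["first"].strip().upper(),
        rec["middle"].strip().upper()[:1],  # initial only: many files lack full middle names
        rec["last"].strip().upper(),
        rec["dob"],
    )

def find_possible_duplicates(resort_file, county_file):
    """Return (resort record, county record) pairs that share a match key."""
    index = defaultdict(list)
    for rec in county_file:
        index[match_key(rec)].append(rec)
    pairs = []
    for rec in resort_file:
        for other in index.get(match_key(rec), []):
            pairs.append((rec, other))
    return pairs

# Hypothetical sample records.
resort = [{"first": "Robert", "middle": "T", "last": "Smith", "dob": "19550101", "county": "Resort"}]
counties = [
    {"first": "ROBERT", "middle": "Thomas", "last": "SMITH", "dob": "19550101", "county": "County A"},
    {"first": "ROY", "middle": "T", "last": "SMITH", "dob": "19550101", "county": "County B"},
]
pairs = find_possible_duplicates(resort, counties)
print([(a["county"], b["county"]) for a, b in pairs])
```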

We found that names, addresses and phone numbers of boards of elections are readily available on the Internet. We searched for "Board of Elections" combined with the state with which we were concerned. We readily found all of the information necessary to contact the appropriate person and place an order for registered voter files. Although we did not try it, we expect that calling directory assistance would also produce good results.

Based on our research, there is a very real opportunity for a person to both register and vote in multiple jurisdictions. This potential for abuse could be significantly reduced by statewide, centralized maintenance of registered voter records. Obviously, this would not preclude abuse by voters registered in multiple states. Another suggestion is to require a social security number in all registration records nationwide. This would allow citizens and boards of elections to identify potential problems much more easily. The only matching criteria available with today's data are name and date of birth. This is far from fail-proof and can result in many false matches.


We were not able to acquire driver's license records within the required time frame. We found that the motor vehicles department we queried was not at all cooperative until a concerned elected official became involved. The approximate cost to acquire the file of licensed drivers for one county was quoted at $350.

Editorial note: If we had processed this file, the results could only be used as a positive indicator that a person lives at a particular address within the county. The fact that a voter does not appear on the county's licensed driver file cannot necessarily be interpreted as a negative indicator. Thus, these files may not be cost-effective.


We received the property tax rolls file for Fayette County. A match was performed from the voter file to the tax rolls file. We matched on last name and the first six characters of address. The results were surprising. Fewer than 50% of the voters matched to the tax rolls file. About 25% of the voters' addresses matched to the tax rolls file, but the owner's name was not that of the voter. This was not unexpected and would seem to result from renters, people of different last names living with the owner, etc. The surprise came when we found that about 25% of the addresses on the voter file did not match any address on the tax rolls file. We have found that a few of these situations result from abbreviations of street names, e.g. Market St vs. Mkt St. Others were caused by missing house numbers on the voter file, e.g. Market St on the voter file and 123 Market St on the tax rolls file. However, we have not yet identified the reasons for the bulk of these address mismatches.


We were unable to acquire death records within the required time frame. The file was only available if an official request was made by the county commissioners. A private individual has no opportunity to acquire this information. The approximate cost of 40 years of death records within this one county was quoted at $3000.

The failure to acquire this file was a major disappointment, since this data can provide some of the strongest negative indicators for this project.


We tried to purchase a current residential telephone listing from the local phone company. The cost of this purchase was quite high and did not seem to be worth the results that could be gained. Our opinion is that a match on a phone number listing would give us a slight positive indication, but nothing conclusive. A match might be found, but there would be no assurance that the address is the permanent residence. A non-match would certainly not indicate that the person does not live at the address recorded, especially when taking into account the large number of individuals who have unlisted phone numbers, do not have a phone or have only a cellular phone. We concluded that this file should not be pursued in future projects, because it contains little useful data for purposes of uncovering voter problems.

We tried to acquire the Social Security death records, to no avail. We know that these records have been published by genealogical organizations and have been acquired by large organizations that perform matching similar to ours. However, we met a brick wall and decided that the local Department of Vital Statistics death records were probably a better source for this particular project.


Our matching logic, for the most part, consisted of last name, first name, middle initial and date of birth. We did not have a unique identifier such as social security number available to use. When matching the voter file to other files there are various situations that need to be considered. Among these are:

  1. One file may have middle initial and the other may have middle name. We truncated middle name in all cases to a single character.
  2. One file may have maiden name and the other may have married name. We found no solution to this problem. We think that the only real solution to this situation is a match on social security number.
  3. The spelling of the last name may be different. This might happen, for instance, in the case of Mc Guire vs. McGuire. We eliminated all embedded blanks in last names and forced all names to upper case before matches were performed. Other possibilities that we could not correct include different spellings such as MacGuire and McGuire or Young and Yung.
  4. First names may be different in the two files. For instance, Jenny vs. Jennifer or Mary vs. Maryanne. We made no attempt to identify and correct these differences. A possible approach is to match only on first letter of first names and first letter of middle names. The problem with this solution is that uniqueness becomes muddled and false matches begin appearing at a higher than acceptable rate. Consider the situation where there is a Roy T. Smith and a Robert T. Smith who were both born on 1/1/55. Uniqueness is maintained until we use only the first letter of the first name.
  5. As the universe gets larger, the odds of false matches increase. That is, there may only be one Robert T. Smith born on 1/1/55 in a community of 10,000. However, there would most likely be many in a community of 1,000,000. When using the techniques used in these benchmark projects, all matches must be considered "possible" matches. Nothing can be considered conclusive without further research.
  6. Confusion can be caused if one tries to incorporate a name suffix in the matching logic. Suffixes such as M.D., ESQ., PhD, Jr., Sr., II, III, IV, etc. are unreliable.
  7. Data entry errors can cause obvious matches to not occur. One character incorrectly entered in name or date of birth or any other fields used for matching will cause the match process to fail.
  8. Matching on addresses is full of potential problems. These include apartment number recorded as "APT 2", "APT # 2", "APT #2" or simply "#2". Street names are sometimes spelled differently, for instance Market vs. Mkt. A common problem is the spelling of the street suffix, e.g. Lane vs. Ln or Street vs. St. Some of these errors can be corrected by using a "pre-processing" routine that corrects common differences before matching logic is processed.
  9. Sometimes data are provided in upper case, other times in "proper" case. We forced all alphabetic data to upper case when performing a match.
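A "pre-processing" routine of the kind mentioned in the list above might look like the following sketch. The abbreviation table is a small hypothetical sample, not the one used in these projects, and a real routine would need a much longer list.

```python
import re

# Hypothetical sample of common variants mapped to one standard form.
STREET_WORDS = {"STREET": "ST", "LANE": "LN", "MKT": "MARKET"}

def normalize_last_name(name):
    """Force upper case and remove embedded blanks (Mc Guire -> MCGUIRE)."""
    return name.upper().replace(" ", "")

def normalize_address(address):
    """Force upper case, standardize apartment numbers and street words."""
    text = address.upper().strip()
    # Collapse "APT 2", "APT # 2", "APT #2" and "#2" to "APT 2".
    text = re.sub(r"(?:\bAPT\b\s*#?\s*|#\s*)(\w+)", r"APT \1", text)
    words = [w.rstrip(".") for w in text.split()]
    return " ".join(STREET_WORDS.get(w, w) for w in words)

print(normalize_last_name("Mc Guire"))
print(normalize_address("123 Mkt Street #2"))
print(normalize_address("123 Market St. Apt # 2"))
```

Both sample addresses reduce to the same string, so a later match on address would succeed. Differences such as Jenny vs. Jennifer or Young vs. Yung are not addressed by this kind of routine.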


As discussed earlier, there are concerns regarding computer media when ordering files for projects such as the two described in this document. The assumption throughout has been that the project is performed on an average PC. Our experience when ordering files is that the media available are sometimes not very compatible with an average PC. We encountered sites which had only 9 track magnetic tape, 3½" diskettes, 4mm data cartridges and Internet downloads as the available media. Others offered CDs and Zip disks as well as diskettes and tapes. Before placing an order for a file to be provided on media that is not useable on your PC, you should identify a source and the cost to have that file converted. We had great difficulty finding a source to convert the 4mm data cartridge, succeeding only after two days of effort.

File and record formats are other items that need to be looked at closely. Formats that we encountered included comma delimited ASCII, fixed length ASCII, fixed length EBCDIC, Excel and dBASE. When ordering files, it is important to know what format is going to be provided. It is obviously also necessary to know what formats your particular software product can use (import). You could find yourself in a situation similar to the one described for media. That is, you might have a file that needs to be converted by an outside source before it is useable by your software.
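As an illustration of what conversion of a fixed length EBCDIC file involves, here is a minimal sketch. It assumes the file uses the common US EBCDIC code page (Python's "cp037" codec) and a hypothetical 26-character record layout; the actual code page and layout must come from the agency supplying the file.

```python
def read_fixed_ebcdic(raw, layout, record_length, encoding="cp037"):
    """Decode fixed-length EBCDIC bytes into a list of field dictionaries.

    layout maps a field name to its (start, end) character positions.
    """
    records = []
    for start in range(0, len(raw) - record_length + 1, record_length):
        text = raw[start:start + record_length].decode(encoding)
        records.append({name: text[a:b].strip() for name, (a, b) in layout.items()})
    return records

# Hypothetical layout: last name (10), first name (8), date of birth (8).
layout = {"last": (0, 10), "first": (10, 18), "dob": (18, 26)}
sample = "SMITH     ROBERT  19550101".encode("cp037")
records = read_fixed_ebcdic(sample, layout, record_length=26)
print(records)
```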

A record layout, the document that describes the content of each field within a record, seems to be always provided when one orders a file. When performing a project similar to those defined herein, you need to define the matching logic that is most appropriate and practical. We used name and date of birth. Once you define the matching logic to be used, you must be sure that that data is available on the records in the file you are ordering. It would be almost useless to decide to match on name and date of birth and then find that date of birth is not provided in the records.

The most daunting task for the inexperienced database programmer might be reformatting the name fields into something that can be used for matching. Some files contain the name as a single field and others break the name out into separate fields (first, middle, last and suffix). In order to match, the names must be formatted so they are in one form or the other. A similar problem was encountered with dates. Some files contained dates as one field without century included, some as one field with century, and others contained the component parts as separate fields (month, day and year). Correcting this problem for a valid match is not as complex as correcting the name problem; however, it does take some expertise.
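The reformatting described above can be sketched as follows. Two assumptions are made purely for illustration: single-field names are taken to be in the form "LAST, FIRST MIDDLE", and two-digit years are taken to belong to the 1900s. Neither is guaranteed by any particular file, so the record layout should always be checked first.

```python
def split_name(full_name):
    """Split 'LAST, FIRST MIDDLE' into (last, first, middle initial)."""
    last, _, rest = full_name.partition(",")
    parts = rest.split()
    first = parts[0] if parts else ""
    middle = parts[1][:1] if len(parts) > 1 else ""
    return last.strip().upper(), first.upper(), middle.upper()

def normalize_date(value=None, month=None, day=None, year=None, century=1900):
    """Return a date as YYYYMMDD.

    value may be 'MMDDYY' or 'MMDDYYYY' (zero-padded, punctuation ignored);
    otherwise month, day and year are supplied as separate fields.
    """
    if value is not None:
        digits = "".join(ch for ch in value if ch.isdigit())
        month, day, year = digits[0:2], digits[2:4], digits[4:]
    month, day, year = int(month), int(day), int(year)
    if year < 100:  # no century on file: assume the 1900s
        year += century
    return f"{year:04d}{month:02d}{day:02d}"

print(split_name("Smith, Robert Thomas"))
print(normalize_date("010155"))
print(normalize_date("01/01/1955"))
print(normalize_date(month="1", day="1", year="55"))
```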


Total Records                                  81,750

Total potential problems                        7,043     8.6%
Nixies                                            341     0.4%
Moved out of Fayette County                     3,461     4.2%
Registration address not in F.C.                2,784     3.4%
Moved, left no forward                            363     0.4%
Moved out of USA                                    3      --
PO Box closed                                      91     0.1%
Voted since moving out of Fayette County          512     0.6%


The voter file was matched to the tax rolls file on last name and the first eight characters of address. The results were recorded and scored as follows:

Total records 81,750

                                                     Count   Percent   Score
Matches on name and address                         40,193     49.2%       0
Address exists on tax file but name is different    20,980     25.7%       5
Voter address not on tax file                       20,577     25.2%       2

Score 5 possibilities: Voter is renting; voter is married but didn't change last name to spouse's name; voter is living with owner.

Score 2 possibilities: Voter's address is not in Fayette County; spelling or abbreviation differences; house number used in one file and not the other; voter registered with PO Box as address

NOTE: Although scores were given to the tax match results, these scores were not included in the total score. Therefore, tax related problems are only listed on the large detail report when there are other possible problems with a voter's record.
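The tax rolls scoring shown above can be expressed as a short sketch. The scores 0, 5 and 2 are taken from the table; the field names and sample records are hypothetical. Because this report mentions both six and eight characters of address as the match length, the length is left as a parameter.

```python
def score_tax_match(voter, tax_rolls, prefix_len=8):
    """Score a voter record against the tax rolls: 0, 5 or 2 as in the table."""
    addr = voter["address"].upper()[:prefix_len]
    owners = {
        t["last"].upper()
        for t in tax_rolls
        if t["address"].upper()[:prefix_len] == addr
    }
    if not owners:
        return 2  # voter address not on tax file
    if voter["last"].upper() in owners:
        return 0  # matches on name and address
    return 5      # address exists on tax file but name is different

# Hypothetical sample data.
tax_rolls = [{"last": "Smith", "address": "123 Market St"}]
owner = {"last": "SMITH", "address": "123 MARKET ST"}
renter = {"last": "Jones", "address": "123 Market St"}
unknown = {"last": "Smith", "address": "Market St"}
print(score_tax_match(owner, tax_rolls),
      score_tax_match(renter, tax_rolls),
      score_tax_match(unknown, tax_rolls))
```

The third sample record shows the missing house number problem noted earlier: the voter may well live at 123 Market St, but the prefix comparison cannot see that.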


Total Records                                   1,926

Total potential problems                          396    20.6%
Nixies                                              2     0.1%
Moved out of Atlantic Beach                       305    15.8%
Registration address not in A.B.                   81     4.2%
Moved, left no forward                              5     0.3%
Moved out of USA                                    1      --
PO Box closed                                       2     0.1%
Voted since moving out of Atlantic Beach           10     0.5%


The two projects that were used as benchmarks for this manual produced potential problems ranging from 8.5% to 12.5% of their total registered voter records. Each potential error was assigned a score. The scores were totaled, and a report of all potential errors was produced, sorted from highest to lowest score. The resultant reports will be turned over to the requesting entities in each jurisdiction, who have the ultimate responsibility for making a final determination on each potentially incorrect voter record.

In our quest to acquire files such as death records and licensed driver records, we found that the agencies we approached were friendly, but uncooperative. Phone calls were not returned. Secretaries politely refused to put us through to their bosses. We were told that "there are privacy issues" that prevented the information from being provided. We were told that there is "no procedure" to provide the requested information. Our successes came only after we asked the project sponsor to make a formal request. The project sponsors were government officials in the subject communities.

We were able to complete both projects at a cost of less than $4000.00 each. This included the cost of acquiring files, NCOA processing and the technical work. The cost did not vary significantly between the two projects, even though one contained more than thirty times as many voter records as the other.

This project will undergo additional analysis, more delineation of process, and the development of recommendations.

In addition, model legislation may be developed that will seek to address (1) availability of data (2) uniformity of how records are kept and (3) centralization of voter records on a state and national basis.

With voter turnout sinking lower with each election, the number of elections with close margins is on the rise. This heightens the concern that elections are easier than ever to corrupt, because it can take just a few votes to change the outcome. With that in mind, it is imperative that communities do all they can to keep accurate voter lists and participate in guarding the sanctity of the ballot box.

Copyright 1998 Voting Integrity Project. All rights reserved.