I spent the past week visiting data partners in South America. It was a good week of sharing information about the needs of the market, such as 2+2 compliance checking and KYB (Know Your Business). Trips like this always teach us more about how solutions work best and which practices produce the best match rates in countries like Brazil, Argentina, Chile and Colombia. In this post I will cover a few things I learned regarding Brazil and Argentina.
I started the trip in Brazil. While meeting with one of our key data providers we discussed the Maria Problem and how it can skew results. I wrote about the Maria Problem in the past when discussing Portuguese identity verification. It was not a surprise to learn that the same problem exists in Brazil, given the cultural and language ties. Simply stated, the problem is that a great many women in Brazil have Maria as their first name. Many of them leave the Maria off applications and forms and use their second name instead to differentiate themselves. Most Brazilian (and Latin American) names have four parts: a first name, a second name, a maternal surname and a paternal surname. In most data entry responses you will find the second name used in combination with the paternal surname, the maternal surname, or both. A matching exercise on this type of information will yield varied results, which is why it is important to know how to tune your rules to compare on the combinations of first name, second name and surnames to achieve a match. Our local provider knows this problem, and his systems are coded to address it as data passes through them.
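To make the tuning idea concrete, here is a minimal sketch of a name comparison that tolerates a dropped leading Maria and a missing surname. It is not our provider's logic. The function names, the particle list and the rule itself (same given name after dropping an optional Maria, and the shorter set of surnames contained in the longer one) are assumptions made for illustration only.

```python
import unicodedata

# Assumption: leading given names that applicants often leave off forms.
OPTIONAL_FIRST_NAMES = {"maria"}
# Assumption: Portuguese connecting particles to ignore when comparing.
PARTICLES = {"de", "da", "do", "das", "dos", "e"}


def normalize(name: str) -> list[str]:
    """Lowercase, strip accents, and split a full name into tokens."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.lower().split()


def core_tokens(name: str) -> list[str]:
    """Drop particles, then drop a leading optional first name if more follows."""
    tokens = [t for t in normalize(name) if t not in PARTICLES]
    if len(tokens) > 2 and tokens[0] in OPTIONAL_FIRST_NAMES:
        tokens = tokens[1:]
    return tokens


def names_match(input_name: str, record_name: str) -> bool:
    """Hypothetical rule: same given name, and one side's surnames
    are all present on the other side."""
    a, b = core_tokens(input_name), core_tokens(record_name)
    if len(a) < 2 or len(b) < 2:
        return False  # need at least a given name and one surname
    if a[0] != b[0]:
        return False
    shorter, longer = sorted((set(a[1:]), set(b[1:])), key=len)
    return shorter <= longer


print(names_match("Fernanda Souza", "Maria Fernanda Silva Souza"))
print(names_match("Clara Souza", "Maria Fernanda Silva Souza"))
```

A rule this loose will also produce false positives on common surnames, so in practice it would be paired with a second element such as date of birth or a national ID number.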
Argentina was the next stop on the trip. A new lesson there relates to data hygiene as a way to produce a better match. In Buenos Aires (the largest city in the country) the postal code used to be a four-digit number. More recently, four alphabetic characters were added to the four digits to make the code more precise, given the growth and density of the city. Most government databases now use the eight-character code rather than the four-digit code, so when an address is used as part of an identity check, it likely won't validate against a government database unless the address has gone through a hygiene process that adds the four alphabetic characters. Our local provider does this as part of their standard data normalization process, and it yields a better match result.
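A pre-check along these lines can flag which input addresses still carry the old code before they are sent for matching. This is only a sketch: it assumes the eight-character layout is one letter, four digits, then three letters, and the function names are hypothetical. It does not perform the enrichment itself, which needs a street-level reference table from a data provider.

```python
import re

# Legacy format: four digits only.
LEGACY_RE = re.compile(r"[0-9]{4}")
# Assumed eight-character layout: letter + four digits + three letters.
FULL_RE = re.compile(r"[A-Z][0-9]{4}[A-Z]{3}")


def classify_postal_code(raw: str) -> str:
    """Return 'full', 'legacy', or 'invalid' for an Argentine postal code."""
    code = re.sub(r"[\s-]", "", raw).upper()  # ignore spaces, hyphens, case
    if FULL_RE.fullmatch(code):
        return "full"
    if LEGACY_RE.fullmatch(code):
        return "legacy"
    return "invalid"


def needs_enrichment(raw: str) -> bool:
    """True when the code is valid but lacks the alphabetic characters."""
    return classify_postal_code(raw) == "legacy"


for sample in ["1425", "c1425 dkf", "14250"]:
    print(sample, classify_postal_code(sample))
```

Records flagged as legacy can then be routed through the hygiene step before the identity check, rather than failing validation against the government source.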
In both countries a great deal of time was spent making sure that our data providers could deliver a 2+2 check at a high match rate. While data hygiene is part of this equation, having good government, credit or commercial data sources is also important. Most important of all is knowing how your input data will best be compared against those sources. Simple things like the ones above will significantly impact the results of your electronic identity check.