beestat

Pre-screening for bad addresses

Home comparisons are a really neat feature of beestat; they let you see how your home stacks up against other similar homes in your geographical region. In order to do that, I need to know where you live. The good news is that ecobee provides this data in their API. The bad news is that the data isn't very consistent. Here are some examples of data I receive:

123 Main St, Atlanta, GA 30060, USA
123 Main St, Atlanta, Georgia 30060, United States

Any time I see inconsistent data like this I immediately decide that it cannot be trusted. I display this data in beestat and also use it to look up latitude/longitude, which in turn is used to quickly find other homes in your region, so I need to be sure it's 100% accurate.

This is where an address validation service comes in handy. I use SmartyStreets. I can send them all sorts of messy data and they will standardize everything, geocode it, and send me all the beautiful data back. This works for both US and international addresses.

SmartyStreets has a reasonable free tier. I can do 250 US lookups for free and 100 international lookups for $7/mo. This has worked decently, but a lot of people use beestat and the numbers start to stack up:

This is my US subscription usage. My startup quota (60,000) has expired and I'm now hitting my cap. When this happens, beestat just fails to get accurate address information. This breaks home comparisons (sort of) for new users until the quota is refreshed, at which point everything fixes itself automatically...until I run out again.

Here's my outbound API call log. The blue one is SmartyStreets...it's hugely bloated right now because the background sync continually sends API calls even though I just get a response saying my quota is up.

To fix this, I could just pay SmartyStreets, but their minimum plan is $50/mo, which is way overkill. Instead, I decided to do a bit of streamlining to see if I could get the volume low enough.

The first step was to identify any unnecessary API calls. I looked through my logs and found that many ecobee owners are entering invalid address data...leaving out fields or not entering a house number. Without this data the SmartyStreets API call just fails, but it still counts against my quota. The logic I ended up implementing was simple enough: Require all address fields to be populated and require the street address to contain a number followed by some amount of text.
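Here's a rough sketch of what that pre-screening check could look like. This is not the actual beestat code; the field names (`street`, `city`, and so on) and the exact regular expression are assumptions made up for illustration, and the real rule may be stricter or looser.

```python
import re

# Hypothetical field names; the real ecobee/beestat address fields may differ.
REQUIRED_FIELDS = ("street", "city", "state", "postal_code", "country")

# Assumed pattern: starts with a number, then whitespace, then some text.
# Matches "123 Main St" and "123A Main St"; rejects "Main St" and "123".
STREET_PATTERN = re.compile(r"^\d+\S*\s+\S+")


def is_worth_looking_up(address: dict) -> bool:
    """Return True only if the address passes the cheap local checks."""
    # Every field must be present and non-blank.
    for field in REQUIRED_FIELDS:
        value = address.get(field)
        if not isinstance(value, str) or not value.strip():
            return False
    # The street line must look like "<number> <text>".
    return STREET_PATTERN.match(address["street"].strip()) is not None
```

Only addresses that pass this check would be sent to the validation service, so obviously incomplete entries never count against the quota.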

My initial estimates put this at a roughly 8% reduction in address lookups. I'm not sure this is enough, so I may end up switching to a different provider with a more affordable plan.

Also, I do aggressively cache these lookups. Every request I send is cached forever, and every request checks the cache before sending to avoid duplication. So things like "123 Main St." won't send if I already looked up "123  main st". This helps considerably and has been around since the beginning.
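To give a feel for the caching idea, here's a minimal sketch. It assumes an in-memory dictionary and a made-up normalization rule (lowercase, drop periods and commas, collapse whitespace); the real cache is presumably persistent and may normalize differently. `CachedLookup` and the `lookup` callable are hypothetical names standing in for whatever actually calls the address service.

```python
import re
from typing import Callable, Dict


def normalize_address(raw: str) -> str:
    """Build a cache key so trivially different spellings collide."""
    lowered = raw.lower()
    # Treat periods and commas as spaces (assumed rule, for illustration).
    no_punct = re.sub(r"[.,]", " ", lowered)
    # Collapse runs of whitespace and trim the ends.
    return re.sub(r"\s+", " ", no_punct).strip()


class CachedLookup:
    """Check the cache before every lookup and keep results forever."""

    def __init__(self, lookup: Callable[[str], dict]):
        self._lookup = lookup  # hypothetical function that calls the service
        self._cache: Dict[str, dict] = {}
        self.calls_sent = 0

    def get(self, raw_address: str) -> dict:
        key = normalize_address(raw_address)
        if key not in self._cache:
            # Cache miss: this is the only place a real API call happens.
            self.calls_sent += 1
            self._cache[key] = self._lookup(raw_address)
        return self._cache[key]
```

With this sketch, looking up "123  main st" and then "123 Main St." results in a single outbound call, because both normalize to the same key.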

The next thing to clean up is MailChimp. I currently pay them $30/mo because my subscriber count is past their free tier. Once I sort out address validation I'll be moving to get rid of MailChimp so look for that post next.

Comments

In the app it's whatever you have configured as your home. Not sure at the moment how it presents on the thermostat.

This is the address found in the ecobee profile setting?

That's not a bad idea. I do show "Unknown Address" or something on the Comparisons page but it's easy to miss or not know what to do with.

Why not notify people who have bad addresses and give them a chance to correct it themselves?

