Large-Scale Entity Resolution - Sonal Goyal
DataTalks.Club - A podcast by DataTalks.Club
We talked about:

- Sonal’s background
- How the idea for Zingg came about
- What Zingg is
- The difference between entity resolution and identity resolution
- How duplicate detection relates to entity resolution
- How Sonal decided to start working on Zingg
- How Zingg works
- What Zingg runs on
- Switching from consultancy to working on a new open source solution
- Why Zingg is open source
- Open source licensing
- Working on Zingg initially vs now
- Zingg’s current and future team
- Sonal’s biggest current challenge
- Avoiding problems with entity/identity resolution through database design
- Identity resolution vs basic joins, data fusions, and fuzzy joins
- Deterministic matching vs probabilistic machine learning
- Identity and entity resolution applications for fraud detection
- Graph algorithms vs classic ML in entity resolution
- Identity resolution success stories
- What Sonal would do differently given the chance to start over with Zingg
- Advice for those seeking to realize their own solution to a data problem
- Reading suggestion from Sonal
- Conclusion

Links:

- Open-Source Spotlight demo "Zingg": https://www.youtube.com/watch?v=zOabyZxN9b0
- Creative Selection: Inside Apple's Design Process During the Golden Age of Steve Jobs (book): https://www.amazon.com/Creative-Selection-Inside-Apples-Process/dp/1250194466
- ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
- Join DataTalks.Club: https://datatalks.club/slack.html
- Our events: https://datatalks.club/events.html