Dataset Creation and Curation - Christiaan Swart
DataTalks.Club - A podcast by DataTalks.Club
We talked about:
  Christiaan’s background
  Usual ways of collecting and curating data
  Getting the buy-in from experts and executives
  Starting an annotation booklet
  Pre-labeling
  Dataset collection
  Human level baseline and feedback
  Using the annotation booklet to boost annotation productivity
  Putting yourself in the shoes of annotators (and measuring performance)
  Active learning
  Distance supervision
  Weak labeling
  Dataset collection in career positioning and project portfolios
  IPython widgets
  GDPR compliance and non-English NLP
  Finding Christiaan online

Links:
  My personal blog: https://useml.net/
  Comtura, my company: https://comtura.ai/
  LI: https://www.linkedin.com/in/christiaan-swart-51a68967/
  Twitter: https://twitter.com/swartchris8/

ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html