I have limited access to the MySQL database, I just can see a view called customer contains customer_id, name, and their location
id_customer name location
1 Andy Detro.it
2 Ben CALiforNIA
3 Mark uk
4 Niels London123
5 Pierre Paris
And a table called location contain list of city and country of customer location.
id_coutry country id_city city
1 US 1 Detroit
1 US 2 California
2 UK 3 London
2 UK 4 Manchester
I wants to clean customer data automatically if there is new data in the database, I mean if in the raw data there is punctuation or number or typo, it will automatically clean and then after that the clean location will search their id_city based on location table, if there is no city similar/match, it will search id_country, and if there is no the id_city/country will be 0. and it will become new table called customer location
id_customer id_city status
1 1 Match
2 2 Match
3 2 Country
4 3 Match
5 0 Unknown
The status is label if the location is from city then it will be Match, if it's from country it will be Country, if there is no similar name or id_city/country 0 it will be unknown. The location can be city or country so the status will tell it's match with the city or with the country.
Can someone sugest what I must to do this project, I try to do it with python in jupyter notebook but will it be effective for this case? I really new to this things, sorry if I can't give enough information and thanks before.
from Suggestion for Automatically Match Data in MysQL
No comments:
Post a Comment